takahirox opened this issue 3 years ago
I'm trying https://github.com/tensorflow/tfjs-models/tree/master/handpose
It runs at 30fps on my work Windows laptop and 15fps on my personal Windows laptop. We probably need to run it in a Worker.
It seems Handpose currently detects at most one hand, not two or more. But one hand would be a good start.
Hey there! I've been thinking about how to implement this for some time and would love to help. The latest MediaPipe supports multiple hands but with only 2D landmarks; the previous version had 3D, so I think they will update it soon. In any case, I think it would make the most sense to implement it on top of the TensorFlow.js handpose model, which has 3D landmarks but only for one hand. Fortunately, the keypoints for both are the same (apart from the missing Z), so the emulation will be the same.
Here's a graphic I made contrasting them:
The hand models that A-Frame uses are located at https://cdn.aframe.io/controllers/oculus-hands/unity/left.glb and https://cdn.aframe.io/controllers/oculus-hands/unity/right.glb, so we could potentially use them instead of controllers in DevTools.
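Since the 21 Handpose landmarks line up with a subset of the WebXR joint names, the correspondence could be sketched as a lookup table. This is only an illustrative sketch, not the extension's actual code: `buildHandposeToXRJointMap` is a hypothetical helper, the index order follows Handpose's documented landmark layout, and the four finger metacarpal joints have no Handpose counterpart so they would need to be approximated separately.

```javascript
// Hypothetical sketch: mapping the 21 MediaPipe Handpose landmark indices
// to WebXR Hand Input joint names. The index/middle/ring/pinky metacarpal
// joints have no Handpose counterpart and are left unmapped here.
const FINGERS = ['index-finger', 'middle-finger', 'ring-finger', 'pinky-finger'];

function buildHandposeToXRJointMap() {
  const map = new Map();
  map.set(0, 'wrist');
  // Thumb: landmarks 1..4 (CMC, MCP, IP, tip)
  ['thumb-metacarpal', 'thumb-phalanx-proximal',
   'thumb-phalanx-distal', 'thumb-tip'].forEach((name, i) => map.set(1 + i, name));
  // Other fingers: landmarks 5..20, four per finger (MCP, PIP, DIP, tip)
  FINGERS.forEach((finger, f) => {
    const base = 5 + f * 4;
    map.set(base, `${finger}-phalanx-proximal`);
    map.set(base + 1, `${finger}-phalanx-intermediate`);
    map.set(base + 2, `${finger}-phalanx-distal`);
    map.set(base + 3, `${finger}-tip`);
  });
  return map;
}
```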
I think we would start emulating XRHands in EmulatedXRDevices instead of a gamepadInputSource, but I could be wrong. Is EmulatedXRDevices the main file I need to worry about?
Thanks, Alex
Thanks for the comment.
> I think we would start emulating XRHands in EmulatedXRDevices instead of a gamepadInputSource but I could be wrong. Is the EmulatedXRDevices the main file I need to worry about?
To figure out the change, I'm trying to make an easy prototype now. Please give me some time.
Sounds good. It looks like XRHand isn't implemented in the polyfill either, so I've been looking into that.
And we are discussing how to capture the webcam in the extension in #262. If you are interested, please join us there.
Progress: the prototype works now. https://twitter.com/superhoge/status/1349560469837672452
Testing your excellent hand tracking implementation. However, I'm unable to get the joint pose. Is anything wrong with the following code?
```javascript
var ds = xr.inputSources;
if (ds) {
  ds.forEach(p => {
    if (p.hand instanceof XRHand && p.handedness == 'right') {
      var joints = p.hand.joints;
      for (var key in boneMap) {
        var jointSpace = joints.get(key);
        var pose = xr.frame.getJointPose(jointSpace, xr.refSpace);
        if (pose) {
          console.log(pose); // pose is always null even with the right hand in view of the camera
        }
      }
    }
  });
}
```
I haven't written the documentation yet, but first please check how to use the hand input in the following video.
https://twitter.com/superhoge/status/1356083004754468864
And then try the Three.js hand input example that I use for testing:
https://threejs.org/examples/#webxr_vr_handinput
Let me know if the hand input support still doesn't work.
I checked out the latest version from branch HandWIP.
When I ran the three.js hand tracking demo, the developer console says "The optional feature 'bounded-floor' is not supported" and no hand model shows up.
When I ran my own test app, it says "The optional feature 'hand-tracking' is not supported".
Sorry to trouble you again, but what are the proper features that need to be requested for it to work?
```javascript
var opt = {
  sessionType: 'immersive-vr',
  referenceSpaceType: 'local',
  framebufferScale: 1.0,
  depthNear: c.pref.near,
  depthFar: c.pref.far,
  optionalFeatures: ['hand-tracking']
};
```
If I uncheck Stereo Effect, both of these errors go away and red wireframes show up in the PIP. But I am still getting null for the pose, and no joints show up in the Three.js example. Is there a specific version of Chrome that I need to be using? Thank you.
Stepped into your code. EmulatedXRDevice is constructed on page load and hand-tracking is enabled correctly... But for some reason, _baseMatrix and _inverseBaseMatrix are not populated, and hence getJointPose always returns null.
Looks like I need the Oculus Browser for it to work... the only browser with hand tracking built in.
Would you find the following line in webxr-polyfill.js and set a break point?

```javascript
this.handGamepadInputSources[handIndex].inputSource.hand.get(jointName)._baseMatrix = m;
```
That line gets hit every frame, and _baseMatrix is populated at that point.
```javascript
_updateHandPose(matrixArray, handIndex, jointName) {
  if (!this.hasHandControllers ||
      handIndex >= this.handGamepadInputSources.length ||
      !XRHandJoint[jointName]) {
    return;
  }
  const m = create$6();
  for (let i = 0; i < 16; i++) {
    m[i] = matrixArray[i];
  }
  this.handGamepadInputSources[handIndex].inputSource.hand.get(jointName)._baseMatrix = m; // <- gets updated here
}
```
But somehow, when I call getJointPose, _baseMatrix and _inverseBaseMatrix are both null. It's almost as if they are in different memory spaces: updated in one copy and read from another.
I put a break point where this[Private].baseMatrix = null is assigned, but it never fired between the write and the read. Strange.
Also, notice that XRHandJoint['wrist'] is 0, so the condition !XRHandJoint[jointName] will never allow the wrist to be updated.
Is it possible that I'm calling XRFrame.getJointPose from within requestAnimationFrame's render function, and that at that time the XRFrame provides a snapshot of the current state of the input devices that is never updated by the separate webxr-hand-pose event handler? They would hardly be in sync, since an XRFrame is only valid during the render call. Perhaps _updateHandPose needs to be called from requestAnimationFrame's render function instead of being event-driven?
> Also notice XRHandJoint['wrist'] is 0 therefore the condition !XRHandJoint[jointName] will not allow wrist to be updated.
Good catch. I fixed it.
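The bug boils down to a JavaScript truthiness pitfall: an enum-style lookup where a valid joint maps to 0 fails a `!value` check. A minimal sketch (the `XRHandJoint` table here is truncated and illustrative, and the two guard helpers are hypothetical, not the polyfill's actual code):

```javascript
// Illustrative enum-like table: 'wrist' is joint index 0.
const XRHandJoint = { 'wrist': 0, 'thumb-metacarpal': 1, 'thumb-tip': 4 };

// Buggy guard: !0 is true, so the wrist is treated as an unknown joint.
const isKnownJointBuggy = name => !!XRHandJoint[name];

// Fixed guard: test for presence, not truthiness.
const isKnownJointFixed = name => XRHandJoint[name] !== undefined;
```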
> But somehow when I call getJointPose, _baseMatrix and _inverseBaseMatrix are both null. Almost like they are from different memory space. Updated to one copy and read from another.
> I put a break point where this[Private].baseMatrix = null, but it never happened between read and write. STRANGE.
Hm, that's weird. And I can't reproduce the problem here...
Are you running XR in immersive-vr mode? Your demo video doesn't show a stereo (double) screen.
Yes, but with Stereo Effect unchecked.
Looks like MediaPipe is not providing z-depth at this point, which is needed for animating a 3D hand.
It's a known limitation and should be unrelated to the null pose issue you encountered.
Give me some time to make you a test case, and you can check whether something is wrong with my code or, as I said, whether it has to do with how requestAnimationFrame passes a snapshot of the XRFrame and input sources into the render function.
> An XRFrame represents a snapshot of the state of all of the tracked objects for an XRSession. Applications can acquire an XRFrame by calling requestAnimationFrame() on an XRSession with an XRFrameRequestCallback. When the callback is called it will be passed an XRFrame.
> Each XRFrame represents the state of all tracked objects for a given time, and either stores or is able to query concrete information about this state at the time.
So this problem will manifest itself only when running in immersive-vr mode, after calling xrSession.requestAnimationFrame(myRenderFunction) to replace window.requestAnimationFrame.
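The "different memory spaces" suspicion comes down to whether the frame holds a live reference to the joint objects or a copied snapshot. A plain-JS illustration of the two possibilities (the `Joint` class and frame objects here are hypothetical stand-ins, not the polyfill's actual classes):

```javascript
// Hypothetical illustration of live-reference vs. snapshot semantics.
class Joint { constructor() { this._baseMatrix = null; } }

const joint = new Joint();

// Case 1: the frame keeps a live reference; later writes are visible.
const frameByReference = { joint };

// Case 2: the frame keeps a copy taken at creation; later writes are not.
const frameBySnapshot = { joint: { ...joint } };

// An event handler updates the joint after the "frame" was created.
joint._baseMatrix = new Float32Array(16);

console.log(frameByReference.joint._baseMatrix !== null); // true: write is seen
console.log(frameBySnapshot.joint._baseMatrix !== null);  // false: stale copy
```

(As it turned out later in the thread, the actual cause was an iframe issue rather than snapshot semantics, but the distinction is worth checking when a write seems to disappear.)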
Please visit this page:
https://www.otakhi.com/petridish?load=15802
Open the developer console. Once the web page has loaded, press the play button at the bottom right, followed by the XR button at the bottom right.
It will break at the debugger statement in the script. You can then step into getJointPose to see why it is null.
Thanks for creating the demo.
Would you mind trying https://threejs.org/examples/webxr_vr_handinput.html (not https://threejs.org/examples/#webxr_vr_handinput)?
It seems hand tracking doesn't work in an iframe. (Unrelated to blob in this case.) IIRC it worked before, so I guess I broke it at some point... Let me investigate.
Yep. https://threejs.org/examples/webxr_vr_handinput.html works while https://threejs.org/examples/#webxr_vr_handinput does not.
Cool, we've finally found the root issue. Thanks for bearing with me!
Fixed the problem in #271 and applied the same change to HandWIP. I think hand tracking works with your example now.
In the current state, is it compatible with the events of `hand-tracking-controls` in A-Frame, namely `pinchstarted`, `pinchended`, and `pinchmoved`? https://github.com/aframevr/aframe/blob/master/docs/components/hand-tracking-controls.md#events
Specification: https://immersive-web.github.io/webxr-hand-input/
Using hand pose recognition as the UI (e.g. https://blog.tensorflow.org/2020/03/face-and-hand-tracking-in-browser-with-mediapipe-and-tensorflowjs.html) would probably be good, because controlling the hand object with the mouse may not be very useful.
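For emulating those pinch events, a common heuristic is to threshold the distance between the thumb tip and index finger tip joints. A hedged sketch, not A-Frame's actual implementation: `createPinchDetector` is a hypothetical helper, and the threshold values (with hysteresis so the state doesn't flicker) are assumptions.

```javascript
// Hypothetical pinch detector: returns 'pinchstarted' / 'pinchended' when the
// thumb-tip / index-finger-tip distance crosses a threshold, 'pinchmoved'
// while pinching, and null otherwise. Positions are [x, y, z] in meters.
const PINCH_START = 0.015; // assumed start threshold
const PINCH_END = 0.03;    // assumed end threshold (hysteresis)

function createPinchDetector() {
  let pinching = false;
  return function update(thumbTip, indexTip) {
    const dx = thumbTip[0] - indexTip[0];
    const dy = thumbTip[1] - indexTip[1];
    const dz = thumbTip[2] - indexTip[2];
    const dist = Math.sqrt(dx * dx + dy * dy + dz * dz);
    if (!pinching && dist < PINCH_START) { pinching = true; return 'pinchstarted'; }
    if (pinching && dist > PINCH_END) { pinching = false; return 'pinchended'; }
    return pinching ? 'pinchmoved' : null;
  };
}
```

Feeding it the tip joint positions from getJointPose each frame would let the emulator dispatch the same three event names A-Frame documents.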