takahirox opened this issue 3 years ago
I'm trying https://github.com/tensorflow/tfjs-models/tree/master/handpose
It runs at 30fps on my work Windows laptop and 15fps on my personal Windows laptop. We probably need to run it in a Worker.
It seems Handpose currently detects at most one hand, not two or more. But one hand would be a good start.
Hey there! I've been thinking about how to implement this for some time and would love to help. The latest MediaPipe supports multiple hands but with only 2D landmarks; the previous version had 3D, so I think they will update it soon. In any case, I think it would make the most sense to implement it on top of the TensorFlow.js handpose model, which has 3D landmarks but only for one hand. Fortunately, the keypoints for both are the same (apart from the missing Z), so the emulation will be the same.
Here's a graphic I made contrasting them:
The hand models that A-Frame uses are located at https://cdn.aframe.io/controllers/oculus-hands/unity/left.glb and https://cdn.aframe.io/controllers/oculus-hands/unity/right.glb, so we could potentially use them instead of controllers in DevTools.
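Since the 21 Handpose landmarks line up with a subset of the WebXR joint names, the correspondence could be sketched as a lookup table. This is only an illustrative sketch, not the extension's actual code: `buildHandposeToXRJointMap` is a hypothetical helper, the index order follows Handpose's documented landmark layout, and the four finger metacarpal joints have no Handpose counterpart so they would need to be approximated separately.

```javascript
// Hypothetical sketch: mapping the 21 MediaPipe Handpose landmark indices
// to WebXR Hand Input joint names. The index/middle/ring/pinky metacarpal
// joints have no Handpose counterpart and are left unmapped here.
const FINGERS = ['index-finger', 'middle-finger', 'ring-finger', 'pinky-finger'];

function buildHandposeToXRJointMap() {
  const map = new Map();
  map.set(0, 'wrist');
  // Thumb: landmarks 1..4 (CMC, MCP, IP, tip)
  ['thumb-metacarpal', 'thumb-phalanx-proximal',
   'thumb-phalanx-distal', 'thumb-tip'].forEach((name, i) => map.set(1 + i, name));
  // Other fingers: landmarks 5..20, four per finger (MCP, PIP, DIP, tip)
  FINGERS.forEach((finger, f) => {
    const base = 5 + f * 4;
    map.set(base, `${finger}-phalanx-proximal`);
    map.set(base + 1, `${finger}-phalanx-intermediate`);
    map.set(base + 2, `${finger}-phalanx-distal`);
    map.set(base + 3, `${finger}-tip`);
  });
  return map;
}
```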
I think we would start emulating XRHands in EmulatedXRDevices instead of a gamepadInputSource, but I could be wrong. Is EmulatedXRDevices the main file I need to worry about?
Thanks, Alex
Thanks for the comment.
> I think we would start emulating XRHands in EmulatedXRDevices instead of a gamepadInputSource but I could be wrong. Is the EmulatedXRDevices the main file I need to worry about?
To figure out the change, I'm trying to make an easy prototype now. Please give me some time.
Sounds good. It looks like XRHand isn't implemented in the polyfill either, so I've been looking into that.
And we are discussing how to capture the webcam in the extension in #262. If you are interested, please join us there.
Progress: the prototype works now. https://twitter.com/superhoge/status/1349560469837672452
Testing your excellent hand tracking implementation. However, I'm unable to get the joint pose. Is anything wrong with the following code?
```javascript
var ds = xr.inputSources;
if (ds) {
  ds.forEach(p => {
    if (p.hand instanceof XRHand && p.handedness == 'right') {
      var joints = p.hand.joints;
      for (var key in boneMap) {
        var jointSpace = joints.get(key);
        var pose = xr.frame.getJointPose(jointSpace, xr.refSpace);
        if (pose) {
          console.log(pose); // pose is always null even with the right hand in view of the camera
        }
      }
    }
  });
}
```
I haven't written the documentation yet, but first please check how to use the hand input in the following video.
https://twitter.com/superhoge/status/1356083004754468864
And then try the Three.js hand input example that I use for testing:
https://threejs.org/examples/#webxr_vr_handinput
Let me know if the hand input support still doesn't work.
I checked out the latest version from branch HandWIP.
When I ran the three.js hand tracking demo, the developer console says "The optional feature 'bounded-floor' is not supported" and no hand model shows up.
When I ran my own test app, it says "The optional feature 'hand-tracking' is not supported".
Sorry to trouble you again, but what are the proper features that need to be requested for it to work?
```javascript
var opt = {
  sessionType: 'immersive-vr',
  referenceSpaceType: 'local',
  framebufferScale: 1.0,
  depthNear: c.pref.near,
  depthFar: c.pref.far,
  optionalFeatures: ['hand-tracking']
};
```
If I uncheck Stereo Effect, both of these errors go away and red wireframes show up in the PIP. But I am still getting null for the pose, and no joints show up in the Three.js example. Is there a specific version of Chrome that I need to be using? Thank you.
Stepped into your code. EmulatedXRDevice is constructed on page load and hand-tracking is enabled correctly... But for some reason, _baseMatrix and _inverseBaseMatrix are not populated, and hence getJointPose always returns null.
Looks like I need the Oculus Browser for it to work... the only browser with hand tracking built in.
Would you find the following line in webxr-polyfill.js and set a break point?

```javascript
this.handGamepadInputSources[handIndex].inputSource.hand.get(jointName)._baseMatrix = m;
```
That line gets hit every frame, and _baseMatrix is populated at that point.
```javascript
_updateHandPose(matrixArray, handIndex, jointName) {
  if (!this.hasHandControllers ||
      handIndex >= this.handGamepadInputSources.length ||
      !XRHandJoint[jointName]) {
    return;
  }
  const m = create$6();
  for (let i = 0; i < 16; i++) {
    m[i] = matrixArray[i];
  }
  this.handGamepadInputSources[handIndex].inputSource.hand.get(jointName)._baseMatrix = m; // <- gets updated here
}
```
But somehow, when I call getJointPose, _baseMatrix and _inverseBaseMatrix are both null. It's almost as if they are in different memory spaces: updated in one copy and read from another.
I put a break point where this[Private].baseMatrix = null is assigned, but it never fired between the write and the read. Strange.
Also, notice that XRHandJoint['wrist'] is 0, so the condition !XRHandJoint[jointName] will never allow the wrist to be updated.
Is it possible that I'm calling XRFrame.getJointPose from within requestAnimationFrame's render function, and that at that time the XRFrame provides a snapshot of the current state of the input devices that is never updated by the separate webxr-hand-pose event handler? They would hardly be in sync, since an XRFrame is only valid during the render call. Perhaps _updateHandPose needs to be called from requestAnimationFrame's render function instead of being event-driven?
> Also notice XRHandJoint['wrist'] is 0 therefore the condition !XRHandJoint[jointName] will not allow wrist to be updated.
Good catch. I fixed it.
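The bug boils down to a JavaScript truthiness pitfall: an enum-style lookup where a valid joint maps to 0 fails a `!value` check. A minimal sketch (the `XRHandJoint` table here is truncated and illustrative, and the two guard helpers are hypothetical, not the polyfill's actual code):

```javascript
// Illustrative enum-like table: 'wrist' is joint index 0.
const XRHandJoint = { 'wrist': 0, 'thumb-metacarpal': 1, 'thumb-tip': 4 };

// Buggy guard: !0 is true, so the wrist is treated as an unknown joint.
const isKnownJointBuggy = name => !!XRHandJoint[name];

// Fixed guard: test for presence, not truthiness.
const isKnownJointFixed = name => XRHandJoint[name] !== undefined;
```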
> But somehow when I call getJointPose, _baseMatrix and _inverseBaseMatrix are both null. Almost like they are from different memory space. Updated to one copy and read from another.
> I put a break point where this[Private].baseMatrix = null, but it never happened between read and write. STRANGE.
Hm, that's weird. And I can't reproduce the problem here...
Are you running XR in immersive-vr mode? Your demo video doesn't show a stereo (double) screen.
Yes, but with Stereo Effect unchecked.
Looks like MediaPipe is not providing z-depth at this point, which is needed for animating a 3D hand.
It's a known limitation and should be unrelated to the null pose issue you encountered.
Give me some time to make you a test case, and you can check whether something is wrong with my code or, as I said, whether it has to do with how requestAnimationFrame passes a snapshot of the XRFrame and input sources into the render function.
> An XRFrame represents a snapshot of the state of all of the tracked objects for an XRSession. Applications can acquire an XRFrame by calling requestAnimationFrame() on an XRSession with an XRFrameRequestCallback. When the callback is called it will be passed an XRFrame.
> Each XRFrame represents the state of all tracked objects for a given time, and either stores or is able to query concrete information about this state at the time.
So this problem will manifest itself only when running in immersive-vr mode, after calling xrSession.requestAnimationFrame(myRenderFunction) to replace window.requestAnimationFrame.
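The "different memory spaces" suspicion comes down to whether the frame holds a live reference to the joint objects or a copied snapshot. A plain-JS illustration of the two possibilities (the `Joint` class and frame objects here are hypothetical stand-ins, not the polyfill's actual classes):

```javascript
// Hypothetical illustration of live-reference vs. snapshot semantics.
class Joint { constructor() { this._baseMatrix = null; } }

const joint = new Joint();

// Case 1: the frame keeps a live reference; later writes are visible.
const frameByReference = { joint };

// Case 2: the frame keeps a copy taken at creation; later writes are not.
const frameBySnapshot = { joint: { ...joint } };

// An event handler updates the joint after the "frame" was created.
joint._baseMatrix = new Float32Array(16);

console.log(frameByReference.joint._baseMatrix !== null); // true: write is seen
console.log(frameBySnapshot.joint._baseMatrix !== null);  // false: stale copy
```

(As it turned out later in the thread, the actual cause was an iframe issue rather than snapshot semantics, but the distinction is worth checking when a write seems to disappear.)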
Please visit this page:
https://www.otakhi.com/petridish?load=15802
Open the developer console. Once the web page has loaded, press the play button at the bottom right, followed by the XR button at the bottom right.
It will break at the debugger statement in the script. You can then step into getJointPose to see why it is null.
Thanks for creating the demo.
Would you mind trying https://threejs.org/examples/webxr_vr_handinput.html (not https://threejs.org/examples/#webxr_vr_handinput)?
It seems hand tracking doesn't work in an iframe. (Unrelated to blob in this case.) IIRC it worked before, so I guess I broke it at some point... Let me investigate.
Yep. https://threejs.org/examples/webxr_vr_handinput.html works while https://threejs.org/examples/#webxr_vr_handinput does not.
Cool, we've finally found the root issue. Thanks for bearing with me!
Fixed the problem in #271 and applied the same change to HandWIP. I think hand tracking works with your example now.
In the current state, is it compatible with the events of `hand-tracking-controls` in A-Frame, namely `pinchstarted`, `pinchended`, and `pinchmoved`? https://github.com/aframevr/aframe/blob/master/docs/components/hand-tracking-controls.md#events
Specification: https://immersive-web.github.io/webxr-hand-input/
Using hand pose recognition as the UI (e.g. https://blog.tensorflow.org/2020/03/face-and-hand-tracking-in-browser-with-mediapipe-and-tensorflowjs.html) would probably be good, because controlling the hand object with the mouse may not be very useful.
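For emulating those pinch events, a common heuristic is to threshold the distance between the thumb tip and index finger tip joints. A hedged sketch, not A-Frame's actual implementation: `createPinchDetector` is a hypothetical helper, and the threshold values (with hysteresis so the state doesn't flicker) are assumptions.

```javascript
// Hypothetical pinch detector: returns 'pinchstarted' / 'pinchended' when the
// thumb-tip / index-finger-tip distance crosses a threshold, 'pinchmoved'
// while pinching, and null otherwise. Positions are [x, y, z] in meters.
const PINCH_START = 0.015; // assumed start threshold
const PINCH_END = 0.03;    // assumed end threshold (hysteresis)

function createPinchDetector() {
  let pinching = false;
  return function update(thumbTip, indexTip) {
    const dx = thumbTip[0] - indexTip[0];
    const dy = thumbTip[1] - indexTip[1];
    const dz = thumbTip[2] - indexTip[2];
    const dist = Math.sqrt(dx * dx + dy * dy + dz * dz);
    if (!pinching && dist < PINCH_START) { pinching = true; return 'pinchstarted'; }
    if (pinching && dist > PINCH_END) { pinching = false; return 'pinchended'; }
    return pinching ? 'pinchmoved' : null;
  };
}
```

Feeding it the tip joint positions from getJointPose each frame would let the emulator dispatch the same three event names A-Frame documents.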