immersive-web / proposals

Initial proposals for future Immersive Web work (see README)

Allow the sites to request AR sessions using front-facing camera #78

Open bialpio opened 1 year ago

bialpio commented 1 year ago

Summary

Provide a way for the sites to request an AR session that uses a user-facing camera if it is available.

Example use cases

Potential use cases:

This assumes that the sites can leverage the information we already provide in AR sessions since we do not plan on enabling face detection with this proposal. Depth sensing / hit-test should be sufficient for interacting with the user directly for various physics-based effects.

A rough idea or two about implementation

One approach would be to expose this as an additional session feature, mostly because some other features can become unavailable in user-facing AR sessions. We would also need to ensure that sites can either tolerate mirrored images (users usually expect selfie camera images to be mirrored) or configure the mirroring.
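
As a rough sketch of the session-feature approach: the snippet below asks for the feature as optional and then checks whether it was granted. The descriptor string `"front-facing"` and the helper names are assumptions made for illustration, not anything that has shipped. The small interfaces stand in for `navigator.xr`, `XRSessionInit`, and `XRSession` so the sketch is self-contained.

```typescript
// Minimal stand-ins for the browser types (navigator.xr, XRSessionInit, XRSession).
interface SessionInit {
  requiredFeatures?: string[];
  optionalFeatures?: string[];
}
interface SessionLike {
  readonly enabledFeatures?: readonly string[];
}
interface XRSystemLike<S extends SessionLike> {
  requestSession(mode: "immersive-ar", init?: SessionInit): Promise<S>;
}

// HYPOTHETICAL feature descriptor; the final name is not decided in this thread.
const FRONT_FACING = "front-facing";

// Request the user-facing camera as optional so the session can still start
// on devices that cannot provide it.
function buildUserFacingInit(required: string[] = ["hit-test"]): SessionInit {
  return { requiredFeatures: required, optionalFeatures: [FRONT_FACING] };
}

// True only when the session reports the hypothetical feature as enabled.
function usesFrontFacing(session: SessionLike): boolean {
  return session.enabledFeatures?.includes(FRONT_FACING) ?? false;
}

async function requestUserFacingAR<S extends SessionLike>(
  xr: XRSystemLike<S>,
): Promise<{ session: S; frontFacing: boolean }> {
  const session = await xr.requestSession("immersive-ar", buildUserFacingInit());
  return { session, frontFacing: usesFrontFacing(session) };
}
```

A site could use the `frontFacing` flag to decide whether to lay out its content for a mirrored selfie view or for the usual environment-facing view.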

See also: immersive-web/administrivia#190, immersive-web/webxr-ar-module#64.

cabanier commented 1 year ago

If you want to use this for face or eye effects, would this make sense for VR as well? Also, is there already a way to ask for the front camera? What would this unlock beyond a video feed?

Yonet commented 1 year ago

/agenda vote for repo owner for the repo: https://github.com/immersive-web/front-facing-camera

AdaRoseCannon commented 1 year ago

I'd love faces as a tracked object in a mode like this, to allow developers to make simple filters like accessories and such in a safe, privacy-preserving way.

cabanier commented 1 year ago

I'd love faces as a tracked object in a mode like this, to allow developers to make simple filters like accessories and such in a safe, privacy-preserving way.

If we want face effects, it would be better to have a direct API to do so.

DRx3D commented 1 year ago

Facial effects/features are being worked on by other groups (Khronos, VRM, Snap), which are looking at facial detection, identification, and anchors. It would be good to make sure the work is all compatible.

cabanier commented 1 year ago

Facial effects/features are being worked on by other groups (Khronos, VRM, Snap), which are looking at facial detection, identification, and anchors. It would be good to make sure the work is all compatible.

Our APIs will likely mirror the ones from OpenXR with additional privacy controls

bialpio commented 1 year ago

Also, is there already a way to ask for the front camera? What would this unlock beyond a video feed?

I don't think we already have a way of entering an AR session with the front-facing camera, hence the proposal. Ideally, it'd enable us to provide access to most (if not all) of the existing AR-centric features, but that's something we'd like to look into.

I'd love faces as a tracked object in a mode like this, to allow developers to make simple filters like accessories and such in a safe, privacy-preserving way.

I recall we have been toying with a face-tracking API a while ago but I don't have the full story there. If there's interest in this, it may be something worth looking into again? FYI @alcooper91.

Facial effects/features are being worked on by other groups (Khronos, VRM, Snap), which are looking at facial detection, identification, and anchors. It would be good to make sure the work is all compatible.

Agreed, we need to make sure that whatever we come up with here won't close the door to other features. I think it may be worth imagining that e.g. "face-detection" exists, and the challenge is to ensure that the "front-facing" feature does not collide with it (e.g. let's assume we have a session request that requires "face-detection": IMO it implies that the "front-facing" feature is also requested). I'll take it into account when writing the explainer, stay tuned.
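
To make the "requesting one feature implies another" idea concrete, here is a minimal sketch, assuming a lookup table of implications that the user agent would consult before resolving a session request. Both `"face-detection"` and `"front-facing"` are hypothetical descriptors, and `IMPLIES` and `expandFeatures` are names invented for this example.

```typescript
// HYPOTHETICAL implication table: neither descriptor is a shipped WebXR feature.
const IMPLIES: Record<string, string[]> = {
  "face-detection": ["front-facing"],
};

// Expand a requested feature list with everything it transitively implies.
// The result is de-duplicated and sorted so it is stable to compare.
function expandFeatures(requested: string[]): string[] {
  const resolved = new Set<string>();
  const pending = requested.slice();
  while (pending.length > 0) {
    const feature = pending.pop() as string;
    if (resolved.has(feature)) continue; // already handled, also guards against cycles
    resolved.add(feature);
    for (const implied of IMPLIES[feature] ?? []) {
      pending.push(implied);
    }
  }
  return Array.from(resolved).sort();
}
```

With this table, a request that lists only "face-detection" expands to both "face-detection" and "front-facing", while a request for "hit-test" alone is left unchanged.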

alcooper91 commented 1 year ago

At the time I looked into face detection (which is exposed by ARKit and ARCore), I believe I had proposed integration with the WebRTC APIs, primarily because front-facing didn't exist for WebXR, and at the time I didn't think we'd really get any XR features included with it. Further, the use cases from the most interested parties I was exploring weren't on WebXR (I was primarily talking with video conferencing sites). I believe there may be more support for XR features now, such that I'd have a different opinion on how to go about doing this. Regardless, I don't think the shape of the Face Detection structures I was proposing would really change, just where they live. Let me know if you want me to link to that old explainer.

There's perhaps an interesting intersection/consideration here of "what about features that require being in front-facing mode" (or require not being in it) that we should address as well.
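
One way to picture that intersection, as a sketch only: give each feature an optional camera-facing constraint, default to the rear-facing camera, and reject a request whose required features disagree. The constraint table is entirely hypothetical (which features work in which mode is still an open question here), and `resolveFacing` is a name made up for this example.

```typescript
type Facing = "user" | "environment";

// HYPOTHETICAL constraints, for illustration only.
const FACING_CONSTRAINT: Partial<Record<string, Facing>> = {
  "front-facing": "user",
  "face-detection": "user",
  "plane-detection": "environment",
};

type Resolution =
  | { ok: true; facing: Facing }
  | { ok: false; conflict: [string, string] };

// Pick the camera facing that satisfies every required feature, or report
// the first pair of features that cannot be satisfied together.
function resolveFacing(requiredFeatures: string[]): Resolution {
  let facing: Facing | undefined;
  let decidedBy = "";
  for (const feature of requiredFeatures) {
    const needs = FACING_CONSTRAINT[feature];
    if (needs === undefined) continue; // feature works in either mode
    if (facing !== undefined && facing !== needs) {
      return { ok: false, conflict: [decidedBy, feature] };
    }
    if (facing === undefined) {
      facing = needs;
      decidedBy = feature;
    }
  }
  // No constraint at all: keep the rear-facing default.
  return { ok: true, facing: facing ?? "environment" };
}
```

Under these assumed constraints, requiring "face-detection" alone resolves to the user-facing camera, while requiring it together with "plane-detection" is reported as a conflict instead of silently dropping one of them.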

cabanier commented 1 year ago

I believe there may be more support for XR features now, such that I'd have a different opinion on how to go about doing this. Regardless, I don't think the shape of the Face Detection structures I was proposing would really change, just where they live. Let me know if you want me to link to that old explainer.

Yes, we're interested in supporting eye and face tracking. Could you link to the explainer?

alcooper91 commented 1 year ago

This was the old explainer: https://github.com/alcooper91/face-mesh/blob/master/README.md

I had also originally discussed this explainer here: https://github.com/immersive-web/administrivia/issues/125

The FaceMesh interface is probably the only re-usable bit and even that could use some tweaking (e.g. the position/orientation DOMPointReadOnly should likely be removed and can be queried via getPose).

Any face-detection specific features are probably worth discussing in a separate issue in the front-facing repo, but I mention it because it's something that I think wouldn't be supported on the "rear-facing" camera, so there may be some additional reasoning we should do about interactions between requested features and how those are resolved. (We've thought about this somewhat with front-facing-camera, since not all features may be enabled in this "mode" and we've more or less set up a "rear-facing" camera default stance. Still, I think how to handle something like "Feature X requires Feature Y" is an important consideration as we develop features.)

cabanier commented 1 year ago

Thanks @alcooper91! Interesting how this is different from OpenXR. There you get a list of "expressions" and it's up to the author to map those to an avatar. So, there are no "poses", just values associated with an expression (which include eye location)

bialpio commented 1 year ago

Any face-detection specific features are probably worth discussing in a separate issue in the front-facing repo

FWIW, I'd like to limit the scope of this feature to just "AR using front-facing camera". We need to ensure that the features are not stomping over each other, but I don't think we need full agreement in the more concrete discussions about "face-tracking" / "face-detection" features before making progress on the "front-facing" feature.

Edit: to quote from the initial issue: "we do not plan on enabling face detection with this proposal". :)

alcooper91 commented 1 year ago

+1 Agree they shouldn't block. Was trying to ensure I didn't derail this issue with discussion of "face-tracking"/"face-detection". Those should likely be discussed in other issues in this current repo actually.

Though as I mentioned I do think we should consider how such features might interact with front-facing in terms of "features that cannot work unless front facing is enabled", since front-facing is very close to a "mode" (although I don't think we should add it as such).

Edit: FWIW I could imagine other features being in a similar predicament e.g. perhaps Planes and the Semantic labeling we were discussing a few weeks ago

cabanier commented 1 year ago

+1 Agree they shouldn't block. Was trying to ensure I didn't derail this issue with discussion of "face-tracking"/"face-detection". Those should likely be discussed in other issues in this current repo actually.

Definitely, let's keep them separate since they aren't related

Edit: FWIW I could imagine other features being in a similar predicament e.g. perhaps Planes and the Semantic labeling we were discussing a few weeks ago

In that case, semantic labeling would be part of the planes spec so it makes sense to discuss them in the same repo.