asajeffrey closed this issue 4 years ago
We currently have "stereo", "stereo-left-right", etc., but we could replace "stereo" by "however many views the device has".
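For concreteness, a minimal sketch of what a stereo-left-right style layout implies for rendering code; the function and parameter names here are illustrative, not from the spec:

```js
// Sketch only: with a "stereo-left-right" style layout, one color texture holds
// both eyes side by side, so exactly two viewports are baked into the code.
function drawStereoLeftRight(gl, layerWidth, layerHeight, views, drawScene) {
  const eyeWidth = layerWidth / 2;
  views.slice(0, 2).forEach((view, i) => {
    // Left eye in the left half, right eye in the right half.
    gl.viewport(i * eyeWidth, 0, eyeWidth, layerHeight);
    drawScene(view.projectionMatrix, view.transform.inverse.matrix);
  });
}
```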
This is currently unsupported. I'm unaware of systems that render to another view to do screen recordings.
I think more work is needed in the WebXR spec to support this, especially since poses/timestamps will be different, so it's more than just the matrices being different.
We could future-proof this spec though, by not hard-wiring in "stereo" everywhere?
That said, yes, there are nasty questions about how to do things like screen recording when the recording device may be at a different resolution and refresh rate.
We could future-proof this spec though, by not hard-wiring in "stereo" everywhere?
Because there is a lot of content that is set up to do left-to-right or top-to-bottom. If we ever need to support more than 2 views, we can extend the spec.
I think I'm suggesting two things: putting the definition of the number of views in one place, and not hard-wiring that definition to "stereo", since at some point we might want to allow other values for that constant. I'm more concerned with the first point, which is making it easier to make the spec extension, since there will be only one place that needs changing rather than quite a few.
I'd rather wait until there's more clarity on what is actually needed. I'm pretty sure that a camera view is going to need its own renderloop with a different framerate so it would be better to solve that specific problem.
If we ever need to support more than 2 views, we can extend the spec.
We do now.
The Hololens 2 supports a "secondary view" (see the openxr extensions for secondary views, and the observer view configuration). They work at the same framerate; it's for video capture and casting: because the Hololens has no camera feed centered on each eye, recordings it produces are slightly offset unless this is used.
Varjo VR-2 also has a "bionic display" concept, which has two views for each eye: one big and low-res, one small and high-res. OpenXR similarly has a proposed extension for quad views from Varjo. This would also be consumed at the same framerate, it seems.
I think that support for a camera view or a bionic display should be specified in different specifications (which would extend the WebXR and Layers specs).
From past experience, getting the scene to line up with a camera capture is not trivial.
WebXR is already able to handle this, it just needs to add more views to the view array.
From past experience, getting the scene to line up with a camera capture is not trivial.
I'm not sure how this is a problem? This is an additional view, and has additional poses in getViewerPose().
Having a third view solves the problem of alignment because the underlying system can provide a view offset origin corresponding to the camera.
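To illustrate that reading of the core spec, a minimal sketch using the plain XRWebGLLayer path; xrRefSpace, gl and drawScene are assumed to exist, and whether a camera view would actually show up in viewerPose.views like this is the open question in this thread:

```js
// Sketch: the frame loop already iterates over however many views the device
// reports, so a third (camera) view would just be one more iteration, with
// its own transform and projection matrix coming from getViewerPose().
function onXRFrame(time, frame) {
  const session = frame.session;
  session.requestAnimationFrame(onXRFrame);

  const pose = frame.getViewerPose(xrRefSpace);
  if (!pose) return;

  const glLayer = session.renderState.baseLayer;
  gl.bindFramebuffer(gl.FRAMEBUFFER, glLayer.framebuffer);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

  for (const view of pose.views) {
    const vp = glLayer.getViewport(view);
    gl.viewport(vp.x, vp.y, vp.width, vp.height);
    drawScene(view.projectionMatrix, view.transform.inverse.matrix);
  }
}
```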
WebXR is already able to handle this, it just needs to add more views to the view array.
If you read the overview in the OpenXR spec, much more than just providing a new view needs to be done. For instance, in an additive device, the extra view should use alpha compositing.
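As a rough illustration of what per-view alpha compositing could mean for application code; isObserverView() is a hypothetical helper, since how such a view would be identified is part of what's under discussion:

```js
// Sketch: on an additive (see-through) device, the eye views can ignore alpha,
// but a capture/observer view is composited over a camera image, so it needs
// real alpha blending. isObserverView() is hypothetical, not a WebXR API.
function setBlendStateForView(gl, view) {
  if (isObserverView(view)) {
    gl.enable(gl.BLEND);
    gl.blendFunc(gl.ONE, gl.ONE_MINUS_SRC_ALPHA); // premultiplied alpha
  } else {
    gl.disable(gl.BLEND);
  }
}
```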
From past experience, getting the scene to line up with a camera capture is not trivial.
I'm not sure how this is a problem? This is an additional view, and has additional poses in getViewerPose(). Having a third view solves the problem of alignment because the underlying system can provide a view offset origin corresponding to the camera.
It's not just an alignment in space; it also needs to be done in time. The OpenXR spec also points this out:
Applications enabling this view configuration must call xrLocateViews a second time each frame to explicitly query the view state for the XR_VIEW_CONFIGURATION_TYPE_SECONDARY_MONO_FIRST_PERSON_OBSERVER_MSFT configuration.
The OpenXR spec also points this out:
xrLocateViews is just about getting the pose information? Calling it a second time is necessary because openxr exposes it as a "second configuration".
Can you start by getting the WebXR spec updated with a new view type? Then introduce a new feature to turn on this new camera view. We can then also get more information from Microsoft on how they are able to get the camera feed to line up with the rendered scene without any shaking or misalignment.
Once we have that, we can update the layers spec. I suspect that all we need is additional color and depth textures.
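For reference, a hedged sketch of how the existing per-view texture path in the layers draft might stretch to an extra view, assuming a WebGL2 context and that the extra view simply appears in pose.views; session, gl, pose and drawScene are assumed names, and the API calls reflect my reading of the current draft:

```js
// Sketch: with a projection layer, each view gets its own sub-image, so an
// additional view just means one more color/depth texture to render into.
const binding = new XRWebGLBinding(session, gl);
const projLayer = binding.createProjectionLayer({ depthFormat: gl.DEPTH_COMPONENT24 });
session.updateRenderState({ layers: [projLayer] });

const fb = gl.createFramebuffer();

function renderViews(pose) {
  for (const view of pose.views) {
    const subImage = binding.getViewSubImage(projLayer, view);
    gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                            gl.TEXTURE_2D, subImage.colorTexture, 0);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT,
                            gl.TEXTURE_2D, subImage.depthStencilTexture, 0);
    const vp = subImage.viewport;
    gl.viewport(vp.x, vp.y, vp.width, vp.height);
    drawScene(view.projectionMatrix, view.transform.inverse.matrix);
  }
}
```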
The WebXR spec does not need a new view type, it supports this just fine. You provide three views, one with an eye of "none".
The WebXR spec puts no constraints on the number of views: CAVE systems are implicitly supported, though I don't think anyone has made that happen yet.
I still don't see how lining up the camera with the rendered scene is a problem; they just need to expose the view origin as being where the camera is. That's the whole point of this feature.
The reason openxr needs to do this as a separate view configuration is that openxr's API is negotiation-based: you tell it what kind of view you want and it gives it to you. On the other hand, webxr will flat out give you whatever views it feels like, and you're supposed to deal with it. So we can surface extra views if we want, easily.
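Under that reading, content that already loops over the views array would not need structural changes; a short sketch, assuming the extra view is reported with an eye of "none" and using a placeholder renderView callback:

```js
// Sketch: the regular eye views keep their usual treatment; an additional
// view (eye === "none") could use a cheaper path, or be skipped entirely
// if the app doesn't want to appear in captures.
function renderAllViews(pose, renderView) {
  for (const view of pose.views) {
    const isExtraView = view.eye === "none";
    if (isExtraView) {
      // e.g. drop expensive per-eye effects for a spectator/camera view.
    }
    renderView(view, { cheap: isExtraView });
  }
}
```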
On the other hand, webxr will flat out give you whatever views it feels like, and you're supposed to deal with it. So we can surface extra views if we want, easily.
It's not that simple, and I'm wary of updating the layers spec for something that will likely not work well. Write up an explainer or file an issue with your thoughts. Then let's take this up with the group (especially @thetuvix) so we're all on the same page.
@cwilso @AdaRoseCannon, would this type of experience be part of our charter?
Our charter doesn't constrain the kinds of devices we support, it just has examples. Given that there are two devices that support extra views in some form, and given that the core spec already supports this use case at a basic level, this seems to mostly be a layers issue.
An explainer would be premature here: the question is how to fit this into the layers spec.
The core spec issue I see arising from this is:
I think we should have a different session/renderloop with a new renderstate so we can set the blend mode and match the camera framerate. The layers spec can be changed to allow that (by moving the session around).
The layers themselves and the gl context would be shared between the sessions.
/agenda should the layers spec be updated to support a spectator view?
It doesn't seem like the openxr varjo quad view or the camera view requires a separate framerate.
@cabanier I know I'm being a broken record here, but one concern is that the number 2 is hard-wired into the layers spec in various places in a way that it isn't in the main webxr spec. At least we should future-proof the spec so that 2 isn't a magic number that's difficult to change.
And while I was filing https://github.com/immersive-web/webxr/issues/1013, I discovered that the webxr spec does have a non-normative note (https://immersive-web.github.io/webxr/#xrviewerpose-interface) saying that spectator mode is supported.
@cabanier I know I'm being a broken record here, but one concern is that the number 2 is hard-wired into the layers spec in various places in a way that it isn't in the main webxr spec. At least we should future-proof the spec so that 2 isn't a magic number that's difficult to change.
I will address that in a PR
For example, devices may have a third view for an on-board camera, used when generating screen recordings (so the XR content lines up with the real-world content).