asajeffrey closed this issue 4 years ago
We currently have "stereo", "stereo-left-right", etc., but we could replace "stereo" by "however many views the device has".
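For concreteness, a minimal sketch of what a stereo-left-right style layout implies for rendering code; the function and parameter names here are illustrative, not from the spec:

```js
// Sketch only: with a "stereo-left-right" style layout, one color texture holds
// both eyes side by side, so exactly two viewports are baked into the code.
function drawStereoLeftRight(gl, layerWidth, layerHeight, views, drawScene) {
  const eyeWidth = layerWidth / 2;
  views.slice(0, 2).forEach((view, i) => {
    // Left eye in the left half, right eye in the right half.
    gl.viewport(i * eyeWidth, 0, eyeWidth, layerHeight);
    drawScene(view.projectionMatrix, view.transform.inverse.matrix);
  });
}
```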
This is currently unsupported. I'm unaware of systems that render to another view to do screen recordings.
I think more work is needed in the WebXR spec to support this, especially since poses/timestamps will be different, so it's more than just the matrices being different.
We could future-proof this spec though, by not hard-wiring in "stereo" everywhere?
That said, yes, there are nasty questions about how to do things like screen recording when the recording device may be at a different resolution and refresh rate.
We could future-proof this spec though, by not hard-wiring in "stereo" everywhere?
Because there is a lot of content that is set up to do left-to-right or top-to-bottom. If we ever need to support more than 2 views, we can extend the spec.
I think I'm suggesting two things: putting the definition of the number of views in one place, and not hard-wiring that definition to "stereo", since at some point we might want to allow other values for that constant. I'm more concerned with the first point, which is making it easier to make the spec extension, since there will be only one place that needs changing rather than quite a few.
I'd rather wait until there's more clarity on what is actually needed. I'm pretty sure that a camera view is going to need its own renderloop with a different framerate so it would be better to solve that specific problem.
If we ever need to support more than 2 views, we can extend the spec.
We do now.
The Hololens 2 supports a "secondary view" (see the openxr extensions for secondary views, and the observer view configuration). They work at the same framerate; it's for video capture and casting: because the Hololens has no camera feed centered on each eye, recordings it produces are slightly offset unless this is used.
Varjo VR-2 also has a "bionic display" concept, which has two views for each eye: one big and low-res, one small and high-res. OpenXR similarly has a proposed extension for quad views from Varjo. This would also be consumed at the same framerate, it seems.
I think that support for a camera view or a bionic display should be specified in different specifications (which would extend the WebXR and Layers specs).
From past experience, getting the scene to line up with a camera capture is not trivial.
WebXR is already able to handle this, it just needs to add more views to the view array.
From past experience, getting the scene to line up with a camera capture is not trivial.
I'm not sure how this is a problem? This is an additional view, and has additional poses in getViewerPose().
Having a third view solves the problem of alignment because the underlying system can provide a view offset origin corresponding to the camera.
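To illustrate that reading of the core spec, a minimal sketch using the plain XRWebGLLayer path; xrRefSpace, gl and drawScene are assumed to exist, and whether a camera view would actually show up in viewerPose.views like this is the open question in this thread:

```js
// Sketch: the frame loop already iterates over however many views the device
// reports, so a third (camera) view would just be one more iteration, with
// its own transform and projection matrix coming from getViewerPose().
function onXRFrame(time, frame) {
  const session = frame.session;
  session.requestAnimationFrame(onXRFrame);

  const pose = frame.getViewerPose(xrRefSpace);
  if (!pose) return;

  const glLayer = session.renderState.baseLayer;
  gl.bindFramebuffer(gl.FRAMEBUFFER, glLayer.framebuffer);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

  for (const view of pose.views) {
    const vp = glLayer.getViewport(view);
    gl.viewport(vp.x, vp.y, vp.width, vp.height);
    drawScene(view.projectionMatrix, view.transform.inverse.matrix);
  }
}
```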
WebXR is already able to handle this, it just needs to add more views to the view array.
If you read the overview in the OpenXR spec, much more than just providing a new view needs to be done. For instance, in an additive device, the extra view should use alpha compositing.
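As a rough illustration of what per-view alpha compositing could mean for application code; isObserverView() is a hypothetical helper, since how such a view would be identified is part of what's under discussion:

```js
// Sketch: on an additive (see-through) device, the eye views can ignore alpha,
// but a capture/observer view is composited over a camera image, so it needs
// real alpha blending. isObserverView() is hypothetical, not a WebXR API.
function setBlendStateForView(gl, view) {
  if (isObserverView(view)) {
    gl.enable(gl.BLEND);
    gl.blendFunc(gl.ONE, gl.ONE_MINUS_SRC_ALPHA); // premultiplied alpha
  } else {
    gl.disable(gl.BLEND);
  }
}
```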
From past experience, getting the scene to line up with a camera capture is not trivial.
I'm not sure how this is a problem? This is an additional view, and has additional poses in getViewerPose(). Having a third view solves the problem of alignment because the underlying system can provide a view offset origin corresponding to the camera.
It's not just an alignment in space; it also needs to be done in time. The OpenXR spec also points this out:
Applications enabling this view configuration must call xrLocateViews a second time each frame to explicitly query the view state for the XR_VIEW_CONFIGURATION_TYPE_SECONDARY_MONO_FIRST_PERSON_OBSERVER_MSFT configuration.
The OpenXR spec also points this out:
xrLocateViews is just about getting the pose information? Calling it a second time is necessary because openxr exposes it as a "second configuration".
Can you start by getting the WebXR spec updated with a new view type? Then introduce a new feature to turn on this new camera view. We can then also get more information from Microsoft on how they are able to get the camera feed to line up with the rendered scene without any shaking or misalignment.
Once we have that, we can update the layers spec. I suspect that all we need is additional color and depth textures.
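For reference, a hedged sketch of how the existing per-view texture path in the layers draft might stretch to an extra view, assuming a WebGL2 context and that the extra view simply appears in pose.views; session, gl, pose and drawScene are assumed names, and the API calls reflect my reading of the current draft:

```js
// Sketch: with a projection layer, each view gets its own sub-image, so an
// additional view just means one more color/depth texture to render into.
const binding = new XRWebGLBinding(session, gl);
const projLayer = binding.createProjectionLayer({ depthFormat: gl.DEPTH_COMPONENT24 });
session.updateRenderState({ layers: [projLayer] });

const fb = gl.createFramebuffer();

function renderViews(pose) {
  for (const view of pose.views) {
    const subImage = binding.getViewSubImage(projLayer, view);
    gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                            gl.TEXTURE_2D, subImage.colorTexture, 0);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT,
                            gl.TEXTURE_2D, subImage.depthStencilTexture, 0);
    const vp = subImage.viewport;
    gl.viewport(vp.x, vp.y, vp.width, vp.height);
    drawScene(view.projectionMatrix, view.transform.inverse.matrix);
  }
}
```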
The WebXR spec does not need a new view type, it supports this just fine. You provide three views, one with an eye of "none".
The WebXR spec puts no constraints on the number of views: CAVE systems are implicitly supported, though I don't think anyone has made that happen yet.
I still don't see how lining up the camera with the rendered scene is a problem; they just need to expose the view origin as being where the camera is. That's the whole point of this feature.
The reason openxr needs to do this as a separate view configuration is that openxr's API is negotiation-based: you tell it what kind of view you want and it gives it to you. On the other hand, webxr will flat out give you whatever views it feels like, and you're supposed to deal with it. So we can surface extra views if we want, easily.
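Under that reading, content that already loops over the views array would not need structural changes; a short sketch, assuming the extra view is reported with an eye of "none" and using a placeholder renderView callback:

```js
// Sketch: the regular eye views keep their usual treatment; an additional
// view (eye === "none") could use a cheaper path, or be skipped entirely
// if the app doesn't want to appear in captures.
function renderAllViews(pose, renderView) {
  for (const view of pose.views) {
    const isExtraView = view.eye === "none";
    if (isExtraView) {
      // e.g. drop expensive per-eye effects for a spectator/camera view.
    }
    renderView(view, { cheap: isExtraView });
  }
}
```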
On the other hand, webxr will flat out give you whatever views it feels like, and you're supposed to deal with it. So we can surface extra views if we want, easily.
It's not that simple, and I'm wary of updating the layers spec for something that will likely not work well. Write up an explainer or file an issue with your thoughts. Then let's take this up with the group (especially @thetuvix) so we're all on the same page.
@cwilso @AdaRoseCannon, would this type of experience be part of our charter?
Our charter doesn't constrain the kinds of devices we support, it just has examples. Given that there are two devices that support extra views in some form, and given that the core spec already supports this use case at a basic level, this seems to mostly be a layers issue.
An explainer would be premature here: the question is how to fit this into the layers spec.
The core spec issue I see arising from this is:
I think we should have a different session/renderloop with a new renderstate so we can set the blend mode and match the camera framerate. The layers spec can be changed to allow that (by moving the session around).
The layers themselves and the gl context would be shared between the sessions.
/agenda should the layers spec be updated to support a spectator view?
It doesn't seem like the openxr varjo quad view or the camera view requires a separate framerate.
@cabanier I know I'm being a broken record here, but one concern is that the number 2 is hard-wired into the layers spec in various places in a way that it isn't in the main webxr spec. At least we should future-proof the spec so that 2 isn't a magic number that's difficult to change.
And while I was filing https://github.com/immersive-web/webxr/issues/1013, I discovered that the webxr spec does have a non-normative note (https://immersive-web.github.io/webxr/#xrviewerpose-interface) saying that spectator mode is supported.
@cabanier I know I'm being a broken record here, but one concern is that the number 2 is hard-wired into the layers spec in various places in a way that it isn't in the main webxr spec. At least we should future-proof the spec so that 2 isn't a magic number that's difficult to change.
I will address that in a PR
For example, devices may have a third view for an on-board camera, used when generating screen recordings (so the XR content lines up with the real-world content).