Closed: asajeffrey closed this 4 years ago
That is correct. I believe I brought this up before and I was told that Hololens works around this by just allocating the same size texture for the camera as for the eyes. We also don't handle things like different blend modes and frame rates.
If we're going to properly support a camera view, we should add explicit support in the WebXR spec.
@Manishearth does the hololens camera view have the same resolution as the eye displays?
Blend mode is definitely an issue, cf. https://github.com/immersive-web/webxr-ar-module/issues/53
Frame rate is a tricky one; I'd expect lots of content assumes there's only one rAF cadence and will be quite surprised if different rAF callbacks have different views.
I believe the hololens matches frame rates (either by reducing eye framerate when recording or by ignoring every other camera frame).
The resolutions are indeed different, but this isn't a big deal, views can have different sizes.
> I believe the hololens matches frame rates (either by reducing eye framerate when recording or by ignoring every other camera frame).
> The resolutions are indeed different, but this isn't a big deal, views can have different sizes.
This means that texture arrays are not going to work in this workflow. This is another indication that we should treat the observer view differently from the regular views.
To me that seems like it's an indication that the texture array approach could be improved :smile: . Observer views aren't the only such example, the Varjo quad display is also one, as are potential CAVE systems. The observer view is just the one I've actually been working with.
> To me that seems like it's an indication that the texture array approach could be improved 😄 .
WebGL defines that all textures in a texture array are the same size. If different views have different sizes, we need to specify that the UA has to reject texture arrays for projection layers. (I will add some spec text to clarify this)
A lot of experiences are going to break if the recommended workflow isn't working anymore. As @thetuvix mentioned, a UA might have to work around this by calculating the observer view itself or letting the experience set up a different rendering pipeline for the observer view.
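To make that concrete, here's a minimal content-side sketch of what that could look like, assuming the UA signals the rejection by throwing from `createProjectionLayer` (the exact error behaviour is exactly what the spec text would need to pin down; the session and WebGL2 context are assumed to already exist):

```js
// Sketch: try a texture-array projection layer and fall back to plain
// textures if the UA rejects it (e.g. because the session's views have
// different recommended sizes). The error behaviour is an assumption.
function setupProjectionLayer(session, gl) {
  const binding = new XRWebGLBinding(session, gl);
  let layer;
  try {
    layer = binding.createProjectionLayer({ textureType: 'texture-array' });
  } catch (e) {
    // Views with mismatched sizes can't share one GL texture array,
    // so fall back to one texture per view.
    layer = binding.createProjectionLayer({ textureType: 'texture' });
  }
  session.updateRenderState({ layers: [layer] });
  return layer;
}
```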
Right, so perhaps the solution there is to allow multiple projection layers, or for a projection layer to have multiple texture arrays. "Letting the experience set up a different rendering pipeline" is exactly the issue here :smile:
I think there should be a new session for the observer with its own frame rate, blend mode and views array. Tinkering with the textures will result in confusing logic in the spec and eventually author code.
Hmm, that's an interesting idea, having more than one session per experience. At the moment, sessions are initiated by the content, how would it work if the device wanted to use more than one session? I suspect a lot of code will break if there's more than one session object.
Right, it feels like the openxr model for this is the right one: you have a single session with multiple "view configurations" that can be addressed independently.
I think multiple sessions will be way more confusing to deal with both in spec and author code -- there's so much cross-synchronization that would need to be done.
In terms of different frame rates... are there devices with different frame rates where there's not a single main frame rate we can use? E.g. for devices with a camera, the frame rate of the headset display, not the camera.
> In terms of different frame rates... are there devices with different frame rates where there's not a single main frame rate we can use? E.g. for devices with a camera, the frame rate of the headset display, not the camera.
Magic Leap's display runs at 120fps. I'm unsure what the frame rate of its camera is but it's surely much lower. @thetuvix said that Hololens renders all frames for the camera but ends up throwing most of them away which is not ideal.
Moreover, it will be hard to match up predicted camera poses with predicted viewer ones which will make it hard to avoid jittering.
> @thetuvix said that Hololens renders all frames for the camera but ends up throwing most of them away which is not ideal.
Right, but that's already a choice made by Hololens, we're not making that choice for them.
> I think multiple sessions will be way more confusing to deal with both in spec and author code -- there's so much cross-synchronization that would need to be done.
No, I think this will be far less confusing because it allows you to break up your logic at a very high level: game logic + your existing stereo renderer + your existing mono renderer, vs. game logic + a hybrid mono/stereo renderer filled with if/else blocks.
> @thetuvix said that Hololens renders all frames for the camera but ends up throwing most of them away which is not ideal.
> Right, but that's already a choice made by Hololens, we're not making that choice for them.
Can you elaborate? Are you saying we should also drop frames?
We could provide that info to content providers, e.g. when content requests the subimage for a view, we could provide a flag saying "this subimage will be thrown away". If we wanted to be more aggressive about it, we could return a null subimage, though I suspect this would result in a lot of content throwing exceptions.
I should probably give an example of what I meant by having a main frame rate...
Imagine a headset that runs at 120fps, with a camera running at 25fps. If we tried matching both frame rates, we'd end up running (if I've done the math correctly) 140fps, with uneven gaps between the frames. But... I'm not sure such devices exist! I suspect that most devices have the secondary display running at a fraction of the primary (e.g. 30fps rather than 25fps).
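For concreteness, here's the arithmetic behind those numbers (just a back-of-the-envelope sketch, nothing WebXR-specific): honouring two exact rates gives rateA + rateB deadlines per second, minus gcd(rateA, rateB) for the deadlines that coincide.

```js
// Distinct frame deadlines per second when honouring two display rates
// exactly; coinciding deadlines happen gcd(a, b) times per second.
function gcd(a, b) {
  return b === 0 ? a : gcd(b, a % b);
}

function combinedCadence(rateA, rateB) {
  return rateA + rateB - gcd(rateA, rateB);
}

console.log(combinedCadence(120, 25)); // 140 -> uneven gaps between frames
console.log(combinedCadence(120, 30)); // 120 -> camera lines up with every 4th display frame
```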
I think we should discuss this in a call. I'm wary of adding a lot of special-case code if we can solve it in a cleaner way.
/agenda how should we handle views from different devices (ie eye displays + observer)
I'm not sure I'd word it that way, for the case of HoloLens, there's only one device but it's got more than two views, and they have different properties (e.g. resolution, alpha blend, framerate,...).
> I'm not sure I'd word it that way, for the case of HoloLens, there's only one device but it's got more than two views, and they have different properties (e.g. resolution, alpha blend, framerate,...).
OK :-) More than one type of display per session.
Maybe I should open an issue on the WebXR spec and propose an "observer" session that can run concurrently with an immersive one.
I really think that an observer session would be super heavyweight.
Furthermore, as mentioned before, we need to deal with this for quad views and CAVE anyway. My understanding is that the webxr spec was intentionally designed with an uncapped and unconstrained number of views. We should try and handle this without capping that number back at two.
> I really think that an observer session would be super heavyweight.
> Furthermore, as mentioned before, we need to deal with this for quad views and CAVE anyway. My understanding is that the webxr spec was intentionally designed with an uncapped and unconstrained number of views.
I think we can deal with quad views and CAVE since those systems all render at the same framerate, blend mode, time warp, etc. Observer views are different and yes, they are heavyweight. There is no way around it. Simply adding a view and expecting it to render correctly won't work (as @thetuvix also mentioned).
> I think we can deal with quad views and CAVE since those systems all render at the same framerate
quad views also have differing resolutions (not frame rates), which is the issue in question here. The framerate thing is a separate issue.
The point is, things that work based off a texture array will need tweaking if we want them to work on systems with different sizes of view. Probably by accepting multiple texture arrays, though we can also declare we want to defer solving this problem and spec it to error out or only work with the primary views and expect content to fall back to regular textures.
> Simply adding a view and expecting it to render correctly won't work (as @thetuvix also mentioned).
Works fine on hololens' openxr implementation. The dropping of frames is suboptimal but that's a choice made by the system.
Fwiw you actually can handle multi-framerate views by sending down different per-frame view arrays based on whether or not the observer view needs to be rendered. Hololens doesn't seem to expose the bit of "is this frame going to be thrown away", so we didn't implement it that way, but a device with a different observer framerate that wishes to make this optimization totally can do that.
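A minimal sketch of the render loop I have in mind (existing layers API only, nothing new; it assumes a non-array projection layer, and `gl`, `fb`, `binding`, `layer`, `xrRefSpace`, `updateScene` and `renderView` are set up elsewhere):

```js
// Content that iterates whatever views are present each frame keeps
// working whether or not the observer view was included this frame.
function onXRFrame(time, frame) {
  const session = frame.session;
  session.requestAnimationFrame(onXRFrame);

  const pose = frame.getViewerPose(xrRefSpace);
  if (!pose) return;

  updateScene(time); // simulation ticks once per frame, at the primary cadence

  // Usually two views; a third observer view may appear only on the
  // frames where the camera actually needs pixels.
  for (const view of pose.views) {
    const subImage = binding.getViewSubImage(layer, view);
    gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                            gl.TEXTURE_2D, subImage.colorTexture, 0);
    const vp = subImage.viewport;
    gl.viewport(vp.x, vp.y, vp.width, vp.height);
    renderView(view); // same code path regardless of which view this is
  }
}
```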
> though we can also declare we want to defer solving this problem and spec it to error out or only work with the primary views and expect content to fall back to regular textures.
Yes, I'm ok with deferring. PR #159 allows different texture sizes, so that should address some of the concerns.
> Simply adding a view and expecting it to render correctly won't work (as @thetuvix also mentioned).
> Works fine on hololens' openxr implementation.
No. @thetuvix explicitly said that that scenario didn't work and that they ended up assembling the camera view themselves or letting the application explicitly code for it.
Agreed that we should have a call here!
Some quick notes on how HoloLens 2 works:

- The blend mode for the primary stereo views is "additive", while the blend mode for the mono observer view is "alpha-blend".
- We made the PhotoVideo view configuration explicitly opt-in. That way, if the app hasn't tested for this case and doesn't opt in, we can still do the least-bad "distort one of the eyes" approach, which is at least better than incorrect rendering, black rendering or a crash. That same incremental approach could work for WebXR - keep the existing "bag of views", but only enumerate optional secondary views when an app does the appropriate hardening and then uses a module-defined API to opt in. When the app does opt in, add an attribute to each view (defined in that module) to let the app know which is which each frame.
- There could also be a shouldRender bit each frame to let the app know whether pixels it produces for that view will be used or not. This would allow a system with a 90fps display and a 30fps observer camera to tell the app which 2 of each 3 frames don't require an observer view. We skipped that for now in our vendor extension since we'd always return true on HoloLens 2, but that sounds like a fine addition in a cross-vendor layers module here. I would not recommend going beyond that to truly independent frame loops - as discussed above, that is a far heavier lift for apps and engines from a performance, architecture and confusion perspective.

@cabanier the thing that didn't work is "just giving the application an extra view", which apps are typically not prepared for. The current plan for WebXR is to make this an opt-in via a feature instead. However, it will still be "just another view".
The concern raised by Alex is already resolved if we go with the feature route.
I should make the PR for that extra feature so we aren't talking about hypotheticals :smile:
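In the meantime, a rough sketch of the shape I'm imagining (the 'secondary-views' feature name and the per-view `isFirstPersonObserver` flag are placeholders, not agreed API; the render helpers are assumed):

```js
// Hypothetical opt-in: only content that has hardened its renderer asks
// for secondary views; everyone else keeps seeing exactly two views.
async function startSession() {
  return navigator.xr.requestSession('immersive-ar', {
    optionalFeatures: ['secondary-views'], // placeholder feature name
  });
}

// ...later, inside the frame loop, with `pose` from getViewerPose()...
function renderViews(pose) {
  for (const view of pose.views) {
    // Placeholder per-view attribute so the app knows which view this is
    // (e.g. to pick a different blend mode or skip expensive effects).
    if (view.isFirstPersonObserver) {
      renderObserverView(view);
    } else {
      renderEyeView(view);
    }
  }
}
```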
> @cabanier the thing that didn't work is "just giving the application an extra view", which apps are typically not prepared for. The current plan for WebXR is to make this an opt-in via a feature instead. However, it will still be "just another view".
I'm unconvinced that adding another view and reusing the same rendering path is a good solution. For one, maybe HL could get by with rendering the scene at 30fps, but it is not a good solution as such a low framerate will cause swim and user discomfort (especially if they are used to high framerates such as 120fps).
A new session will give us everything we need and it can be easily defined. It's true that the drawback is that this introduces 2 render loops, but I suspect that each will be simpler.
The framerate issue has a solution: send down the third view for only some of the raf frames. There's nothing that prevents this, and when I write the text for the first person observer view I hope to call it out. Hololens has made a choice in which this isn't necessary because it's always pegged to a lower fps, but other devices can choose otherwise.
Multiple sessions is not an easy thing to define. We have a concept of exclusive access to the device, and a lot of spec text is written assuming this, especially given that most backends have a concept of exclusive access. Multiple sessions will complicate the spec and complicate implementations. Multiple views is already supported by the core spec, and while we need an opt in so people don't shoot themselves in the foot, I strongly feel this is the path we should be taking.
I would rather not make such a drastic change to the core spec just to avoid doing some work on the layers spec. Supporting multiple view configurations with texture array projection layers is not impossible! It needs an API to be designed, but it can be designed, and that can be done in parallel with all the other work.
As discussed above, fractional frame cadences for secondary views seem fine to me. If HoloLens had the headroom to keep rendering the primary views at 60fps, we would do as @Manishearth is suggesting and still render the observer view at 30fps by just excluding it from rendering every other frame. That way, the app can continue running its update loop at 60Hz and render at a steady cadence.
Running two independent render loops on arbitrary out-of-phase cadences also requires the simulation update for your scene to tick at those arbitrary times as well. Since you must serialize simulation in most engines, you can't just run two independent loops as you might try for rendering. This means your engine would end up with one unified update loop anyway, except now it jitters between phase A and phase B. I expect it's that impact on update rather than render where independent frame loops would break the architecture of most render engines.
> I'm unconvinced that adding another view and reusing the same rendering path is a good solution. For one, maybe HL could get by with rendering the scene at 30fps, but it is not a good solution as such a low framerate will cause swim and user discomfort (especially if they are used to high framerates such as 120fps).
@cabanier I too am unsure about rendering stereo at 30 fps; let me check internally and get back.
OK. I can see that there is not much support for 2 different render loops. Thinking about it more, different loops would also require new layers for the extra session, which would be a bit annoying.
I'm still a bit hesitant about cases where the camera is not at a fractional cadence of the display, but maybe we can lower the device or camera framerate so they match. @Manishearth @asajeffrey PR #160 should address cases where views have different resolutions. Can you take a look?
The layers spec doesn't currently support devices with different resolutions for different views (e.g. the recording view has a different resolution from the left and right eyes). At the moment, when textures are allocated (https://immersive-web.github.io/layers/#allocate-color-textures and ditto depth/stencil), all views are assumed to be the same size.
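For reference, a rough WebGL2 sketch of the difference (not spec text; the function names are made up for illustration): today's model allocates one array sized for a single view, while supporting mismatched views means allocating per view at that view's size.

```js
// Current model: one texture array, every slice the same size.
function allocateArrayTextures(gl, viewCount, width, height) {
  const tex = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D_ARRAY, tex);
  gl.texStorage3D(gl.TEXTURE_2D_ARRAY, 1, gl.RGBA8, width, height, viewCount);
  return tex;
}

// With differing resolutions: one texture per view, each at its own size.
function allocatePerViewTextures(gl, viewSizes /* [{width, height}, ...] */) {
  return viewSizes.map(({ width, height }) => {
    const tex = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, tex);
    gl.texStorage2D(gl.TEXTURE_2D, 1, gl.RGBA8, width, height);
    return tex;
  });
}
```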