AdaRoseCannon opened this issue 3 years ago
It would be useful to collect some use cases for how DOM Overlays are being used in VR, for potential implementations.

This example by @mrdoob is a great use case: https://twitter.com/mrdoob/status/1385184290867187715
So for me, a major part of this is the ability to just make UI using web technology instead of trying to make UI components from scratch. I can create a Vue app with Vuetify and have that serve as a control panel for desktop and VR users alike.
Another major feature is the ability to display and interact with media, such as screenshare or synced video watching. This allows us to rely on existing web implementations instead of manually programming such systems for virtual worlds.
For example, here is a web app for street videos being used in a VR world. We can and do also throw up movies on similar screens.
Making UI elements with Vue is a good use case. As for the other ones:

In this case, videos are best done using WebXR Layers, for the best performance and user experience.
Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer? Since you set the playback on the video element using:
videoEl.srcObject = remoteStream;
> Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer?
Yes, any video element can become the source for a video layer.
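As an illustration of that flow, here's a minimal sketch. It assumes an active immersive `xrSession`, a reference space `refSpace`, an existing `projectionLayer`, and a WebRTC `remoteStream` (all placeholders); the quad's position is arbitrary:

```js
// Route a WebRTC stream into a WebXR quad video layer via XRMediaBinding.
const videoEl = document.createElement('video');
videoEl.srcObject = remoteStream; // MediaStream from the RTCPeerConnection
await videoEl.play();             // the element must be playing to be composited

const mediaBinding = new XRMediaBinding(xrSession);
const videoLayer = mediaBinding.createQuadLayer(videoEl, {
  space: refSpace,                // e.g. a 'local' reference space
  layout: 'mono',                 // or 'stereo-left-right' / 'stereo-top-bottom'
  transform: new XRRigidTransform({ x: 0, y: 1.5, z: -2 }), // float the quad in front of the user
});

// Layers composite in array order, so the quad draws over the projection layer;
// the compositor presents the video directly, bypassing the app's WebGL pipeline.
xrSession.updateRenderState({ layers: [projectionLayer, videoLayer] });
```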
Some broad use cases:
A big issue with today's definition of DOM Overlays is that they are drawn on top of the VR scene. Unless we change that, I suspect that they can only be used for non-interactive content.
> Unless we change that, I suspect that they can only be used for non-interactive content.
@cabanier why? Because the controllers would be drawn behind the UI? A solution may be for browsers to provide a stencil buffer that we can write to, to mark which parts of the top-drawn UI should be trimmed out to show what's behind. Blind guess; I don't know if it's feasible.
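Purely to illustrate that guess (no such UA hook exists today, and `drawControllerCutouts` is a hypothetical helper), marking the cut-out regions would be ordinary WebGL stencil writing:

```js
// Hypothetical: mark regions where the top-drawn DOM UI should be trimmed out.
gl.enable(gl.STENCIL_TEST);
gl.stencilFunc(gl.ALWAYS, 1, 0xff);          // always pass; write reference value 1
gl.stencilOp(gl.KEEP, gl.KEEP, gl.REPLACE);  // replace the stencil value where fragments land
gl.colorMask(false, false, false, false);    // write to the stencil buffer only, not color
drawControllerCutouts();                     // hypothetical: render controller silhouettes
gl.colorMask(true, true, true, true);
// A UA could then skip DOM-overlay pixels where stencil == 1, showing the scene through.
```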
@cabanier - https://immersive-web.github.io/dom-overlays/#xrsessioninit currently says "The DOM content MUST be composited as if it were the topmost content layer. It MUST NOT be occluded by content from the XRWebGLLayer or by images from a passthrough camera for an AR device."
This requirement intentionally does NOT apply to UI elements drawn by the UA directly. For example, it would be OK for the UA to provide a visible pointer ray, target reticle on the DOM layer, and/or a hand/controller model while the user appears to be interacting with the DOM overlay. This is also the only way to support interactions with cross-origin content in an iframe since the application isn't allowed to get poses in that case, so the application couldn't draw a pointer ray that intersects interactive cross-origin content. (See Event handling for cross-origin content. Note that the pose restriction doesn't apply if the UA treats cross-origin content as noninteractive.)
For use cases: in general, browsers provide a lot of functionality for HTML pages that would be difficult to replicate in applications.
In general, the floating-screen use case isn't covered in much detail in the current version of the specification. This is open for suggestions and/or additional features; for example, it could be useful to provide a per-frame status value indicating whether the UA and application should cooperate on drawing pointer rays, or a stencil mask if appropriate.
> This requirement intentionally does NOT apply to UI elements drawn by the UA directly. For example, it would be OK for the UA to provide a visible pointer ray, target reticle on the DOM layer, and/or a hand/controller model while the user appears to be interacting with the DOM overlay.
I agree, but we would need to change the spec to allow that. There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.
> I agree, but we would need to change the spec to allow that.
Changing the spec is definitely an option, as long as it's not inherently incompatible with current usage. For example, the spec could easily be augmented to say how such elements should be handled for the "floating" type, and to clarify that the UA isn't expected to draw such affordances for the "screen" type, where there's no ambiguity about the touch location.
> There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.
DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element/surface. Input disambiguation is already quite complex with just one element. As we discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily-placed surfaces seems quite tricky.
> A big issue with today's definition of DOM Overlays is that they are drawn on top of the VR scene. Unless we change that, I suspect that they can only be used for non-interactive content.
More than that, it means it's not possible to make apps and other nifty things and toss them into the world. For example, movie screens and radios.

One of the greatest things about being able to add websites into a scene is that you can take traditional (WebXR-unrelated) web content and make it relevant to the world, from livestreams and screenshare to things like internet radio, e.g. http://radio.garden/
> There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

> DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element/surface. Input disambiguation is already quite complex with just one element. As we discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily-placed surfaces seems quite tricky.
No, DOM Layers will allow for fully interactive content as long as it's same-origin. Input and the drawing of the controllers are handled by the UA.
> In general, the floating-screen use case isn't covered in much detail in the current version of the specification. This is open for suggestions and/or additional features; for example, it could be useful to provide a per-frame status value indicating whether the UA and application should cooperate on drawing pointer rays, or a stencil mask if appropriate.
If it helps any: I will say the floating screen use case is extremely important. It allows us to make interesting interactive UI out of 2D web elements.
Something that's always been quite good and interesting was Microsoft's pioneering work with the Cliff House. It took the HoloLens interactivity and put it into WMR. I think more UIs should have the ability to snap and place windows, but also have them follow you, letting the user pick and choose what they would like to make a UI element. Think of the power of being able to play with your screen space on Windows or Linux, and then imagine that level of control in VR. Spatial web windows would let us do something akin to a window manager for VR.
> There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

> DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element/surface. Input disambiguation is already quite complex with just one element. As we discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily-placed surfaces seems quite tricky.

> No, DOM Layers will allow for fully interactive content as long as it's same-origin. Input and the drawing of the controllers are handled by the UA.
Thanks for clarifying. Would it be more accurate to say that an implementation or API can basically choose two out of these three features for DOM content, but not all at once?

- interactive
- cross-origin
- arbitrary layer placement and occlusion

> Would it be more accurate to say that an implementation or API can basically choose two out of these three features for DOM content, but not all at once?
Yes :-)
> There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

> DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element/surface. Input disambiguation is already quite complex with just one element. As we discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily-placed surfaces seems quite tricky.

> No, DOM Layers will allow for fully interactive content as long as it's same-origin. Input and the drawing of the controllers are handled by the UA.

> Would it be more accurate to say that an implementation or API can basically choose two out of these three features for DOM content, but not all at once?
>
> - interactive
> - cross-origin
> - arbitrary layer placement and occlusion
To summarize: it allows us to make interactive in-world panels, provided they're hosted on the same domain, but also to import content in a non-interactive way if it's not "secured" by being on the same domain?
So far, that looks like quite a reasonable starting point, then. :)
> To summarize: it allows us to make interactive in-world panels, provided they're hosted on the same domain, but also to import content in a non-interactive way if it's not "secured" by being on the same domain?
What do you mean by 'it'? These are 2 different APIs...
"It" being web content, the two different contexts being a. on the same domain or b. not on the same domain.
What I gather is that if content is on the same domain, it can then be interactive and spatial at the same time. If it's not on the same domain, then it can only be interactive and on the top-most layer; or it can be spatial but non-interactive.
Correct.

- cross-domain / top rendering only = DOM Overlay
- same-origin / arbitrary placement = DOM Layers
Awesome. Well, I will say that I am interested in seeing both implemented for WebXR (for VR headsets). Existing web tech can save developers a lot of time where apps and UI are involved.
YouTube players are cross-origin iframes, and there seems to be quite some demand for getting those into WebXR experiences.

If I understand correctly, that would be possible with DOM Overlay, but that does not allow arbitrary placement. Since the player playback is controlled via postMessage(), non-interactive is fine, but from what I've read, it's impossible to place the YT player against a wall, for example. Even without occlusion, it would be nice to still be able to arbitrarily place the player in 3D space.
(See also the issue on the YouTube issue tracker https://issuetracker.google.com/issues/200299143)
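For reference, that control path looks roughly like the sketch below. It assumes an embed URL with `enablejsapi=1` and uses the `{event: 'command'}` message format that the official IFrame Player API sends under the hood; `VIDEO_ID` and the `ytCommand` helper are placeholders:

```js
// Drive an embedded YouTube player without touching the cross-origin iframe's DOM.
const iframe = document.createElement('iframe');
iframe.src = 'https://www.youtube.com/embed/VIDEO_ID?enablejsapi=1';
document.body.appendChild(iframe);

function ytCommand(func, args = []) {
  iframe.contentWindow.postMessage(
    JSON.stringify({ event: 'command', func, args }),
    'https://www.youtube.com' // restrict the target origin to the player
  );
}

// e.g. wired up to in-world WebGL buttons once the player has loaded:
// ytCommand('playVideo');
// ytCommand('setVolume', [50]);
```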
CSS2DRendering/CSS3DRendering (such as https://threejs.org/docs/#examples/en/renderers/CSS2DRenderer) could be an interesting use case in combination with DOM Overlays, perhaps leveraging 2D HTML content with ARIA and all that comes with HTML. I am already using this concept for annotating 3D models and making them more accessible to users in desktop and mobile WebXR applications. It would be great to utilise this in VR/XR as well, without having to write "legacy/alternative" components for VR annotations and submenus.
We also have iframe systems in combination with CSS3D for YouTube players (as mentioned by @Squareys), which is a nice use case for iframes amongst other things, but we need to rely on MP4 / hls.js / Video.js / DASH-to-texture implementations, as iframing is not compatible with the VR browsers.
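As a concrete sketch of that annotation pattern, here is a minimal three.js CSS2DRenderer setup. It assumes an existing `scene`, `camera`, `renderer`, and an annotated `mesh`; the label text and offset are illustrative:

```js
import { CSS2DRenderer, CSS2DObject } from 'three/addons/renderers/CSS2DRenderer.js';

// Overlay renderer that positions HTML elements on top of the WebGL canvas.
const labelRenderer = new CSS2DRenderer();
labelRenderer.setSize(window.innerWidth, window.innerHeight);
labelRenderer.domElement.style.position = 'absolute';
labelRenderer.domElement.style.top = '0';
labelRenderer.domElement.style.pointerEvents = 'none';
document.body.appendChild(labelRenderer.domElement);

// Plain HTML (stylable, ARIA-friendly) attached to a point on the model.
const el = document.createElement('div');
el.className = 'annotation';
el.setAttribute('role', 'note');
el.textContent = 'Front left wheel';
const label = new CSS2DObject(el);
label.position.set(0.5, 0.3, 0); // offset in the mesh's local space
mesh.add(label);

function animate() {
  requestAnimationFrame(animate);
  renderer.render(scene, camera);      // WebGL pass
  labelRenderer.render(scene, camera); // HTML overlay pass
}
animate();
```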
> Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer?

> Yes, any video element can become the source for a video layer.
When I use a video layer for a WebRTC video track, the video rendering stutters badly. It seems I'm not able to control the frame pacing.
> Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer?

> Yes, any video element can become the source for a video layer.

> When I use a video layer for a WebRTC video track, the video rendering stutters badly. It seems I'm not able to control the frame pacing.
Can you link to an example?
We recently made some fixes in this area.
It is a very big project, so I cannot share it. It just renders a stereo WebRTC video stream on a WebXR video layer (quad) on an Oculus headset.

If I render the stream the classic way (GL texture), it is fine, but the quality is worse.
This is an issue on the Oculus browser side. Can you reach out to me on the WebXR discord? There are a couple of things I'd like you to try.
who are you there?
> who are you there?
Rik Cabanier (Meta)
Hi, I just saw this thread, and I will add the use case I am developing. The transition from the 2D web to a 3D universe should be possible through WebXR DOM overlay (I'm betting the farm on it...). I'm doing my R&D developing https://umniverse.com (upper-right 3D button). DOM Overlay was a great idea; I hope the W3C and browser developers explore its full potential.
Incredibly useful for business: remote VR meetings where you can see and interact with web pages inside a virtual conference room space. We are also using Appetize.io to project real, functioning mobile devices inside browsers (and, with some limited success, inside Spatial.io from within an Oculus headset) so that visitors do not need to download the actual app to their phone to experiment with it. We want the future of work to be possible inside VR, using WebXR, so that native headset apps are not required to be installed.
Might DOM Layers possibly arrive to VR (in the spec, in Oculus Browser or elsewhere) in 2024? What's the status of this, please?