immersive-web / proposals

Initial proposals for future Immersive Web work (see README)

Apply DOM to Face #57

Open keverw opened 4 years ago

keverw commented 4 years ago

Hello, I've been learning more about VR lately. I know there's the DOM Overlay proposal, but it seems to be more about overlaying a flat 2D UI over the 3D area for the entire screen rather than on an object itself... My idea is more about applying a DOM to a face of an object, sort of like a texture, but mapping back all the events like touching, scrolling, etc.

I think being able to create a div similar to that API and then map it to a face in the 3D world would be useful. Say you were creating a virtual world: you could create a curved mesh and apply your settings pane to one of its faces, or create a heads-up display shaped like a phone and map a UI to it, emulating the feel of a mobile phone operating system, maybe for your more social features. It would allow multiple faces.

So then you could mix the best of both worlds. I know there are UI libraries for, say, Babylon and other frameworks, but I felt they weren't as powerful as HTML/CSS in things like ease of use, copy and paste, etc. Not trying to pick on them, I'm just coming from a world where I'm used to the flexibility of HTML and CSS. Another benefit is less of a learning curve if you are already a web dev, plus some code reuse: in VR maybe I use a curved mesh, but in desktop screen mode I could use a flat object, and the UI would stay consistent.

While this is still early in VR, I think using the DOM when running on desktop at least makes copy/paste easier if needed, and helps accessibility. I'm not too sure VR will get as far as copy/pasting, but that could make VR useful for productivity tasks. The DOM-on-a-face idea could also be used for productivity use cases, like loading up a whiteboard, a presentation slide show, or a collaborative document, even without full support for copying/pasting, depending on the device.

Mostly thinking about using it for first-party, same-origin stuff, since you can trust what's in your own DOM more. I know cross-origin embedding is a whole other topic, though. I guess with this concept you could probably even just put an iframe in the part of the DOM you are mapping to a face, if you wanted to do that.

An example could be:

<div id="menuScreen">
   <div class="menuPane">List of the content, your friends list for example</div>
   <div class="menuDock">Friends | Maps | Settings</div>
</div>

Then you could pop up a floating curved mesh and map it to #menuScreen .menuPane and then have another mesh angled at the bottom with the icons and map it to #menuScreen .menuDock.

Or say I make a projector with a smart media board type thing, maybe you've seen them in school. I could have a div:

<div id="smartBoard">
   Do your own Custom UI Here
</div>

and map it to just #smartBoard.
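To make the "mapping back all the events" part a bit more concrete, here is a minimal sketch of the coordinate conversion it would need. Everything in it is hypothetical: `FaceMapping` and `uvToElementPoint` are made-up names, not part of any WebXR or browser API, and it assumes the 3D engine can report a UV coordinate for the point where a controller ray hits the mapped face.

```typescript
// Hypothetical sketch: nothing here is a real WebXR or browser API.
interface FaceMapping {
  selector: string; // DOM element the face displays, e.g. "#menuScreen .menuPane"
  widthPx: number;  // element's layout width in CSS pixels
  heightPx: number; // element's layout height in CSS pixels
}

interface ElementPoint {
  x: number;
  y: number;
}

// Convert a UV hit on the mapped face (u and v in [0, 1], origin at the
// bottom-left, as is common for textures) into element-local pixel
// coordinates (origin at the top-left, as in the DOM).
function uvToElementPoint(mapping: FaceMapping, u: number, v: number): ElementPoint | null {
  if (u < 0 || u > 1 || v < 0 || v > 1) {
    return null; // the ray missed the mapped face
  }
  return { x: u * mapping.widthPx, y: (1 - v) * mapping.heightPx };
}

// Assumed size for the menu pane from the example above.
const menuPane: FaceMapping = { selector: "#menuScreen .menuPane", widthPx: 400, heightPx: 300 };
const hit = uvToElementPoint(menuPane, 0.25, 0.75);
```

The resulting point is what a synthetic click or scroll event on the mapped element would be positioned with. A curved mesh would not change this step, as long as the face has a sensible UV layout.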

blairmacintyre commented 4 years ago

A bunch of us have pushed for "DOMtoTexture" for many years, and the idea gets blocked by folks responsible for web privacy. The problem is this: if you can map a DOM element to a face, you can use it as part of the rendering pipeline and render the elements to texture targets, which means (in turn) you could read the data back.

That means, for example, I could render a web page with a pile of links, and look at the color of those links to see if you've visited those pages. Or, I could convince you to type things into text boxes and watch what you type. etc etc.

Various folks push back and suggest that you simply need to keep track of tainted textures, and disallow reading them back. But, then others reply that this would be impractical and difficult to render.
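As a rough illustration of the tainted-texture idea, here is a minimal sketch loosely modeled on the origin-clean flag that canvas uses today. The `TrackedTexture` class and its methods are hypothetical and do not correspond to any real WebGL or GPU API; the point is only that the taint has to follow the pixels through every render pass.

```typescript
// Hypothetical sketch of a "tainted texture" flag. Not a real WebGL or GPU API.
class TrackedTexture {
  private tainted = false;
  private pixels: Uint8Array;

  constructor(width: number, height: number) {
    this.pixels = new Uint8Array(width * height * 4); // RGBA bytes
  }

  // Drawing DOM content into the texture marks it as tainted.
  drawDomContent(rgba: Uint8Array): void {
    this.pixels.set(rgba.subarray(0, this.pixels.length));
    this.tainted = true;
  }

  // Taint propagates: rendering from a tainted source taints the target too.
  drawFrom(source: TrackedTexture): void {
    this.pixels.set(source.pixels.subarray(0, this.pixels.length));
    if (source.tainted) {
      this.tainted = true;
    }
  }

  // Reading pixels back is refused once the texture is tainted.
  readPixels(): Uint8Array {
    if (this.tainted) {
      throw new Error("SecurityError: texture is tainted");
    }
    return this.pixels.slice();
  }
}
```

Even in this toy version, every operation that can copy pixels has to carry the flag along, which hints at why tracking it through a full rendering pipeline is considered hard.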

Finally, most modern web engines don't actually do all their rendering on the GPU; they end up doing much of it in memory, so the performance of this would be surprisingly "not as good as you'd hope". Servo is, of course, an exception here.

FWIW, if you can tolerate not supporting all highly interactive DOM content, check out https://github.com/aelatgt/three-web-layer ... it maps DOM elements to a set of three.js elements, takes care of a variety of performance issues, etc. It works great for relatively static content, but is pretty heavy for content that's updating continuously (even then, I've used it for some debugging views on iOS that update each frame).

keverw commented 4 years ago

Interesting, I didn't think about how detecting link colors could be a privacy problem... Maybe don't allow external links, or do something similar to https://developer.mozilla.org/en-US/docs/Web/CSS/Privacy_and_the_:visited_selector

My plan was to handle external links with my own code anyway, since following them would probably interrupt the experience. I guess that'd be like a modal overlaid above all the 3D objects on the HUD.

Could maybe disable iframes in the DOM use case too? Then the only text inputs you could monitor are the ones created by your own website, no different than what can be done today with JS. I feel like this could be a very powerful tool, but I didn't realize some of the other problems. Kind of disappointed, as I imagined UI design would be much different, but right now it seems like a lot of manual work and reinventing the wheel.

rektide commented 4 years ago

This remains one of my biggest hopes, that somehow, some day, some time, the web/dom/hypertext can be hybridized with 3d.

This is such a horrifying limitation, that we refuse to let the two cross.

This is such a beautiful & glorious use case, so wonderful & so imaginative.

Saying only no to it is not acceptable. Maybe we need special permissions or whatever, but this falls in my #ProjectFugu bucket, of something an app can do, which is to combine 2d (DOM/ui elements) & 3d content. It's been a decade since Flight of the Navigator, which put a video element in 3d, & I am exhausted from the stonewalling, from the denial, of such amazing creativity. We cannot allow this endless stonewalling forever. Progress must be made towards enabling these amazing use cases, behind whatever security permissions systems we need to make, if it has to be that way.

cabanier commented 4 years ago

@keverw As @blairmacintyre mentioned, incorporating DOM elements into the 3D scene has been discussed many times.

The security issues are very real which is why we have to design it extremely carefully. At this point, we hope that a combination of advanced DOM Layers along with some of the concepts of DOM Overlays can get us to a state where we can securely render HTML in a VR/AR scene.

keverw commented 4 years ago

Interesting on security. It sounds like it's mostly a problem when framing content or looking at link colors... So maybe that functionality could be limited when the div is being mapped to a texture on an object's face. This could be very powerful even if it only allowed same-origin content.

@rektide's suggestion of a permission would be one option, but if it's not being used for cross-origin stuff, maybe there's no need to get permission, since it's not much different than controlling your own content anyway. If a developer wanted to use frames, a permission prompt could work, but UX-wise I'm not sure it's the best... People are trained from mobile, so they understand accepting access to their camera or microphone. Being asked whether a WebXR experience can load and embed external content would be a new thing for people, and even then they might not understand that the page could monitor what's done in that embedded content. Then again, you could already create a remote desktop type experience and run the browser remotely, but that takes a lot more effort if bad guys were using it for anything.

Since thinking more on security, I was a bit curious how the DOM overlay would be different, since it seems more aimed at overlaying the entire screen. I'm guessing it's because it doesn't get put in the same texture memory as the rest of the WebGL canvas. I did see that when interacting with cross-origin stuff it'd stop sending inputs like clicks, etc.

Hadn't seen DOM Layers yet, but it looks interesting. Seems a bit lower level, closer to the metal, but I could see some higher level frameworks taking advantage of it.

cabanier commented 4 years ago

> Interesting on security. It sounds like it's mostly a problem when framing content or looking at link colors... So maybe that functionality could be limited when the div is being mapped to a texture on an object's face. This could be very powerful even if it only allowed same-origin content.

We will have to make sure not to limit it too much; otherwise the feature will be too hard to use.

> Since thinking more on security, I was a bit curious how the DOM overlay would be different, since it seems more aimed at overlaying the entire screen. I'm guessing it's because it doesn't get put in the same texture memory as the rest of the WebGL canvas. I did see that when interacting with cross-origin stuff it'd stop sending inputs like clicks, etc.

Correct. It's a separate pass, so the 3D content can't find out what the HTML is. Klaus detailed the security mitigations in the DOM Overlay spec.
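For reference, a minimal sketch of how a page opts into DOM Overlay. The `"dom-overlay"` feature name and the `domOverlay: { root }` option come from the WebXR DOM Overlays spec; the helper `buildOverlaySessionInit` and the plain-object stand-in for the root element are made up for illustration so the snippet runs outside a browser.

```typescript
// Shape of the session options used to request a DOM overlay.
// The generic parameter stands in for the overlay root element type.
interface OverlaySessionInit<T> {
  optionalFeatures: string[];
  domOverlay: { root: T };
}

// Hypothetical helper: builds the options object for a given overlay root.
function buildOverlaySessionInit<T>(root: T): OverlaySessionInit<T> {
  return { optionalFeatures: ["dom-overlay"], domOverlay: { root } };
}

// In a browser, `root` would be a real element, for example:
//   const init = buildOverlaySessionInit(document.getElementById("menuScreen"));
//   const session = await navigator.xr.requestSession("immersive-ar", init);
const overlayInit = buildOverlaySessionInit({ id: "menuScreen" }); // stand-in root
```

The page only names the element to show. The overlay itself is composited separately, so its pixels never pass through the page's WebGL context.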

> Hadn't seen DOM Layers yet, but it looks interesting. Seems a bit lower level, closer to the metal, but I could see some higher level frameworks taking advantage of it.

DOM Layers are higher level. It basically allows independent surfaces to create a scene. That way, you won't be able to read (or infer data) from surfaces that have HTML content. DOM layer support is still not certain but we're certainly planning/hoping to have it.

RangerMauve commented 4 years ago

@cabanier I think your security mitigations link is broken

cabanier commented 4 years ago

> @cabanier I think your security mitigations link is broken

Thanks! I fixed it