Open toji opened 5 years ago
Any news on this now that the WebXR specs are stable?
I need this to update my app to WebXR on the Oculus Browser.
Now that WebXR has been out for a while, I think we can revisit this issue. We've heard from many developers that they would like the ability to navigate while in VR.
As a first pass, we (at Oculus) would like to explore how we can get same origin navigation to work. We believe this will satisfy most use cases and it has far fewer security issues than cross origin navigation.
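For reference, the same-origin test discussed here can be sketched in a few lines. This is only an illustration, not proposed spec text: `isSameOrigin` is a hypothetical helper name, and it relies only on the standard `URL` API to compare scheme, host, and port.

```typescript
// Hypothetical helper (not part of any spec): returns true when `target`,
// resolved against the current page URL, has the same origin as that page.
function isSameOrigin(currentHref: string, target: string): boolean {
  const base = new URL(currentHref);
  // Relative targets such as "/world-2" resolve against the current page.
  const resolved = new URL(target, base);
  // Opaque origins (e.g. data: URLs) serialize to "null" and never match.
  if (base.origin === "null" || resolved.origin === "null") {
    return false;
  }
  // `origin` serializes scheme + host + port, so a different port or scheme fails.
  return resolved.origin === base.origin;
}
```

Under this sketch, a link from `https://example.com/lobby` to `/world-2` would qualify, while a link to the same host on a different port or scheme would not.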
What are the uses for same origin navigation? It seems like that is already being done, we just keep the context and change the content.
> What are the uses for same origin navigation? It seems like that is already being done, we just keep the context and change the content.
I suspect it will make it easier to host multiple WebXR sites this way because they won't have to rely on a common framework. We have asked several WebXR developers and were told that limiting navigation to same origin would be very useful.
Same origin would be better than nothing, but I hope we can do cross origin navigation at some point. I can't believe we are this close to literally building the Metaverse, but refrain because of potential spoofing.
> Same origin would be better than nothing, but I hope we can do cross origin navigation at some point. I can't believe we are this close to literally building the Metaverse, but refrain because of potential spoofing.
I understand that it's not what most people really want. By taking this approach, we'll have a basis (spec, tests, framework support) on which we can build cross origin navigation.
Same origin navigation on by default would be perfect! I'm mostly on Chrome WebXR on desktop now, and Edge for Microsoft Mixed Reality headsets.
/agenda What are the concerns about allowing same origin immersive navigation?
@thetuvix I believe you expressed some concerns about enabling this without a prompt for same origin navigation. Can you elaborate on your concern?
I'm in if we believe this UX is indeed a subset of what we'll expose to folks for true cross-origin navigation!
My primary privacy concern would be if we get folks used to this no-confirmation same-origin navigation UX for two years, and then later add in with-confirmation cross-origin navigation - if someone was maliciously trying to spoof navigation, they could then just show the old no-confirmation experience, and I expect most users would think that's how it's meant to be.
As an alternative, if we make today's same-origin navigation UX involve some reasonable confirmation (e.g. tap the UA-reserved System button to proceed), we can get users used to something that cannot be spoofed. That doesn't mean that we don't add other privacy protections later when we add legit cross-origin navigation (e.g. privacy sigils, etc.), but it starts us off with some level of unspoofable baseline to train users on.
Cross origin navigation has so many pitfalls that we should treat it as an unsolved problem. We already know that confirmation dialogs don't work because users blindly click through. I suspect we will need to establish a new ecosystem to establish trust.
We know these risks are absent for same origin which is why I proposed to just focus on that.
/agenda Should we have a confirmation prompt to allow same origin immersive navigation?
(Leaving the agenda tag til next meeting)
On the call today, @cabanier asked me to share my thoughts here around whether UAs should have a "warning" on navigations, even when they are same-origin.
Here's what I'd said above back in April:
> I'm in if we believe this UX is indeed a subset of what we'll expose to folks for true cross-origin navigation!
> My primary privacy concern would be if we get folks used to this no-confirmation same-origin navigation UX for two years, and then later add in with-confirmation cross-origin navigation - if someone was maliciously trying to spoof navigation, they could then just show the old no-confirmation experience, and I expect most users would think that's how it's meant to be.
> As an alternative, if we make today's same-origin navigation UX involve some reasonable confirmation (e.g. tap the UA-reserved System button to proceed), we can get users used to something that cannot be spoofed. That doesn't mean that we don't add other privacy protections later when we add legit cross-origin navigation (e.g. privacy sigils, etc.), but it starts us off with some level of unspoofable baseline to train users on.
I definitely don't believe UAs should add any kind of warning or scary language around same-origin navigation today - that would certainly lead to a "crying wolf" permissions fatigue that would make any actual warnings later lost in the noise. In particular, note that the security risks here aren't the sort that a UA-rendered warning could even fix for cross-origin navigation later: The primary risk with cross-origin navigation is the site pretending to navigate but not actually navigating - in that case, the UA isn't in a position to render any UX, warning or not, since the malicious site is still controlling all the pixels back in the first WebXR session.
What I'm advocating UAs consider for same-origin navigation today is a lightweight, unspoofable confirmation action, for example, the user tapping the UA's reserved menu button on the controller when on the interstitial UA screen showing the new URL to complete the navigation. The goal is to train users to build muscle memory around a certain flow the 100 times they do legit same-origin navigation today (e.g. pressing the UA menu button to confirm), so that the 101st time when a page is trying to spoof a different origin, they follow the same muscle memory and the spoof is revealed (e.g. they press the UA menu button as usual, but the UA system menu pops up instead, revealing the scam).
An example of this kind of confirmation action on existing devices is the iOS Apple Pay flow, where you always double-tap the side button to confirm the payment before Face ID or manual PIN entry kick in. This trains users that they should always expect to do that trusted action before entering their device PIN, helping prevent malicious sites from phishing a PIN the user may have reused elsewhere: [image]
A confirmation action isn't a complete solution by itself - for example, even for same-origin navigation, the browser still needs an interstitial screen that clearly shows the target URL's domain before confirmation - otherwise, users today could end up following a malicious link to another same-origin page and think they actually did navigate to their bank's domain. However, without some unspoofable action or other unspoofable aspect to that interstitial screen, there is no way for the user to know if the interstitial screen and its displayed domain are actually real. (it's unlikely that real-world users will understand that their UA has a same-origin limitation)
Note that a confirmation action also isn't the only way to have some unspoofable aspect to the navigation flow. Another option that Mozilla has talked about would be for the interstitial navigation UX to show some user-specific "totem" - an icon, model or background that is particular to that user and therefore a malicious page wouldn't have the knowledge to spoof. This would make the malicious navigation look different than all the legit navigations the user has performed before, and hopefully cause them to pull up before giving up their login info.
An example of this kind of "totem" on existing devices is the iOS task switching UI that appears when you swipe up from the bottom of the screen. This reveals your list of recent apps, but also a blurred version of your wallpaper and home screen icons - that's not something any given app or site has the per-user knowledge to spoof, and so it gives you confidence that you actually swiped from far enough down on the screen and are not still typing sensitive data into the previous app/site: [image]
All of this comes down to a given UA's particular navigation UX - certainly, we are early enough in the immersive web that I expect lots of exploration here among browsers to find the right UX that both keeps users secure and enables awesome experiences! This discussion is less about prescribing any particular kind of UX and more about helping us all reason through the problem space we're solving for (high-confidence navigations), so we can start out secure here, mitigating those user risks present even with the same-origin restriction and training users well for the future!
> What I'm advocating UAs consider for same-origin navigation today is a lightweight, unspoofable confirmation action, for example, the user tapping the UA's reserved menu button on the controller when on the interstitial UA screen showing the new URL to complete the navigation.
A reserved button on the controller definitely sounds like something that would help avoid spoofing. It should not be relied upon entirely, though, because this user action would be taken in response to a visual prompt from the app/site, and as users get used to this prompt, they will all react to it with about the same average response time. It would be easy for a malicious app/site to make it look like it got a user confirmation after this average response time, which would trick almost all users.
Maybe users should be assigned a personal short "morse code", like "two short taps, little pause, long tap", and the UA, while waiting for the confirmation action, should display some kind of UI that shows when the reserved button is pressed, so it becomes clearer when the malicious app/site does not actually know if and when the reserved button is pressed. This UI could also display the user-defined or randomly generated totem. Example: [image]
Granted, memorizing a morse code would be tedious, but the UA could give visual clues, which would make spoofing all the more recognizable: [image]
Spoofing version: [image]
A confirmation that includes more than a simple tap/double-tap may fail the "lightweight" test - users may then consider it a chore to navigate between pages, and this will likely have lower accessibility. Striking the right balance will be key to users accepting this kind of confirmation.
One critical aspect of using a reserved UA button for confirmation is that the button needs to also have some visible function when not doing navigation. That way, if a page tries to spoof users into tapping the reserved UA button to complete a fake navigation, that button's default function will activate (e.g. switching to the system menu) and the ruse will be revealed.
Examining the analogy to Apple Pay here, iOS requires you to double-tap the side button as the confirmation gesture before entering your PIN for Apple Pay. Apps cannot spoof that side button tap to trick you into revealing your PIN because tapping the side button otherwise will cause the phone to lock. This allows a simple double-tap to serve as confirmation - even though a malicious app knows exactly what the gesture they want to spoof is, they still can't accomplish the spoofing without a visibly different user experience that reveals the scam.
Tapping a reserved button (or doing a reserved gesture) seems like a good compromise. The spec currently states that there has to be an "escape" button from WebXR. Would there be a risk to use that button/gesture or would we have to reserve another one?
This would satisfy the security requirements for cross origin navigation. I still think that it's not needed for same origin.
> The spec currently states that there has to be an "escape" button from WebXR. Would there be a risk to use that button/gesture or would we have to reserve another one?
I think that might be the preferred button in most cases, as long as you can reliably differentiate between when the user is attempting to navigate and when they're not. It satisfies Alex's requirement that "the button needs to also have some visible function when not doing navigation."
In the case of the Oculus browser, this would be the app button, right? So typically pressing the app button takes you out of XR and plops you back on the 2D page. During navigation (or possibly other trusted UA interactions) it would presumably act as the confirmation button to a modal "do you wish to do this trusted action?" dialog. If the page tries to spoof a navigation and the user hits the app button, as they've been trained to do, they're suddenly looking at the 2D browser rather than seeing your bank's spoofed metaverse portal (or whatever the Wall Street kids call it these days). Sounds like exactly the kind of behavior we want!
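To make the behavior described above concrete, here is a rough sketch of how a UA might route the reserved button. It is an assumption-laden illustration, not how any browser actually implements it: the `UAState` and `Outcome` types and the `onReservedButton` function are hypothetical names. The point it shows is that the page never sees this button, so a spoofed interstitial drawn by the page leaves the UA in its ordinary immersive state and the press exits XR.

```typescript
// Hypothetical UA-side state: either a normal immersive session, or a real
// UA-rendered interstitial waiting for the user to confirm a navigation.
type UAState =
  | { kind: "immersive" }
  | { kind: "awaitingConfirmation"; targetUrl: string };

// What the UA does in response to the reserved button.
type Outcome =
  | { action: "navigate"; url: string }
  | { action: "exitToBrowser" };

// The reserved button is handled by the UA only; page script gets no event.
function onReservedButton(state: UAState): Outcome {
  if (state.kind === "awaitingConfirmation") {
    // Genuine navigation: the press confirms the URL shown by the UA.
    return { action: "navigate", url: state.targetUrl };
  }
  // No real navigation pending (including a page-drawn fake prompt):
  // the button keeps its default function and drops the user to the 2D page.
  return { action: "exitToBrowser" };
}
```

A page that fakes the prompt cannot move the UA into `awaitingConfirmation`, so the user who presses the button out of habit ends up on the 2D page and the spoof is visible.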
> In the case of the Oculus browser, this would be the app button, right?
Yes. I think we call it the menu button. The Oculus button blurs the scene and brings up the AUI.
> Sounds like exactly the kind of behavior we want!
Do you think it's also needed for cross origin (for consistency)?
Yes, consistency would definitely be beneficial to the users in this case.
> Yes, consistency would definitely be beneficial to the users in this case.
We talked about this internally and it's unclear what the benefit of warning the user about a safe action would be. I agree that going to a different origin must show an unspoofable dialog but same origin has no danger. Moreover, it's actually detrimental to show too many dialogs to the user because they will end up ignoring them.
As an aside, we should work on a better way to do navigation. The current proposal works but it results in a suboptimal user experience with a black loading screen. I think we can do better :-)
@toji tweeted about this last week: https://twitter.com/Tojiro/status/1463233850788642818 It seems that there was broad consensus to offer an interstitial skybox or model to improve the navigation experience.
/agenda is it time to write up a spec for navigation?
Glad to see this is still actively being discussed. Just wanted to leave some links from the past here, since I didn't see them mentioned already:
* Research based on a user study on traversal, trusted UI and sigils: https://arxiv.org/pdf/2011.03570.pdf
I actually tracked this down over in #11, wasn't too much discussion on it though.
To come back to this after a long hiatus :-), I have seen instances of WebXR sites that "navigate" while in immersive mode. Sites can change the name of the displayed url as long as it's same origin without doing a full on navigation.
Since this is already possible, why throw up barriers for a real same origin navigation? To the user, there would be no difference between the two.
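As a rough illustration of the "navigation" such sites already do, the sketch below changes the displayed URL with the standard `history.pushState` call and swaps the scene content without leaving the session. The `softNavigate` function, the `HistoryLike` interface, and the `loadScene` callback are hypothetical names used only for this example; `pushState` itself rejects cross-origin URLs with a `SecurityError`, which is why the trick only works same origin.

```typescript
// Minimal shape of the History API used here, so the sketch can also run
// outside a browser; in a page, `window.history` would be passed in.
interface HistoryLike {
  pushState(data: unknown, unused: string, url?: string | URL | null): void;
}

// Hypothetical helper: updates the address bar and swaps content in place.
// Returns false (and does nothing) when the target is not same origin.
function softNavigate(
  history: HistoryLike,
  currentHref: string,
  target: string,
  loadScene: (path: string) => void
): boolean {
  const next = new URL(target, currentHref);
  if (next.origin !== new URL(currentHref).origin) {
    // pushState would throw a SecurityError for a cross-origin URL.
    return false;
  }
  history.pushState({}, "", next.href); // changes the displayed URL, no page load
  loadScene(next.pathname);             // site-specific: replace the scene content
  return true;
}
```

In a page this might be called as `softNavigate(window.history, location.href, "/world-2", loadWorld)`, where `loadWorld` is whatever the site uses to replace its scene while the XR session keeps running.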
/agenda discuss same origin navigation (again)
@dmarcos in last week's meeting, we agreed that same origin immersive navigation should be allowed. Do you want to draft a spec? I'd be happy to help out.
Nice to see this happening. Many devs told us that they are definitely interested in this, especially gaming sites, which would love to let users move from one game to another without exiting the WebXR session.
> @dmarcos in last week's meeting, we agreed that same origin immersive navigation should be allowed. Do you want to draft a spec? I'd be happy to help out.
I'd also be interested in helping to get an initial draft up. Would love to get some more work going on this sooner rather than later.
> @dmarcos in last week's meeting, we agreed that same origin immersive navigation should be allowed. Do you want to draft a spec? I'd be happy to help out.
> I'd also be interested in helping to get an initial draft up. Would love to get some more work going on this sooner rather than later.
Anyone can create a draft document :-)
Hey all! This was brought up in a meeting between the Immersive Web chairs and editors today, and we felt it was a good idea to provide an assessment of what we see as the next steps for this proposal.
First and foremost, we need to ship WebXR. This is kind of obvious but worth stating anyway, since if we can't ship the core API none of the rest of this matters. Thankfully we're getting pretty close to that point, but as we approach the point of shipping we're likely going to be under pressure to remove or defer features that aren't critical to the API's baseline functionality.
Once we have the core API in the wild and mostly settled, though, we'll hit a point pretty quickly where we say "what's next?" It's expected that our initial pass of follow up features to the core will include features that were under development in core previously but deferred in favor of shipping faster, addressing feedback from developers to patch functionality gaps, and pulling in well vetted proposals. Honestly, this proposal in particular could be considered to fit all three of those categories, so it's extremely likely to make the "WebXR 1.1" spec (or however we designate it) cut.
In the meantime it's not out of the question that individual browser implementations could start experimenting with adding this, though we would recommend that its visibility be limited somehow to indicate that it's not fully specced.
Thank you everyone for your thoughtful discussion on this topic and the well considered API proposal. In my eyes this is a textbook example of exactly what the CG and proposals process should produce, and the only unfortunate part is that the core API isn't far enough along to receive it.
TL;DR: This proposal is great and will be first in line for consideration after the core WebXR spec has been wrapped up.