mozilla / standards-positions

https://mozilla.github.io/standards-positions/
Mozilla Public License 2.0
643 stars 69 forks

Add new safelisted schemes for registerProtocolHandler() #339

Open fred-wang opened 4 years ago

fred-wang commented 4 years ago

Request for Mozilla Position on an Emerging Web Specification

Other information

Over the past years, the web developer community has requested that new schemes be added to registerProtocolHandler(), but decisions were blocked on [blocklist], about which some Mozilla members had concerns. Instead, this proposal is about extending the safelist.

Some Chromium members also had more specific concerns about new schemes which have not yet been addressed by reporters [geo] [version-control]. The three proposals listed above are about extending the safelist for requests that seemed uncontroversial in past discussions. More specifically:

  1. Those related to decentralized technologies: "ethereum", "dat", "dweb", "ipfs", "ipns", "ssb", "cabal" and "hyper". Note that the cryptocurrency "bitcoin" is already listed. It seems Mozilla had interest in these technologies in the past e.g. in [mozilla-webextension], [mozilla-hacks-dweb] or [mozilla-libdweb]. Note that one mild concern is what happens if some of these decentralized protocols are implemented natively. See https://github.com/whatwg/html/pull/5482#issuecomment-628017832

  2. Schemes used to encode credentials or an identifier as a URI: "otpauth" and "doi". At least for the latter, @annevk said Mozilla would accept patches for it: https://github.com/whatwg/html/pull/3080#issuecomment-629042367

Another concern raised during WHATWG review is the need to ensure these schemes are documented somewhere, which has been addressed by registering all of them at [iana].

[blocklist] https://github.com/whatwg/html/issues/3998
[geo] https://github.com/whatwg/html/issues/2546#issuecomment-418376741
[iana] https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
[mozilla-hacks-dweb] https://hacks.mozilla.org/category/dweb
[mozilla-libdweb] https://github.com/mozilla/libdweb
[mozilla-webextension] https://bugzilla.mozilla.org/show_bug.cgi?id=1428446
[version-control] https://github.com/whatwg/html/pull/1829#issuecomment-418594058

annevk commented 4 years ago

For 1, a question came up: how would these single-origin, HTTPS-based gateways for distributed technology ensure content is suitably isolated (security-wise)?

fred-wang commented 4 years ago

@annevk I guess it will depend on the implementation. I asked @lidel (main developer of the IPFS companion extension) about this some time ago, and this was his reply for IPFS:

Historically we did redirect to the same Origin:
ipfs://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq → http://127.0.0.1:8080/ipfs/bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq/
It lacked sandboxing between websites, so we now support Origin isolation per content identifier on the localhost subdomain gateway:
→ http://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.localhost:8080/
Origins are still tied to the gateway host, but at least each content root gets its own sandbox.

ekr commented 4 years ago

I'm not quite sure I follow. Can you provide an example of the registerProtocolHandler() call that would be made in this case?

lidel commented 4 years ago

@ekr I imagine we will prefer to point users at a public IPFS2HTTP gateway, to ensure URIs work even if the user does not have a local node running.

Given the public gateway at dweb.link, a registerProtocolHandler call for IPFS could look similar to:

navigator.registerProtocolHandler("ipfs",
                                  "http://dweb.link/?uri=%s",
                                  "IPFS handler");

Opening http://dweb.link/?uri=ipfs%3A%2F%2Fbafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq%2Fwiki%2F would redirect (HTTP 301) to: https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.dweb.link/wiki/

This gateway approach provides a unique Origin per IPFS content identifier (CID) for regular users, and lets us do opportunistic protocol upgrades for power users: the request will be redirected to a local IPFS node if the user has installed our browser extension, or handled natively if a browser vendor decides to ship a built-in node in the future.


ps. note we already enforce Origin isolation if the user tries to access content the old, "single origin" way: https://dweb.link/ipfs/bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq/wiki/
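The per-CID origin isolation described above can be sketched as a simple URL rewrite. This is a hypothetical illustration of the mapping, not the gateway's actual implementation; the default gateway host is taken from the examples in this thread:

```javascript
// Hypothetical sketch of the subdomain-gateway mapping: the CID in the
// ipfs:// URI's authority becomes a subdomain of the gateway, so each
// content root gets its own origin. Not the gateway's real code.
function toSubdomainGateway(uri, gatewayHost = "dweb.link") {
  const url = new URL(uri);                  // e.g. ipfs://<cid>/wiki/
  const cid = url.hostname;                  // for ipfs://, the authority is the CID
  const scheme = url.protocol.slice(0, -1);  // "ipfs" or "ipns"
  return `https://${cid}.${scheme}.${gatewayHost}${url.pathname}`;
}
```

With the CID from the example above, this produces the same https://<cid>.ipfs.dweb.link/wiki/ destination that the gateway's HTTP 301 redirect lands on.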

@annevk Hope this answers your Origin concerns.

martinthomson commented 4 years ago

So this relies on the provider of the handler following good hygiene. In the example given, this is achieved by using unique per-IPFS-origin origins (it probably also requires that dweb.link is on the PSL and some other stuff).

I think that the general concern is that if the target were to be incautious in any way, the content from mutually distrustful sources might end up sharing a single origin in some way. That might be suboptimal.

As the underlying design here assumes that we can pass all responsibility for a scheme to a single web endpoint, the choice of endpoint is crucial. Is there any way in which we might instead build something with better inherent safety properties? For instance, IPFS has an authority component, which implies that something analogous to an origin exists for that scheme. All of that is opaque to the browser, and so we end up in this situation where maybe dweb.link could be safe, but cool.dweb.example could be much less so.

Maybe, rather than having a simple safe-list, we could teach the browser how to identify origins for multiple schemes. Then it has some hope of being able to maintain boundaries, even if the content source does not.

fred-wang commented 4 years ago

Sorry, I think I added some confusion... What I mentioned yesterday was related to current discussions for Web Extensions. AFAIK, this is already an issue for Mozilla's protocol_handlers as it whitelists some of the dweb protocols mentioned in 1.

But AFAIK, for HTML pages registerProtocolHandler() only allows registering a protocol handler from the same origin, so I don't understand how the "sharing a single origin" problem would actually happen in that case.
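The same-origin restriction referred to here can be illustrated with a simplified check. This is a sketch of the idea only, not the spec's exact algorithm, and the origins shown are made up:

```javascript
// Simplified sketch: registerProtocolHandler() rejects handler URLs whose
// origin differs from the registering page's origin, so one site cannot
// route another site's traffic through itself. The real check lives in
// the HTML spec's registerProtocolHandler() steps.
function sameOriginHandlerAllowed(pageOrigin, handlerTemplate) {
  // "%s" is a literal placeholder in the template; it parses as part of
  // the query string and does not affect the origin.
  return new URL(handlerTemplate).origin === pageOrigin;
}
```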

fred-wang commented 4 years ago

Hi, is there any update on this?

Someone from Microsoft proposed the "did" protocol which I think can be added to the list of safelisted decentralized schemes discussed here. See https://bugzilla.mozilla.org/show_bug.cgi?id=1639016 and https://github.com/whatwg/html/issues/5561

The Chromium API owners also supported this proposal: https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/7nHTRUP1EGY/3tzL1tFgAwAJ

dbaron commented 4 years ago

So in addition to the concerns around origins, I think there's a distinct point here that by supporting extending this list of exceptions (rather than saying that web+ protocols should be used) we could be seen as effectively endorsing this list of technologies, some of which may be quite harmful to users in various ways. So essentially this seems like ten distinct review requests (eight for the "decentralized schemes" one, and then the two others). In fact, even barring that, I think the other issues raised probably need to be looked at separately across the ten protocols here.
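For context, the spec-side gate dbaron alludes to boils down to a scheme check: a scheme is accepted either because it is safelisted or because it carries the web+ prefix. A rough sketch follows; the safelist shown is an illustrative subset, not the authoritative list (see the HTML spec for that):

```javascript
// Illustrative subset of the safelisted schemes; consult the HTML spec
// for the authoritative, current list.
const SAFELIST = ["bitcoin", "geo", "im", "irc", "ircs", "magnet",
                  "mailto", "mms", "news", "nntp", "openpgp4fpr",
                  "sip", "sms", "smsto", "ssh", "tel", "urn",
                  "webcal", "wtai", "xmpp"];

// A scheme passes if it is safelisted, or if it is "web+" followed by
// one or more lowercase ASCII letters.
function isAllowedScheme(scheme) {
  return SAFELIST.includes(scheme) || /^web\+[a-z]+$/.test(scheme);
}
```

This is why each addition to the safelist amounts to a distinct review: anything outside the web+ namespace needs an explicit entry.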

I'd also note that the points in https://github.com/mozilla/standards-positions/issues/339#issuecomment-630517916 aren't related to registerProtocolHandler's origin restrictions; they're about ensuring separation of the content reachable through the protocol into multiple origins, not about what handler can be registered to handle it.

fred-wang commented 4 years ago

So in addition to the concerns around origins, I think there's a distinct point here that by supporting extending this list of exceptions (rather than saying that web+ protocols should be used) we could be seen as effectively endorsing this list of technologies, some of which may be quite harmful to users in various ways. So essentially this seems like ten distinct review requests (eight for the "decentralized schemes" one, and then the two others). In fact, even barring that, I think the other issues raised probably need to be looked at separately across the ten protocols here.

If that can help, I'd suggest you first look at the related ipfs / ipns protocols which you already whitelisted for extensions: https://searchfox.org/mozilla-central/source/toolkit/components/extensions/schemas/extension_protocol_handlers.json#19

fred-wang commented 4 years ago

The change (related to decentralized technologies) has landed in Chromium and a tentative WPT test has been added.

aschrijver commented 4 years ago

FYI: I just learned in the openEngiadina chatroom that Chrome beta implemented a bunch of this too: https://blog.chromium.org/2020/09/chrome-86-improved-focus-highlighting.html (halfway down the page)

olizilla commented 3 years ago

it probably also requires that dweb.link is on the PSL – https://github.com/mozilla/standards-positions/issues/339#issuecomment-630517916

*.dweb.link is on the Public Suffix List https://github.com/publicsuffix/list/blob/cde0cd4275cfd5c50f45e4e0146c70be3036c935/public_suffix_list.dat#L13022-L13024

In other news, registering a custom protocol handler for ipfs now works in the stable Chrome release:

navigator.registerProtocolHandler("ipfs",
                                  "http://dweb.link/?uri=%s",
                                  "IPFS");

I'm working on a landing page for dweb.link to let users register that protocol handler. It would be great to land support for this in Firefox too.

OR13 commented 3 years ago

https://www.w3.org/2021/09/21-did10-minutes.html

Someone should leave a review over here: https://github.com/whatwg/html/pull/5482

Also, consider adding a statement on environmental impact to all registered protocol handlers before allowing them to be added to here: https://html.spec.whatwg.org/multipage/system-state.html#safelisted-scheme

OR13 commented 2 years ago

See also: https://bugzilla.mozilla.org/show_bug.cgi?id=1490386

jgraham commented 2 years ago

In general I believe scheme handlers that are to be added to the registerProtocolHandler safelist should meet the following criteria:

I don't think all existing safelisted schemes meet the above criteria. That isn't a reason to relax the criteria, but may be a reason to consider pruning entries from the existing list that aren't useful in practice.

In any case I agree with the previous concerns that a single issue covering multiple schemes is unlikely to work well, unless there's some reason to believe that the schemes are fundamentally coupled so that a decision for one will be a decision for all.

OR13 commented 2 years ago

The new #msdt 0-day can be mitigated by removing the protocol handler for ms-msdt (reg delete hkcr\ms-msdt /f).

A timely note on dangers of protocol handlers.

martinthomson commented 2 years ago

So @OR13 highlights the potential value of separate review for new protocol handlers. That is, they are not inherently safe to invoke from untrusted contexts (see also, the recent Zoom RCE).

But that isn't exactly what we're talking about here. Invoking a system handler is something we already do as a browser. This naturally creates an exposure to that sort of risk. At some level, it is like downloadable binaries, except that you can only exploit binaries that are already installed.

This is why we have gates on following links - apps that handle these links are often not safe when exposed to the web in this way - but these gates don't require the same rigour as we might use for downloading and running software.

registerProtocolHandler on the other hand is about establishing a means to handle actions on web pages, which is in a lot of ways safer. For one, you don't leave the sandbox. The risks to users are more grounded in the fact that the site handling the URI is in a position to intermediate all actions related to that type of URI (or at least top-level navigations). That is, they get to see all of those URIs.

How a handler acts from that position of power is important. As noted, it is up to handlers/gateways to enforce rules about the URIs being resolved on their own. We've previously noted that origin isolation is done on their own terms, which has consequences for URIs of that type. Some handlers will do the right thing, but there is no inherent guarantee of that.

My personal view on this is that this gatekeeping we're doing here at the level of the scheme is not helpful or necessary. The fundamental problem here is that the design of the scheme itself has no bearing on how it might be implemented by an arbitrary gateway or handler. We can't know if it is safe. dweb.link might do the right thing, but that doesn't mean that all IPFS handlers will. What a good site does therefore has no bearing on how we decide, except to the extent that it is a demonstration that it is possible to do good things.

But that too is not that important. As a browser, we need to respect user choice, so while we might throw up some questions to ensure that a choice is deliberate, it doesn't make sense to stop a scheme from being passed to the OS (where it might be mistreated, as noted) or to a site that the user has chosen to use. Either is a valid choice. What we've done, though, is hobble the in-browser option, even though it is safer than throwing it to the OS. The safelists and other restrictions, while they might sound good, don't necessarily do users any favours if less safe options are their only recourse. So while the criteria James lists sound excellent, I don't think that they are helping the cause.

We should probably limit our involvement to blocking handlers or schemes that either cannot be safely implemented at all, or have shown to be consistently unsafe to the point that blocking them might be necessary to protect many users. And lose the safelist.

I see that we now have #644 for ipfs/ipns. If we conclude that no safelist is needed, then that might be moot, but I'll leave it open there as it is a well-formulated issue on which we might be able to conclude more quickly than this bigger discussion.

annevk commented 2 years ago

I think that analysis is skipping over the "how do end users deal with this" question. With well-established features such as mailto, it's relatively straightforward to envision UI asking the user whether a certain website can take on the task of handling email addresses (you probably wouldn't even put mailto in the UI). That's a lot less clear if mailto becomes an arbitrary string. (Of course, this is a pre-existing flaw with the web+ schemes, but I'm not sure why we'd build on that further.)

martinthomson commented 2 years ago

Yeah, I do downplay the significance of the question. That is because the alternative is a dialog that is equally incomprehensible, but leads more directly to an outcome that is possibly a lot worse (sending the URL to the OS and a native app). So you can see why I might take that shortcut.

annevk commented 2 years ago

That always happens (modulo some newish dialogs) if the user hasn't granted a website access to a scheme. Allowing websites to register for more schemes won't really address that problem in any meaningful way I think.

(Edit: your use of "alternative" is a bit confusing as one doesn't lead directly to the other. Registering an "unknown" scheme and clicking a link with an "unknown" scheme are different interactions and both need suitable flows.)

OR13 commented 2 years ago

Registering an "unknown" scheme and clicking a link with an "unknown" scheme are different interactions and both need suitable flows.

+1 to this framing.

This reminds me of the similar position regarding web serial apis, the objection argument there was that:

Since we don't know what this API will do, how can we ask the user for consent?

... this must be exactly how a web browser feels right before it opens a URL for the user... except the browser probably knows the scheme of the URL... and is guessing there are no active 0-days that could make the "known scheme" do "unknown" things.

martinthomson commented 2 years ago

That distinction is helpful, yes. But let's take a step back and examine what happens when a URL that is natively unknown to the browser is presented. There are three cases:

  1. The OS can handle it. The browser generally shows a prompt, offering the user an option to pass this off to some application. If a previous choice has been made regarding this (which Firefox now binds to the site as well as the type of URL), then the choice is bypassed.
  2. The OS can't handle it. Firefox at least usually fails to navigate at this point, effectively doing nothing.
  3. A handler can be installed whereby the browser navigates to a web page that has been registered to handle the URL. Usually, this doesn't involve any prompting once the handler is installed; the browser just runs snprintf on the URL template (e.g., https://handler.example?url=%s) to produce an HTTPS destination to navigate to.
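The substitution in case 3 can be sketched as follows. This is assumed behavior distilled to one line; the template and URL are illustrative:

```javascript
// Sketch of the handler-template substitution: the browser
// percent-encodes the full URL being navigated to and substitutes it
// for the "%s" placeholder in the registered handler template.
function applyHandlerTemplate(template, url) {
  return template.replace("%s", encodeURIComponent(url));
}
```

The result is an ordinary HTTPS navigation, which is why this path stays inside the browser sandbox rather than handing the URL to a native application.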

We only get into this last state because the user was previously presented with a choice about the URL scheme. My assertion here is that we already have user engagement around following links to URLs of unknown schemes. The choice to turn the link into an HTTPS one is inherently safer for a user's security than a choice that might end up sending messages to native applications.

There are differences in how these choices are designed, presented, and so forth. These do matter. In Firefox at least, the choice is made once for registering a URL handler. This makes that choice a much more significant decision than it might be if the "handle this URL in X" question were presented when following a link. But this remains something that browsers can iterate on without necessarily engaging in specifications.

The safelist on the other hand is an imposition in specification. It says that only these schemes are safe to implement, but it only applies that to implementations that are also on the web. If you implement them natively, it says, go ahead. That native implementation is routinely where the risk lies. Managing that risk is left to a simple dialog and user choice.

A web implementation of a scheme could also present a dialog, but that seems to be delegated to the safelist, with the effect being that we have to carefully vet everything. I don't see how that does anyone any favours.

annevk commented 2 years ago

The choice to turn the link into a HTTPS one is inherently safer for a user's security than a choice that might end up sending messages to native applications.

Given that some operating systems seem to rely on URL schemes to dispatch between native applications (macOS and iOS have been doing this since forever, see https://developer.apple.com/documentation/xcode/defining-a-custom-url-scheme-for-your-app and https://medium.com/@contact.jmeyers/complete-list-of-ios-url-schemes-for-apple-apps-and-services-always-updated-800c64f450f to get a sense of how widespread this is) I'm not so sure that letting websites hijack those relationships is always going to be better for end users.

aredridel commented 2 years ago

This absolutely sounds like something that needs user consent, and designing UI that meaningfully solicits (and makes revocable) that consent has always proven underwhelming. We're starting to see designs for such things (providing some context, with revocability buried a bit in a website) with OAuth2 screens. I'd love to see those things developed further for an in-app context for managing changes like this.

martinthomson commented 2 years ago

@annevk

operating systems seem to rely on URL schemes to dispatch between native applications

Ah, that's the source of the disconnect. I was only considering the potential for registerProtocolHandler() to govern URL activation within the browser and only for top-level navigations. I definitely agree that allowing a site to interpose on inter-app communications in the OS is undesirable.

To the extent that a safelist contributes to allowing a site to interact with URIs at the OS level, I agree that we need this sort of vetting. Creating a distinction between this more comprehensive (OS-level) access and the way that registerProtocolHandler() functions (top-level navigations) is probably necessary.

FWIW, the spec isn't particularly clear as it relates to this distinction. I love this bit: "User agents may, within the constraints described, do whatever they like."

javifernandez commented 2 years ago

I find the definition of the 'safelist' concept given by @jgraham in this comment very useful. I couldn't find it in the spec; is that deliberate? I wonder whether there is enough consensus about it to consider its inclusion in the spec.

I think it would make it easier to resolve ongoing and future issues about adding or removing schemes from the list.

domenic commented 2 years ago

https://html.spec.whatwg.org/#safelisted-scheme

javifernandez commented 2 years ago

https://html.spec.whatwg.org/#safelisted-scheme

Yes, I meant that it'd be useful to add a clear definition there of the requirements for being on that list. I thought the definition given by @jgraham in the comment above could be a good start.

domenic commented 2 years ago

Oh. The requirements for modifying that list are given at https://whatwg.org/working-mode#changes .

zcorpan commented 8 months ago

Suggest defer as per https://github.com/whatwg/html/issues/9158