whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/
Other
2.11k stars 331 forks source link

Allow more flexibility in how 401s/407s are handled? #1132

Open estark37 opened 3 years ago

estark37 commented 3 years ago

About a year ago, I refactored how HTTP authentication prompts work in Chrome, and I recently learned (https://bugs.chromium.org/p/chromium/issues/detail?id=1159476) that this refactor is kind of incompatible with fetch() and Service Workers, specifically for main-frame main resources. Main resources have some special considerations when they trigger auth prompts, and it's not entirely clear how to make Chrome spec-compatible, so I wanted to start a discussion here. The tl;dr is that in Chrome we want to mitigate spoofing risks by not showing auth prompts for main resources mid-request, and this is at odds with what Fetch says.

For background, main resources are special when it comes to auth prompts because of the risk of user confusion/spoofing. If a user is navigating from a.com to b.com and the navigation triggers an auth prompt, the user might become confused and enter a.com credentials in the prompt, not realizing that they will be sent to b.com. Different browsers do different things about this risk:

Belatedly, I now realize that Chrome's new approach violates the Fetch spec. Fetch says that when a network fetch results in a 401 or 407 response, the browser should prompt for credentials and then do another network fetch with those credentials and return that second network fetch as the response -- but Chrome doesn't want to show an auth prompt in the middle of a main resource fetch. This means that if a Service Worker intercepts a main resource fetch and passes the request into fetch(), it can receive a 401 or 407 response as the result, without the user being prompted for credentials. I'm hand-waving away some implementation details here, but I think the important point is that Fetch wants the browser to show an auth prompt mid-fetch, and Chrome explicitly wants to avoid doing that for main resources to implement a clear visual distinction between the old page and the new page.

I can't really think of a way to resolve these tensions -- adhering to Fetch while mitigating the spoofing risk while avoiding all the problems that overlay interstitials gave us -- hence I'm wondering if Fetch can give browsers more latitude in how they handle auth prompts by allowing browsers to return network errors if it's not safe to obtain credentials from the user at that moment:

Another option would be to return the 401/407 response directly (rather than a network error) if it's not safe to obtain credentials from the user, but I think (?) that is more likely to have compatibility issues with existing Service Workers, because existing Service Workers might accidentally cache 401/407 responses.

(As an aside, it seems to me that Fetch doesn't currently allow for the possibility that the user declines to enter credentials, e.g. cancels the authentication prompt, so these changes would also fix that, by allowing the browser to return a network error if credentials can't be obtained from the user.)

I'm happy to send PRs if these changes are amenable, but wanted to discuss first to see if there are other options I'm not seeing.

Legion2 commented 3 years ago

How should service workers handle the network error resulting from the browser not prompting the user for username and password?

I'm affected by the bug in chrome, my page uses a service worker and is protected with Basic auth. The page can not load in chrome because all service worker requests are Unauthorized (401) and even when reloading the page chrome does not prompt for the basic auth credentials. The only workaround currently is to open a protected resource in a new tab which is not cached in the service worker to trigger the auth prompt and the reload the page with the service worker.

estark37 commented 3 years ago

How should service workers handle the network error resulting from the browser not prompting the user for username and password?

Service workers wouldn't need to handle the network error, similarly to e.g. an SSL certificate error. I suppose it could return a cached response if it wanted to, but otherwise the browser would show an auth prompt and then resend the request with credentials once provided.

annevk commented 3 years ago

I do think this is something we should standardize and align between browsers, including not handling these for subresources.

It seems like a new capability for content to be able to access 407 responses though. How are you handling that? Or you treating them as opaque in some manner?

cc @Trikolon @johannhof

estark37 commented 3 years ago

It seems like a new capability for content to be able to access 407 responses though. How are you handling that? Or you treating them as opaque in some manner?

Right now our Chrome implementation is just buggy; we return the 401/407 response to the Service Worker and we don't show the auth prompt at all. That's clearly not the right behavior. I agree that it doesn't seem quite correct to let content access the 401/407, but in practice I'm not sure it's an entirely new capability. In our old overlay model (or in e.g. Safari's current model), if the user hits Cancel on the auth prompt, then the request proceeds in order to display the response body from the server. I suspect that in this case the SW will see the 401/407 response just like it would see a 404. (I have to verify this, though.)

Even if it is the case that SWs can sometimes access 401/407 responses today, I suspect that's probably an accident of implementation and not an intentional decision. Making a 401/407 response opaque seems reasonable, I didn't think of that -- though I'd still be a bit worried about SWs caching and reusing it unintentionally. Do you think treating it as a network error would also be an option?

annevk commented 3 years ago

if the user hits Cancel on the auth prompt, then the request proceeds in order to display the response body from the server.

Ah yeah, that's a good point. I haven't played with 407 and naïvely thought that if that was declined you would end up with a network error. If they are no different from 401s the new capability is mostly getting access to them without user interaction.

we return the 401/407 response to the Service Worker and we don't show the auth prompt at all.

So, I would not expect a prompt when you hand the response to the service worker, but rather when the service worker hands the response to the browsing context navigating. Or is that what you meant?

The main risk I see with revealing 407 is that sites can figure out the user is behind a proxy. As discussed in https://github.com/whatwg/fetch/issues/1007 there might be various ways to do this already, but perhaps this gives them more bits of information? Unsure. (I don't think it's a problem for sites to access a 401, they can do so already with fetch().)

(Another somewhat-related question I have is how you handle cross-origin nested browsing contexts navigating to such responses, but maybe we should tackle that separately.)

estark37 commented 3 years ago

So, I would not expect a prompt when you hand the response to the service worker, but rather when the service worker hands the response to the browsing context navigating. Or is that what you meant?

Yes, that's what I meant. Currently we hand the response back to the Service Worker which hands it to the navigating browsing context and we never show a prompt. What I'd like to do is: 1.) Hand a network error to the Service Worker, who can hand that or a cached response to the navigating browsing context 2.) Show a prompt when the navigation commits, if it's the network error

(Another somewhat-related question I have is how you handle cross-origin nested browsing contexts navigating to such responses, but maybe we should tackle that separately.)

We'd really like to get rid of prompts from cross-origin subresources and nested browsing contexts altogether, but not sure if we'll be able to do that in the foreseeable future (https://bugs.chromium.org/p/chromium/issues/detail?id=400380#c24). In the meantime, we only use the interstitial for top-frame navigations, not nested browsing contexts, since the latter usually already has the ability to spoof content in the context of the parent page.

annevk commented 3 years ago

That sounds reasonable, except it has to be some kind of opaque response instead. We don't have a type for a network error (it's a generic exception you cannot really pass on).

estark37 commented 3 years ago

Ok, I think that should work. Later this week I'll work on a prototype to do that to fix Chrome's implementation and a PR. Thanks!

kheldorn commented 3 years ago

Has there been any update on this issue?

estark37 commented 3 years ago

Sorry, it fell off my stack. I'll try to get back to it this week.

wanderview commented 3 years ago

Would this mean creating a new filtered response type like opaqueredirect? So like a new opaqueauth or something?

estark37 commented 3 years ago

@wanderview is there a reason it'd need to be a new response type instead of just opaque?

wanderview commented 3 years ago

Currently opaque means its a cross-origin response loaded without CORS. So we have a hard block on supporting those for navigations. See step 3.2.3 here (mode will be navigate for document loads):

https://fetch.spec.whatwg.org/#http-fetch

My skim of the issue suggested to me these would be same-origin to the request url and return a 40x status code. In that regard they are very similar to navigation redirects. When redirect mode is manual, like during navigation, we return an opaqueredirect instead that allows the status code to flow to the outer navigation machinery while hiding the location header and whatnot from the SW script.

The parallels just made me wonder if we should build something similar for auth. That seems like an easier path than trying to use opaque.

But its certainly possible I misunderstood and these are cross-origin?

estark37 commented 3 years ago

I see, thanks for the explanation! I think you're probably right -- it's same-origin to the request url and returning a 40x status code, and I want to be able to hand something opaque to the SW that it can pass on, to signify that there has been a 401/407 response without the SW script being able to access the actual response.

estark37 commented 3 years ago

Update: I've fixed the glaring functional bug in Chrome 90, so that the prompt now shows when the Service Worker hands the response back to the navigating browsing context. I've also prototyped an opaqueauth response type in Chrome/Blink and sent that off to @wanderview to see if it looks reasonable; if it does, I'll work on a spec PR next.

annevk commented 2 years ago

I'm curious if there are any updates to this. It seems that if we end up returning 401/407 to attacker-controller processes #728 becomes more important.

estark37 commented 2 years ago

I'm on parental leave until next week, I'll take a look then to load this back into my brain and try to remember where I left off.

estark37 commented 2 years ago

Ok, I took a look back at what I had on this. I had a Chromium patch and a Fetch patch that look like they're in decent shape. But, to be completely honest, I remember that I discovered some problem with this approach, but now I can't remember what the problem was and of course I didn't write it down anywhere. 😫 I'm going to try to keep this in the back of my mind for the next few days and see if I magically remember what the problem was.

However, I'll note that, if I'm understanding #728 correctly, neither Fetch as currently written nor as patched above will ever return a 401/407 from HTTP-network-or-cache fetch. As currently written, HTTP-network-or-cache fetch always recurses on a 401/407 after attempting to gather credentials, and as patched above, it sometimes returns an opaque response. Does that resolve the concern in #728?

Edited to add: there might be implementations that inadvertently return 401/407s to SWs, but the spec isn't currently written to allow/condone that.

annevk commented 2 years ago

I think your last edit is what this issue is about and also what might be relevant for #728. The moment we start surfacing these responses to the website process to allow for an easier implementation of the navigate algorithm, we run into some new questions to address.

estark37 commented 2 years ago

Finally got a chance to play with this some more; sorry for the delay. Chrome does indeed expose 407 responses to the Service Worker at the moment. Haven't tested other browsers. As previously mentioned, this isn't a problem with the spec at the moment, it's a problem in Chrome's implementation.

If we go ahead with the changes referenced in https://github.com/whatwg/fetch/issues/1132#issuecomment-1139831381, such that an opaque response is returned for 401s/407s (both in the spec and the implementation), is that sufficient to resolve #728?

estark37 commented 2 years ago

Actually, re-reading #728, I think I misunderstood it; returning an opaque response to content doesn't really change anything, it's rather that CORB has to be applied to 401/407 responses.

annevk commented 2 years ago

Hmm, so maybe these responses are only surfaced for navigations? Because in that case the responses will always be same-origin with the "attacker" process and CORB/ORB don't really apply. They only come into effect for so-called "no-cors" requests. (Overall the plan still seems sound to me by the way.)