mozilla / standards-positions

https://mozilla.github.io/standards-positions/
Mozilla Public License 2.0

Storage Access Headers #1084

Open cfredric opened 4 weeks ago

cfredric commented 4 weeks ago

Request for Mozilla Position on an Emerging Web Specification

Other information

The Storage Access Headers proposal creates new HTTP request and response headers to enable authenticated embeds to access third-party cookies, even without an iframe, via existing storage-access permission grants.

bvandersloot-mozilla commented 5 days ago

This proposal is interesting, and makes a lot of sense for subdocuments in particular.

For Gecko, this would be the only (non-heuristic) way to get access to unpartitioned cookies for non-iframed resources, so by including other subresources, there are a lot of additions this implies for us.

A few questions:

Would this end up merging into whatwg/fetch?

Why does allowed-origin exist? The client must send the Origin header in the request so the server should be parsing and handling that appropriately.

In a similar vein, is it possible to remove the retry value entirely by having the server send a 307 redirect to the same resource?

I'm also still not certain that more tightly integrating with CORS is a bad idea. "Then this would mean the embedded site would be required to allow the top-level site to read the bytes of its responses and response headers," isn't necessarily true depending on the definition of the CORS protocol. Opacity is within that algorithm's control. And you are already mandating the use of the Origin header for requests with Sec-Fetch-Storage-Access: inactive, so in some sense you are already integrating with CORS :). I think by not making this explicit up front, there is a risk of running into corner cases down the road.

cfredric commented 2 days ago

Hey Ben, thanks for taking a look!

> Would this end up merging into whatwg/fetch?

Yes, ideally; I'm still working on the spec (aiming to have something out ASAP), and it builds on top of Fetch, Storage Access API, and Fetch Metadata.

> Why does allowed-origin exist? The client must send the Origin header in the request so the server should be parsing and handling that appropriately.

Briefly, it exists for the same reason the Access-Control-Allow-Origin header exists. The server ought to parse and handle the Origin header appropriately, but people will take shortcuts if they can, and/or make mistakes. It's safer for the browser to require an explicit, accurate, and informed signal of opt-in. (Note that wildcards are supported, so this shouldn't be an overly onerous requirement if the site really doesn't care.)
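As a rough illustration of that opt-in check, here is a minimal Python sketch of how a browser might compare the Origin it sent against the `allowed-origin` parameter before honoring a `retry`. The header value syntax (`retry; allowed-origin="..."`, with `*` as the wildcard) follows my reading of the explainer and may change while the spec is in progress; the function names are hypothetical, and the parsing is deliberately naive rather than a full structured-field parser.

```python
def parse_activate_header(value: str) -> tuple[str, dict[str, str]]:
    """Split e.g. 'retry; allowed-origin="https://a.example"' into (token, params).

    Naive parsing for illustration only; assumes no ';' inside parameter values.
    """
    parts = [p.strip() for p in value.split(";")]
    token = parts[0].lower()
    params: dict[str, str] = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, raw = part.partition("=")
            params[key.strip().lower()] = raw.strip().strip('"')
    return token, params


def browser_should_retry(request_origin: str, header_value: str) -> bool:
    """Honor 'retry' only if the server explicitly named our origin (or '*')."""
    token, params = parse_activate_header(header_value)
    if token != "retry":
        return False
    allowed = params.get("allowed-origin")
    if allowed is None:
        # No explicit opt-in: the browser ignores the retry request.
        return False
    return allowed == "*" or allowed == request_origin
```

The point of the sketch is that a server which blindly emits `retry` without naming an origin gets nothing, so a careless server fails closed rather than open.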

> In a similar vein, is it possible to remove the retry value entirely by having the server send a 307 redirect to the same resource?

If we got rid of the retry token and used a 307 instead, that would change the semantics of preexisting 307s out on the web, and make them suddenly less secure than they'd otherwise be (when third-party cookies are blocked).

I understand the desire to reuse a preexisting mechanism instead of inventing a new one, but we're talking about a signal that relaxes a security boundary. That really requires us to invent something new, otherwise we'd unnecessarily regress the security of whatever existing thing we piggyback on. That would have bad consequences for existing sites out on the web, which are avoidable if we create something new.

> I'm also still not certain that more tightly integrating with CORS is a bad idea.

I think if we were starting from square 0 and the web hadn't been using CORS for almost 20 years already, then it would make sense to design a single thing that provides both this feature and what CORS provides today. But it's impractical to try to redesign CORS at this stage and make it do something that none of the web's server deployments planned for. I designed SAH so it could be dropped into the existing web, with minimal change/breakage for any sites that don't use it; I think that is a more practical and more useful property than being tightly integrated with CORS.

> And you are already mandating the use of the Origin header for requests with Sec-Fetch-Storage-Access: inactive, so in some sense you are already integrating with CORS :)

The Origin header already carries precisely the meaning we want to convey; its meaning/semantics are not changing through our additional integration with it, so there's no point in inventing something new here. (In contrast, the rest of CORS does carry a different meaning from the API we want to add, and it is already used in places where SAH should probably not be; and an integration with SAH would indeed change the meaning of Access-Control-Allow-Origin, say.)
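To make the request side concrete, here is a minimal Python sketch of how an embedded server might branch on `Sec-Fetch-Storage-Access` and `Origin`. Everything beyond those two header names is an assumption for illustration: the `Activate-Storage-Access` response header name and the `none`/`inactive`/`active` values come from my reading of the explainer, while the allowlist, the status codes, and the function name are hypothetical.

```python
# Hypothetical allowlist of top-level sites this embed trusts.
TRUSTED_EMBEDDERS = {"https://embedder.example"}


def respond(headers: dict[str, str]) -> tuple[int, dict[str, str]]:
    """Return (status, response headers) for an embedded, credentialed resource."""
    storage_access = headers.get("Sec-Fetch-Storage-Access", "none")
    origin = headers.get("Origin")

    if storage_access == "active":
        # Unpartitioned cookies were attached; serve the authenticated response.
        return 200, {}

    if storage_access == "inactive" and origin in TRUSTED_EMBEDDERS:
        # A permission grant exists but is not active: ask the browser to
        # retry, naming the origin we are opting in for.
        return 401, {"Activate-Storage-Access": f'retry; allowed-origin="{origin}"'}

    # No grant, or an embedder we do not trust: no opt-in signal at all.
    return 401, {}
```

Note that the server only reads `Origin` here; it does not emit any Access-Control-* headers, so the top-level site gains no ability to read the response.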

bvandersloot-mozilla commented 1 day ago

> > Why does allowed-origin exist? The client must send the Origin header in the request so the server should be parsing and handling that appropriately.

> Briefly, it exists for the same reason the Access-Control-Allow-Origin header exists. The server ought to parse and handle the Origin header appropriately, but people will take shortcuts if they can, and/or make mistakes. It's safer for the browser to require an explicit, accurate, and informed signal of opt-in. (Note that wildcards are supported, so this shouldn't be an overly onerous requirement if the site really doesn't care.)

That's a great argument. I think I'm sold there :)

> > In a similar vein, is it possible to remove the retry value entirely by having the server send a 307 redirect to the same resource?

> If we got rid of the retry token and used a 307 instead, that would change the semantics of preexisting 307s out on the web, and make them suddenly less secure than they'd otherwise be (when third-party cookies are blocked).

I think I wasn't clear. I'm not suggesting that the 307 be given any special power to grant storage access. Instead, I'm saying that "retry" is pretty ambiguous, and that there are two similar-looking but different mechanisms in browsers right now: a self-redirect via 307 + Location header, and Refresh: 0. I suppose "retry" could be syntactic sugar for one of these, but we should specify which. It makes sense to me for it to happen at the network level, but that gets into layering issues between HTTP semantics and Fetch that I'm not sure I understand.

> > I'm also still not certain that more tightly integrating with CORS is a bad idea.

> I think if we were starting from square 0 and the web hadn't been using CORS for almost 20 years already, then it would make sense to design a single thing that provides both this feature and what CORS provides today. But it's impractical to try to redesign CORS at this stage and make it do something that none of the web's server deployments planned for. I designed SAH so it could be dropped into the existing web, with minimal change/breakage for any sites that don't use it; I think that is a more practical and more useful property than being tightly integrated with CORS.

> > And you are already mandating the use of the Origin header for requests with Sec-Fetch-Storage-Access: inactive, so in some sense you are already integrating with CORS :)

> The Origin header already carries precisely the meaning we want to convey; its meaning/semantics are not changing through our additional integration with it, so there's no point in inventing something new here. (In contrast, the rest of CORS does carry a different meaning from the API we want to add, and it is already used in places where SAH should probably not be; and an integration with SAH would indeed change the meaning of Access-Control-Allow-Origin, say.)

Sure, I agree with this. I think we may have different definitions of "integrating". In Fetch, a CORS request is defined as an HTTP request that includes an Origin header, even if it doesn't participate in the CORS protocol. I don't assume that integrating with CORS means these requests participate in the CORS protocol. I just assume that, because both touch the same header and the server will be making decisions on credentials, they should be thought of together in design and documentation to prevent weirdness and rough edges. Perhaps their use will have less overlap than I expect, so this would be a non-issue. But on the other hand, Access-Control-Allow-Credentials seems pretty strange as a mechanism without storage access integration IMO. Has Anne given any thought to this?

This discussion also gave me another realization: the Origin header isn't sent for non-CORS-mode GET requests, because of compatibility constraints. That means that for subdocument loads or image loads to use SAH without top-level document changes, you need to change the rules for when the Origin header is sent, potentially affecting compat. Have you given this thought?
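For reference, the current rule can be approximated with a short Python sketch. This is a simplified reading of Fetch's "append a request Origin header" steps, assuming a cross-origin request; it ignores referrer-policy handling (which can turn the value into `null` but does not suppress the header), and the function name is hypothetical.

```python
def sends_origin_header(mode: str, method: str) -> bool:
    """Approximate whether a cross-origin request carries an Origin header today."""
    if mode in ("cors", "websocket"):
        # CORS-mode and WebSocket requests always carry Origin.
        return True
    # Otherwise Origin is only added for methods other than GET and HEAD.
    return method.upper() not in ("GET", "HEAD")


# A plain <img> or <iframe> load is a no-cors or navigate GET, so under this
# rule it would not carry the Origin header that SAH relies on.
print(sends_origin_header("no-cors", "GET"))
print(sends_origin_header("navigate", "GET"))
print(sends_origin_header("cors", "GET"))
```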