w3c / webappsec-fetch-metadata

Fetch Metadata
https://w3c.github.io/webappsec-fetch-metadata/
Other
75 stars 28 forks source link

Alternative to `Sec-Fetch-Site: none` #20

Closed mikewest closed 5 years ago

mikewest commented 5 years ago

The current document defines a none value for Sec-Fetch-Site which we'd use when requests were caused by something outside the context of the web. Users typing URLs into address bars, for example, or clicking bookmarks.

@anforowicz suggests that we do this differently, defining these instead as cross-site requests, and adding another header that indicates that a given navigation isn't web-initiated. For example, navigating from the address bar might result in the following headers:

Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigation
Sec-Fetch-Site: cross-site
Sec-Fetch-Totally-Not-Web-Initiated-At-All: ?T
Sec-Fetch-User: ?T

I think I prefer Sec-Fetch-Site: none. This proposal is a bit more verbose, and I like the fact that adding a token to Sec-Fetch-Site forces developers to make a decision about how they want to deal with non-webby navigation.

That said, I can live with this proposal if others prefer it. WDYT?

@arturjanc I imagine this wouldn't influence y'all's deployment plans one way or the other?

anforowicz commented 5 years ago

I would very much like to keep Sec-Fetch-Site focused on origins/sites. My hidden motivation here is ease of implementation, but I consider the implementation complexity to be a smell that indicates that the spec for Sec-Fetch-Site: none is not well aligned with the underlying properties that we are trying to expose through Sec-Metadata headers.

My concerns about step 5 (*) in the current proposal:

I understand that correctly/securely serving HTML documents might require additional knowledge that might not be captured in the current Sec-Fetch-Site, Sec-Fetch-Dest, Sec-Fetch-Mode and Sec-Fetch-User. OTOH, I think that Sec-Fetch-Site as used-to-be-spec-ed (just cross-site, same-site and same-origin) is 1) non-controversial and ready to ship and 2) going to provide great value for securely serving non-HTML documents. I also note that cross-origin HTML documents should be protected to some extent by CORB (although I recognize that this is a different kind of protection than what Sec-Metadata aims to achieve). Therefore, maybe Sec-Fetch-Site can be shipped as-speced-today in Sec-Metadata-v1 and we can discuss how to capture browser-initiatedness (in a new header? in one of existing headers?) in Sec-Metadata-v2?

(*) If r is a navigation request that was explicitly caused by a user’s interaction with the user agent (by typing an address into the user agent directly, for example, or by clicking a bookmark, etc.), then set header’s value to none.

mikewest commented 5 years ago

I would very much like to keep Sec-Fetch-Site focused on origins/sites.

Wouldn't this lead you to be more favorable to none? That is, what's the "site" for the address bar? cross-site doesn't seem accurate, as there's nothing to compare against. :)

it seems a bit ad-hoc (why only "navigation request" and not downloads [save-page-as currently refetches AFAIK] and/or prefetches [i.e. html trying to prefetch/precache the "next" link] and/or other-non-web-initiated-requests [like traffic related to omnibox search suggestions])

Taking these questions independently:

it makes Sec-Fetch-Site deal with more than just sites/origins

As above, I think I just disagree here. It's precisely because we care about origins (and we don't have one in this case) that none feels more appropriate than cross-site.

it is not fully defined

That's fair! But this applies equally to the counterproposal, doesn't it? Wouldn't we need to run through the same set of questions to determine the value of Sec-Fetch-Totally-Not-Web-Initiated-At-All?

Therefore, maybe Sec-Fetch-Site can be shipped as-speced-today in Sec-Metadata-v1

As specced today, we do the logical equivalent of a null-dereference in step 4.2 of https://mikewest.github.io/sec-metadata/#abstract-opdef-set-site (I meant to put the existing step 5 above that block).

@arturjanc has told me that in the absence of knowledge about the user's involvement in the navigation, sending Sec-Fetch-Site: cross-site would not be deployable. I'm not convinced that sending same-site is a great idea.

I think we need to get this worked out in v1. Punting it doesn't seem viable, as this bit seems like the core of a number of deployment scenarios.

arturjanc commented 5 years ago

I agree with @mikewest above. First, putting aside the question of how exactly to express the "was not web-initiated" information, it seems important to define/include this in v1. Otherwise, if some browser-initiated requests appear as site=cross-site (see e.g. crbug/868286) it will lead to breakage in applications which want to prevent external navigations, but don't want to reject browser-initiated traffic, which we can generally assume to be safe.

As to the actual method of surfacing this information, I think Sec-Fetch-Site: none most closely matches my mental model here. For example, as a developer of a sensitive application, I'd like to be able to write code such as: if req.sec_fetch_site and req.sec_fetch_site == "cross-site": # Reject or redirect to "/"

Basically, site captures important information about the provenance of the request; everything web-initiated is covered by same-origin, same-site, cross-site, but since servers have to deal with browser-initiated requests for which the concept of a requesting origin does not apply, having an explicit value to identify them seems fairly clean.

If we include equivalent data in another header, this would also be workable for developers, but it would lead to more complex server-side logic. For example, if browser-initiated traffic is decoupled from site and has the Sec-Fetch-Non-Web: ?T header, developers would have to look in two different places every time they want to identify the source of the request. I guess another possibility would be to remove Sec-Fetch-Site altogether for browser-initiated requests but that's also somewhat awkward.

As an aside, I'm a bit surprised that this hasn't come up in the context of SameSite cookies, which seem to need the same concept.

anforowicz commented 5 years ago

It's precisely because we care about origins (and we don't have one in this case) that none feels more appropriate than cross-site.

Wouldn't this lead you to be more favorable to none? That is, what's the "site" for the address bar? cross-site doesn't seem accurate, as there's nothing to compare against. :)

As specced today, we do the logical equivalent of a null-dereference in step 4.2 of https://mikewest.github.io/sec-metadata/#abstract-opdef-set-site (I meant to put the existing step 5 above that block).

Well, in my mind, browser-initiated navigations have an opaque initiator (maybe this is a wrong intuition, but it so far it seemed to match what the code does in practice).

I think we can agree that about:blank frames get their origin from the initiator of the navigation, right? I've tried to navigate to about:blank from omnibox in Chrome, Firefox and Safary and then looking at window.origin in DevTools console - in all 3 cases it said "null" rather than "none". It might be fine to introduce "none" as a distinct value from "null", but it seems to me that this should be done consistently (i.e. the new value should not only be used for computing Sec-Fetch-Site but also should be visible in window.origin and/or considered in things like SameSite cookies).

I see that net::URLRequest::initiator() returns a base::Optional<url::Origin> so there is a distinction between "none" and "null" at that level. OTOH, I don't think this distinction was made in specs and/or made web visible (although I hear this distinction is exposed through some WebRequest extension APIs as pointed out by @naskooskov).

Can you be more specific about "other non-web-initiated requests"? My intuition is that many things you're talking about don't actually run through Fetch.

One example I had in mind was search suggestions for omnibox.

That's fair! But this applies equally to the counterproposal, doesn't it? Wouldn't we need to run through the same set of questions to determine the value of Sec-Fetch-Totally-Not-Web-Initiated-At-All?

Yes - the counterproposal would also need to be fully defined. By proposing alternatives I hoped to 1) keep Sec-Fetch-Site small and 2) shippable earlier.

I think we need to get this worked out in v1. Punting it doesn't seem viable, as this bit seems like the core of a number of deployment scenarios.

First, putting aside the question of how exactly to express the "was not web-initiated" information, it seems important to define/include this in v1. Otherwise, if some browser-initiated requests appear as site=cross-site (see e.g. crbug/868286) it will lead to breakage in applications which want to prevent external navigations, but don't want to reject browser-initiated traffic, which we can generally assume to be safe.

Would it be fair to say that the concern above is mostly limited to HTML content?

Also - can Sec-Fetch-Site: none be approximated by observing Sec-Fetch-Site: cross-site, Sec-Fetch-Mode: navigate and Sec-Fetch-Dest: document? Are you concerned about web-initiated navigations in pop-ups or main frames?

mikewest commented 5 years ago

Well, in my mind, browser-initiated navigations have an opaque initiator (maybe this is a wrong intuition, but it so far it seemed to match what the code does in practice).

That isn't my mental model. In particular, I think there's a real difference between navigations initiated from within <iframe sandbox="allow-top-navigation"> and navigations initiated from the address bar. It seems critical to me to allow sites to treat the former with quite a bit of suspicion, and the latter with less.

One way of allowing sites to do that work would be to send Sec-Fetch-Site: cross-site for both, and to send an additional header to demarcate the address bar's navigation (Sec-Fetch-Totally-Not-Web-Initiated-At-All: ?T).

Another way of allowing sites to do that work would be to send Sec-Fetch-Site: cross-site for the sandboxed frame, and Sec-Fetch-Site: none for the address bar.

Both would work. I prefer the latter, as I noted in the initial comment in this bug, because I like the fact that adding a token to Sec-Fetch-Site forces developers to actively make a decision about how they want to deal with non-webby navigation, and therefore seems more likely to me to result in reasonable experiences for users who type things into the address bar and use bookmarks.

the new value should not only be used for computing Sec-Fetch-Site but also should be visible in window.origin

I disagree. Sec-Fetch-Site is not an origin, but an enum. It does not need to match the value of window.origin for about:blank navigation.

and/or considered in things like SameSite cookies

For SameSite cookies, we poked at this use case by talking about the request's client (in step 1 of https://tools.ietf.org/html/draft-ietf-httpbis-rfc6265bis-02#section-5.2), considering browser-initiated navigations (which have a null client AFAIK) as same-site for the purposes of cookie delivery.

One example I had in mind was search suggestions for omnibox.

I don't follow the risk here: Chrome generates requests in response to the user's interaction with the omnibox. It seems like we'd deliver those requests with Sec-Fetch-Site: none under the current proposal. Can you help me understand the cross-origin disclosure risk we'd create (or fail to mitigate) in that scenario?

Also - can Sec-Fetch-Site: none be approximated by observing Sec-Fetch-Site: cross-site, Sec-Fetch-Mode: navigate and Sec-Fetch-Dest: document

I don't think it can. The critical information is whether the navigation was initiated by the browser or by a website. We need to deliver that information in some way, as discussed above.

Are you concerned about web-initiated navigations in pop-ups or main frames?

Yes.

anforowicz commented 5 years ago

I think I withdraw my objections to having Sec-Fetch-Site: none request header in v1 of Sec-Metadata - thank you for bearing with me and patiently responding to my concerns.

I still think that we need to define more carefully when Sec-Fetch-Site: none should be present - we should explicitly consider the expected behavior for various scenarios (bookmarks, open-in-new-tab, dropping a bookmark or a link onto the tab strip, etc.). I've tried to enumerate relevant scenario is a document here.

One example I had in mind was search suggestions for omnibox.

I don't follow the risk here: Chrome generates requests in response to the user's interaction with the omnibox. It seems like we'd deliver those requests with Sec-Fetch-Site: none under the current proposal. Can you help me understand the cross-origin disclosure risk we'd create (or fail to mitigate) in that scenario?

Under the current proposal search suggestions traffic would get no Sec-Fetch-Site: none, because this traffic is not related to navigations (and the current proposal only asks for Sec-Fetch-Site: none in navigation requests).

We can go either way here, but we should be explicit. If we say that no request header is needed for search suggestions traffic, then the server won't be able to distinguish search suggestions traffic coming from browsers that don't support Sec-Metadata VS traffic from browsers that do support Sec-Metadata. This seems slightly undesirable, but maybe this is okay.

mikewest commented 5 years ago

I think I withdraw my objections to having Sec-Fetch-Site: none request header in v1 of Sec-Metadata - thank you for bearing with me and patiently responding to my concerns.

Great! Thanks for pushing on this. I appreciate the work you're putting into helping us flesh out the threat model and clarify our definitions!

Under the current proposal search suggestions traffic would get no Sec-Fetch-Site: none, because this traffic is not related to navigations (and the current proposal only asks for Sec-Fetch-Site: none in navigation requests).

I misunderstood your scenario, thanks for the clarification. I thought you were referring to navigational prefetches (e.g. the user agent thinks it's really likely that you're going to visit the URL you've typed in the omnibox, and loads the HTML file before you hit "Go"). You're actually talking about browser-initiated requests for data based on user input. Similar questions would arise for other parts of Chrome that load data (Safe Browsing, for instance).

I think user agents have a lot of freedom here, as those requests don't have much of anything to do with HTML or Fetch. I agree with you that we should define a reasonable set of rules internally for the way we represent those requests, and if we can do so consistently with other vendors, so much the better!

@arturjanc: WDYT about this scenario?

I've tried to enumerate relevant scenario is a document here.

This looks very helpful! Thank you. I'll see what I can do about documenting these scenarios in some sort of "Implementation Considerations" section in the spec. That might help vendors align on behavior, even if we can't test much of it via WPT.

arturjanc commented 5 years ago

Under the current proposal search suggestions traffic would get no Sec-Fetch-Site: none, because this traffic is not related to navigations (and the current proposal only asks for Sec-Fetch-Site: none in navigation requests).

You're actually talking about browser-initiated requests for data based on user input. Similar questions would arise for other parts of Chrome that load data (Safe Browsing, for instance).

It seems the most consistent to me to also send Sec-Fetch-Site: none on non-navigational requests made internally by the browser. I think we focused on browser-initiated navigational requests because it was easier to recognize that developers need a way to handle them, but it makes a lot of sense to apply the same logic to other browser requests.

In practice, this will likely only matter for a few endpoints across a small number of entities from which browsers fetch data (search autosuggestion providers, Safe Browsing, etc.) so -- while we should certainly cover it in the spec -- I don't necessarily expect a lot of breakage potential here.

mikewest commented 5 years ago

I'm going to close this out, as I think we've agreed on Sec-Fetch-None, and I've added some non-normative text to the spec around the kinds of considerations user agents need to make when considering navigation triggered by non-webby activity.

I'm happy to add more detail to that section, or clean up anything missing from the none description, but let's file new issues to cover specific concerns.

Thanks for pushing on this, @anforowicz! I think we're in a much clearer state because of your feedback!