WICG / client-hints-infrastructure

Specification for the Client Hints infrastructure - privacy preserving proactive content negotiation
https://wicg.github.io/client-hints-infrastructure

Empty Accept-CH being used to clear Client Hint requests #155

Closed nicjansma closed 1 year ago

nicjansma commented 1 year ago

Hi everyone!

As we (Akamai) are rolling out support for Client Hints, we're exploring some of the edge cases around the spec and current Chromium implementation to make sure our infrastructure behaves as desired.

One edge-case(?) we've been looking at is related to the behavior of sending a "blank" Accept-CH: header. In practice, sending a blank Accept-CH: should be used for the server to indicate to the client that it should stop sending any (non-default) Client Hints. A blank Accept-CH: acts as a "reset".

(Though I can't find any documentation explicitly stating that behavior -- maybe it should be mentioned in this repo?)

However, we think there are some scenarios in which this behavior may not work as desired when multiple parties are involved (e.g. a CDN and their customer), so we wanted to discuss and consider some alternatives.

Background

Being a CDN for our customers means we're going to potentially have Client Hint Accept-CH requests being generated, or passing through us, by at least 3 sources:

The Akamai CDN edge servers could then inspect headers from those three sources, and merge them into one coherent Accept-CH list for the browser. Or, it could do nothing, and the browser would see 3+ Accept-CH: lines.

For most scenarios this is fine -- the CDN or browser would take the union of all of the requests.

The edge case that we may encounter is what we should do when one of those sources (e.g. Customer Origin) requests a Client Hint "reset" (blank Accept-CH:), while others don't. The blank header would indicate that the origin (which possibly doesn't know about the other Accept-CH being generated at the edge) wants all hints to be reset.

Example

Let's pretend our Customer, for some reason, wants to send a blank Accept-CH:, e.g. for privacy, or debugging, or to "reset" all visitors. They do this at Origin, because that's easy for them to control, or even because they are multi-CDN.

Customer Origin sends this (requesting a reset of CHs):

Accept-CH:

Akamai CDN also adds these (requesting CHs for mPulse):

Accept-CH: sec-ch-ua-platform-version
Permissions-Policy: ch-ua-platform-version=("*" self)

If the Akamai CDN does nothing, it would blindly include all headers to the browser, and this would result in:

Accept-CH:
Accept-CH: sec-ch-ua-platform-version
Permissions-Policy: ch-ua-platform-version=("*" self)

Per the current Chrome behavior, and HTTP header processing RFCs, the multiple Accept-CH lines are merged (including the blank one). Internally it would look something like this:

Accept-CH: sec-ch-ua-platform-version,
Permissions-Policy: ch-ua-platform-version=("*" self)
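To make the merge behavior above concrete, here is a minimal Python sketch of how a recipient might combine repeated Accept-CH lines and then validate the result. The function names and the simplified token pattern are assumptions for illustration only; this is not Chromium's code and not a full RFC 8941 parser.

```python
import re

# Simplified pattern for a hint name (assumption; real sf-tokens allow more characters).
HINT_TOKEN = re.compile(r"[A-Za-z*][A-Za-z0-9_.:/*-]*")


def combine_field_lines(lines):
    """Join repeated header lines into one value, comma-separated, in order."""
    return ", ".join(line.strip() for line in lines)


def parse_accept_ch(lines):
    """Return the list of hints, [] for a lone blank header, or None if invalid."""
    combined = combine_field_lines(lines).strip()
    if combined == "":
        return []  # only blank line(s): an empty list, i.e. the "reset" case
    members = [member.strip() for member in combined.split(",")]
    # An empty member (from a blank line merged with a non-blank one) is invalid.
    if any(HINT_TOKEN.fullmatch(member) is None for member in members):
        return None
    return [member.lower() for member in members]
```

With this sketch, a blank line on its own parses as an empty list, while a blank line combined with `sec-ch-ua-platform-version` leaves an empty list member and the whole value is rejected, which matches the "invalid, so no-op" outcome described in the Effects section.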

Similar scenarios might include Accept-CH: (blank) being configured by the Customer at the Edge, when the Origin is already sending them. Maybe it's two teams from the Customer not being aware of what the other is doing? Either way, the result is "conflicting" directions, because the Accept-CH header is logically being used for both a "set" and "reset" command.

Effects

For the above scenario, in current Chrome, combining the headers (per the HTTP spec) results in an "invalid" Accept-CH and so a no-op (returning visitors' Accept-CH cache will remain the same as before, and new visitors will not get any Accept-CH cached).

I would argue this is neither what the Customer Origin wants (reset all CHs) nor what the Akamai CDN wants (add Platform-Version).

Proposals

It seems like when there are potentially multiple parties involved trying to send hints, a blank Accept-CH: should be treated specially, maybe even trump other Accept-CH requests? Since "set" and "reset" are using the same header name, and header-merge behavior causes the blank Accept-CH "reset" request to be invalid when combined with the other same-named-headers, some undesirable edge scenarios like the above might happen.

We could, of course, deal with this at the edge/CDN. If we're generating+merging CHs from multiple places, we could just detect if any one of them is empty, and decide to treat it as the highest- (or lowest-?) priority, e.g. send only Accept-CH: (blank) (highest priority) and remove all other Accept-CH lines. But then you're putting that logic decision into the hands of the CDN, and every CDN would need to know to deal with this, and prioritize (or not) clearing, and try to be consistent with each other.
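As a rough illustration of that edge-side approach, the Python sketch below merges Accept-CH values from several sources into a single header value and makes the priority of a blank value an explicit choice. The function name `merge_accept_ch` and the `reset_wins` flag are hypothetical, not part of any CDN product or spec.

```python
def merge_accept_ch(sources, reset_wins=True):
    """Merge per-source Accept-CH values into one value for the browser.

    sources: one string per source, or None if that source sent no header.
    Returns the value to send, "" for a blank (reset) header, or None to
    send no Accept-CH header at all.
    """
    present = [value.strip() for value in sources if value is not None]
    if not present:
        return None
    # Highest-priority policy: any blank value clears everything else.
    if reset_wins and any(value == "" for value in present):
        return ""
    # Lowest-priority policy (or no blank value): union of hints, order kept.
    hints = []
    for value in present:
        for hint in value.split(","):
            hint = hint.strip().lower()
            if hint and hint not in hints:
                hints.append(hint)
    return ", ".join(hints)
```

For the example above, `merge_accept_ch(["", "sec-ch-ua-platform-version"])` yields a single blank header, while passing `reset_wins=False` yields `sec-ch-ua-platform-version`. Either way the browser sees one well-formed list, but which policy is right is exactly the decision each CDN would have to make on its own.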

Alternatively, the "reset" or "clearing" of Client Hints could be moved to another explicit header, like Accept-CH-Clear: true. If both Accept-CH-Clear: true and Accept-CH: ... are in the same response, the spec could indicate that the browser should ignore other Accept-CH headers and only apply Accept-CH-Clear: true. We could also use a special Accept-CH: reset token, or Clear-Site-Data: clientHints.

So to summarize, some options we've brainstormed:

But maybe this is all too edge-casey to matter! I'm really not sure how frequently clearing Client Hints will be used. Let us know your thoughts.

yoavweiss commented 1 year ago

^^ @arichiv @miketaylr

arichiv commented 1 year ago

Accept-CH (like Critical-CH) is an sf-list, and those are only valid to split across multiple lines as long as each line has at least one token: https://www.rfc-editor.org/rfc/rfc8941.html#name-lists

The current behavior, ignoring all Accept-CH headers if you send a blank one and others that aren’t blank, is correct unless we define our own sort of list and merging logic.

This is noted in the spec currently: “There MAY be multiple Accept-CH headers per-response and sf-lists can be split across lines as long as each line contains at least one token.”

arichiv commented 1 year ago

I could see moving to option 3 or 5 potentially, @miketaylr for thoughts.

We technically shouldn’t ever be asking for an empty sf-list anyway, so it would be more compliant to have an independent header and ignore the empty case entirely (or slate it for deprecation).

miketaylr commented 1 year ago

Option 5 makes sense to me semantically, though the Clear-Site-Data spec is in a bit of a sad unmaintained state right now (someone on my team had volunteered to pick that up, but it never happened). I also wonder if other vendors would object since none of them support CH yet...

arichiv commented 1 year ago

I could take a look and consider taking it up. Will get back within a week or so.

arichiv commented 1 year ago

Have been thinking this over, and I realized that if the user deletes cookies from the UI we clear client hint data, but (at least as far as I can see) when cookies are deleted via this header that doesn't happen. It seems like we want to consider: (1) adding a clientHints option to Clear-Site-Data (2) updating the cookies option to also clear client hints

@yoavweiss for thoughts

yoavweiss commented 1 year ago

I think that clearing either cookies or cache using Clear-Site-Data should clear the CH cache as well (and tbh, I thought we did that already).

miketaylr commented 1 year ago

I think that clearing either cookies or cache using Clear-Site-Data should clear the CH cache as well

I like this idea.

nicjansma commented 1 year ago

Sounds reasonable to us as well!

Would Accept-CH: (empty) still continue to work (clearing all hints), either as an "official" way of clearing Hints, or as a not-official-but-it-just-has-that-side-effect of it being an empty list?

I would recommend (and can offer a PR, if desired) this repo have a dedicated section in the docs about Clear-Site-Data and the official (and/or not recommended) Accept-CH: way of clearing hints.

arichiv commented 1 year ago

There aren’t plans to deprecate or remove the empty Accept-CH method.

I’m already working on the first part of the PR here: https://github.com/w3c/webappsec-clear-site-data/pull/74

arichiv commented 1 year ago

https://groups.google.com/a/chromium.org/g/blink-dev/c/lJY86eTPQ0s the proposal is under review

arichiv commented 1 year ago

Closing out as this is implemented in Chrome M117 by default.
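For readers who want to try the explicit clearing path discussed in this thread, here is a small Python sketch that builds the response header a server might send. The helper name is hypothetical, and the "clientHints" directive name is taken from the option proposed above; the exact spelling and which other directives also clear hints should be checked against the Clear-Site-Data spec and the linked PR.

```python
def clear_hints_headers(also_clear_cookies=False):
    """Build response headers asking the browser to clear stored Client Hints.

    Assumes the "clientHints" directive name proposed in this thread.
    Clear-Site-Data directives are sent as quoted strings.
    """
    directives = ['"clientHints"']
    if also_clear_cookies:
        directives.append('"cookies"')
    return {"Clear-Site-Data": ", ".join(directives)}
```

Because this uses a separate header, it does not get merged with any Accept-CH lines that an origin or CDN may also be sending.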