Closed hsivonen closed 3 years ago
Hi @hsivonen,
I thought I would correct a few mistakes in your post, in case it's useful for anyone else.
For reference, these are the Sec-CH-UA- headers that Browserleaks got out of Chrome 88 (non-Sec-CH-UA- Client Hints not included below but sent by Chrome: Viewport-Width, DPR, Device-Memory, RTT, Downlink, ECT). In Chrome, these appear to be enabled on Android and Chrome OS and behind a flag on Linux, Windows, and Mac.
(Note: I'm not sure why you used such an old Chrome version to test this.)
In the current release version (Chrome 91 as of today), UA-CH is enabled by default (since 89, IIRC) on all platforms. Browserleaks isn't a great site to test things with as it sends invalid client hint token names (they're missing the Sec-CH prefix; this was a bug we fixed in M89). Maybe that's why you used 88? (Note: I sent them an email a few months back, but got no response.)
Chrome 91 seems to have reduced the headers only to Sec-CH-UA and Sec-CH-UA-Mobile, however. Still, some of the comments below are based on what Chrome 88 exposed (if flag enabled).
These are the default, low-entropy UA Client Hints (i.e., sent by default for all requests). M93 also adds Sec-CH-UA-Platform. If you were testing this using Browserleaks, the reason you only saw these is that the site has a bug.
Sec-CH-UA isn’t fully GREASEd: It’s the same in all cases instead of the components changing order at random or the ";Not A Brand";v="99" part varying.
This isn't quite true. In the Chromium implementation, it varies between versions. If you only looked at a single version, I can see how you made that mistake though. We probably could improve the Chromium implementation to more closely match the spec, yes: https://wicg.github.io/ua-client-hints/#create-arbitrary-brands-section.
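For anyone following along, here is a rough sketch of what "fully GREASEd" could look like: the member order is shuffled and the placeholder brand varies. This is only an illustration loosely inspired by the spec section linked above. It is not the spec's algorithm and not what Chromium does; the function name, the character pool, the version pool, and seeding by major version are all assumptions.

```python
import random

# Hypothetical pool of characters to mix into the placeholder brand (an assumption).
GREASE_CHARS = [" ", "(", ")", ";", ":", "-", "."]


def greased_brand_list(brands, seed):
    """Return a Sec-CH-UA-style value with shuffled order and a varying placeholder.

    `brands` is a list of (name, major_version) pairs. `seed` keeps the
    output stable for a given browser version while letting it differ
    between versions.
    """
    rng = random.Random(seed)
    a, b = rng.choice(GREASE_CHARS), rng.choice(GREASE_CHARS)
    # Hypothetical placeholder versions; any arbitrary number would do.
    placeholder = (f"{a}Not{b}A Brand", str(rng.choice([8, 24, 99])))
    entries = list(brands) + [placeholder]
    rng.shuffle(entries)
    return ", ".join(f'"{name}";v="{version}"' for name, version in entries)
```

Calling `greased_brand_list([("Chromium", "88"), ("Google Chrome", "88")], seed=88)` always yields the same string for seed 88, while a different seed may reorder the members and change the placeholder.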
Sec-CH-UA tries to capture both engine and “brand”, but Sec-CH-UA-Full-Version has just one place for version.
Yep, Microsoft gave some good feedback on this. I plan to address in https://github.com/WICG/ua-client-hints/issues/196.
The meaning and format of Sec-CH-UA-Platform-Version depends on Sec-CH-UA-Platform.
I'm not sure what this means. Recently, though, Sec-CH-UA-Platform-Version got some improvements in https://github.com/WICG/ua-client-hints/pull/245, which standardizes on a format for the platform version.
Sec-CH-UA-Arch is reported for Chrome OS despite arguably being pretty useless there.
Right, that's what the spec says to do:
"User Agents MUST return the empty string for model if mobileness is false. User Agents MUST return the empty string for model even if mobileness is true, except on platforms where the model is typically exposed."
Sec-CH-UA-Arch does not indicate 32-bit vs. 64-bit.
Correct, that's captured in the Sec-CH-UA-Bitness hint.
I thought I would correct a few mistakes in your post, in case it's useful for anyone else.
Thanks!
Browserleaks isn't a great site to test things with as it sends invalid client hint token names ... Sec-CH-UA-Bitness
Is there a demo site that is up-to-date and shows all Sec-CH-UA-* headers that exist?
https://user-agent-client-hints.glitch.me/ is up to date with the proposal.
For reference, the Apple WebKit team view had previously been that it was similarly "non harmful" (on the basis that we could always lie in the same ways as the User-Agent header currently does; we would be unlikely to expose anything that we currently do not). As such, the only real advantages over User-Agent are that Sec-CH-UA allows for it to be GREASE'd, and the somewhat unlikely-and-very-distant future where we could shorten the User-Agent header we send on every request to reduce bytes over the wire.
where we could shorten the User-Agent header we send on every request to reduce bytes over the wire
That could also be solved by having different User-Agent values for different requests: a better-than-present one by default (better either as lower-entropy or shorter or both) and a traditional one in cases required by compat concerns.
As far as bytes on the wire go, if User Agent Client Hints become the norm rather than the exception, they have a lot of potential for more bytes over the wire even with header compression in newer versions of HTTP. Personally, I'm more concerned about getting stuck with both the old and the new verbosity than about the benefit of having the information be more structured.
Also, as discussed under Sec-CH-UA-Full-Version above, I'm worried that making it too structured makes it harder to deploy the kind of compat workarounds that have served browsers well in the past: e.g. claiming to be Netscape AND Gecko AND Safari AND Chrome, like Chrome does.
AFAICT, User Agent Client Hints conflate structuring the data and putting the data behind an explicit request. The FAQ doesn't cover why defaulting to a (mostly-)frozen UA string and explicitly requesting the traditional UA string was rejected as a solution.
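To make the "explicit request" part concrete: a server opts into the non-default hints by listing them in an Accept-CH response header. Below is a minimal sketch, assuming a hypothetical helper name; the hint list is simply the set of non-default headers discussed in this thread.

```python
# Non-default hints named in this thread; a site has to ask for them explicitly.
HIGH_ENTROPY_HINTS = [
    "Sec-CH-UA-Full-Version",
    "Sec-CH-UA-Platform-Version",
    "Sec-CH-UA-Arch",
    "Sec-CH-UA-Model",
]


def accept_ch_header(hints):
    """Build the (name, value) pair for an Accept-CH response header."""
    return ("Accept-CH", ", ".join(hints))
```

A site that routinely sends `accept_ch_header(HIGH_ENTROPY_HINTS)` gets everything back on later requests, which is the scenario where the explicit request stops being a meaningful barrier.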
The meaning and format of Sec-CH-UA-Platform-Version depends on Sec-CH-UA-Platform.
I'm not sure what this means.
It means that Sec-CH-UA-Platform-Version doesn't make sense on its own. To make sense of its meaning, you have to know what the value of Sec-CH-UA-Platform is. The "format" part referred to the macOS value using underscores instead of periods as the version component separator.
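As a small illustration of that dependency, here is a sketch of what a consumer had to do with the Chrome 88 values recorded in this thread: choose the separator from the platform name before the version can be split. The function name is hypothetical, and the separator rule is inferred only from the values quoted here.

```python
def parse_platform_version(platform, version):
    """Split a Sec-CH-UA-Platform-Version value into integer components.

    The separator is chosen from the platform name, mirroring the Chrome 88
    values quoted in this thread (underscores on "Mac OS X", periods elsewhere).
    Both arguments are the unquoted header values.
    """
    if not version:
        return ()  # e.g. Linux reported an empty string
    separator = "_" if platform == "Mac OS X" else "."
    return tuple(int(part) for part in version.split(separator))
```

For example, `parse_platform_version("Mac OS X", "11_2_1")` and `parse_platform_version("Windows", "10.0")` need different separators, so the version value alone is not enough to interpret it.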
Maybe that's why you used 88?
It was just a matter of delay from time of making the effort to record the header values on multiple devices to the time of posting.
Request for Mozilla Position on an Emerging Web Specification
Current Status
Our current position is non-harmful.
Why change?
Upon inspection, the various features of User Agent Client Hints fall into three categories:
User-Agent header in a way that, realistically, isn't going to go away (example: the Mobile; token),
Moving stuff around (from User-Agent to Sec-CH-UA-*) doesn't really solve much. That is, having to request this information before getting it doesn't help if sites routinely request all of it.
What Chrome Does
For reference, these are the Sec-CH-UA- headers that Browserleaks got out of Chrome 88 (non-Sec-CH-UA- Client Hints not included below but sent by Chrome: Viewport-Width, DPR, Device-Memory, RTT, Downlink, ECT). In Chrome, these appear to be enabled on Android and Chrome OS and behind a flag on Linux, Windows, and Mac.
Chrome 91 seems to have reduced the headers only to Sec-CH-UA and Sec-CH-UA-Mobile, however. Still, some of the comments below are based on what Chrome 88 exposed (if flag enabled).
On x86_64 Linux
Sec-CH-UA: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"
Sec-CH-UA-Full-Version: "88.0.4324.150"
Sec-CH-UA-Platform: "Linux"
Sec-CH-UA-Platform-Version: ""
Sec-CH-UA-Arch: "x86"
Sec-CH-UA-Model: ""
Sec-CH-UA-Mobile: ?0
On x86_64 Windows
Sec-CH-UA: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"
Sec-CH-UA-Full-Version: "88.0.4324.150"
Sec-CH-UA-Platform: "Windows"
Sec-CH-UA-Platform-Version: "10.0"
Sec-CH-UA-Arch: "x86"
Sec-CH-UA-Model: ""
Sec-CH-UA-Mobile: ?0
On aarch64 macOS
Sec-CH-UA: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"
Sec-CH-UA-Full-Version: "88.0.4324.150"
Sec-CH-UA-Platform: "Mac OS X"
Sec-CH-UA-Platform-Version: "11_2_1"
Sec-CH-UA-Arch: "arm"
Sec-CH-UA-Model: ""
Sec-CH-UA-Mobile: ?0
On aarch64 Android
Sec-CH-UA: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"
Sec-CH-UA-Full-Version: "88.0.4324.152"
Sec-CH-UA-Platform: "Android"
Sec-CH-UA-Platform-Version: "10"
Sec-CH-UA-Arch: ""
Sec-CH-UA-Model: "Nokia 9"
Sec-CH-UA-Mobile: ?1
On x86_64 Chrome OS
Sec-CH-UA: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"
Sec-CH-UA-Full-Version: "88.0.4324.153"
Sec-CH-UA-Platform: "Chrome OS"
Sec-CH-UA-Platform-Version: "13597.84.0"
Sec-CH-UA-Arch: "x86"
Sec-CH-UA-Model: ""
Sec-CH-UA-Mobile: ?0
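For readers who want to pick apart the Sec-CH-UA values recorded above, here is a minimal sketch of extracting the (brand, version) pairs. It is a regular-expression shortcut, not a full structured-header parser, and the names are hypothetical.

```python
import re

# Matches one `"brand";v="version"` member of a Sec-CH-UA value.
# Simplification: versions are assumed to contain no quotes or escapes.
BRAND_RE = re.compile(r'"((?:[^"\\]|\\.)*)";v="([^"]*)"')


def parse_sec_ch_ua(value):
    """Return a list of (brand, version) pairs from a Sec-CH-UA header value."""
    return BRAND_RE.findall(value)
```

Applied to the Chrome 88 value recorded above, this returns three pairs: Chromium, Google Chrome, and the ";Not A Brand" placeholder.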
Observations
Use Cases
(Quotes from this README.)
We have a couple of decades of experience showing that this is an anti-pattern compared to browsers making the features detectable and sites detecting the features instead of inferring them from the browser version: if a site assumes that browser A has a feature, browser B has to pretend to be browser A in order to make the site use the feature in browser B as well.
If Web devs want to detect WebP in Safari, the requested change to Safari should be making WebP support itself detectable instead of exposing the OS version.
This one is indeed something for which browsers can’t offer a designed detection surface. Still, this generally needs only the major version of the engine, not the “brand”, minor OS version, or such.
An interesting question is how relevant this actually is with current version uptakes for browsers and sites having resources to pay attention to users who for whatever reason aren’t updating according to the rapid release schedule. At the time sites deploy a workaround, they can’t necessarily know what future browser version won’t have the need for the workaround. Can we guarantee only retrospective use? Do Web developers care enough about retrospective workarounds for evergreen browsers?
This doesn’t require the information to be available for discriminatory decision making at the time of the HTTP request. The use case would still be addressed if the site learned the information after the fact (e.g. by being able to decrypt it only later).
Safari’s ad click attribution feature is precedent for letting sites gather statistics in a deferred way.
It is in the interest of minority browser engines and even “brands” using Chromium to treat different experiences based on browser identity rather than feature detected capability as an anti-feature.
Sec-CH-UA-Mobile as one bit of information does not seem harmful enough to oppose especially when mobile browsers have a “request desktop site” piece of UI available. However, this bit of information is already easy to extract from the old User-Agent string, which is realistically not going away even if frozen.
Facebook Year Class is cited as an example of this usage. It’s unclear what exactly Facebook varies based on this information, how common this practice is beyond Facebook, or whether Facebook does this at present when viewed in Chrome as opposed to doing this in the Android app.
Sec-CH-UA-Model provides a lot of identifying bits on Android and leads to having to use an iPhone in order to be part of a larger anonymity set. This is a reason to oppose Sec-CH-UA-Model. Reporting e.g. device memory as coarse bands is less harmful and addresses the use case. CPU performance is harder to classify, but should be possible to classify coarsely.
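As an illustration of what "coarse bands" could mean, here is a minimal sketch that collapses an exact memory size into one of three labels. The band boundaries and label names are hypothetical and chosen only for the example; nothing here is taken from a spec or from any browser.

```python
def coarse_memory_band(memory_gib):
    """Map an exact RAM size in GiB to a coarse band label."""
    # Hypothetical band boundaries, chosen only for illustration.
    if memory_gib < 2:
        return "low"
    if memory_gib < 6:
        return "medium"
    return "high"
```

Many distinct device models fall into the same band, so a site can still adapt to device class without learning which model it is talking to.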
The framework cited here varies between iOS and Android. There is probably enough fingerprinting surface to detect Windows vs. macOS vs. non-Mac desktop *nix vs. Android vs. iOS anyway, so it’s not worthwhile to hide that. However, non-Linux non-Mac desktop *nix systems will probably benefit both in terms of site compatibility and in terms of privacy by claiming to be “Linux” even if they are e.g. a flavor of BSD.
The notion of a framework varying styling based on OS version seems niche enough not to justify the exposure of the OS version.
Also, you can’t infer a theme from “Linux” regardless of version. Granularity like “Ubuntu” or “Fedora” would be bad for privacy and also it seems implausible that sites would pursue the diminishing returns of trying to match distro themes. (Chrome says “Linux” when running on Ubuntu.) Notably, Android isn’t guaranteed to be themed according to pure Android, either.
Intent support no longer requires the OS version, since sites can ignore very old OS versions. However, if a new feature of this nature is introduced, it should be introduced as feature detectable instead of being inferred from OS version.
Enabling experiments like these carries risk to minority engines and Chromium “brands” with little upside from addressing this use case.
Providing this information as part of login notification is useful if accurate, but there are other interests that go against providing this information, since we can’t limit it to this application. Notably, this already causes confusion with Chromium “brands” that have to claim to be Chrome for compatibility purposes but then the login notifications say Chrome instead of e.g. Edge.
Universal Binaries make knowing “macOS” a sufficient level of detail for Mac. Windows apps typically use an installer anyway. Making a 32-bit x86 installer do the x86 vs. x86_64 vs. aarch64 detection solves this for Windows.
Notably, Sec-CH-UA-Arch sent by Chrome doesn’t distinguish between x86 and x86_64 on Windows and Linux where this is relevant, which makes the header not really useful for the stated use case and just provides fingerprinting bits.
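To show why the Chrome 88 value is not enough for picking a download, here is a sketch of the decision a site would have to make. It assumes the site also receives the Sec-CH-UA-Bitness hint mentioned in the replies; the function name and the returned installer labels are hypothetical.

```python
def download_target(arch, bitness):
    """Pick a hypothetical installer label from unquoted hint values.

    `arch` is the Sec-CH-UA-Arch value and `bitness` the Sec-CH-UA-Bitness
    value, or None when the latter was not sent.
    """
    if arch == "x86":
        if bitness == "64":
            return "x86_64"
        if bitness == "32":
            return "x86"
        return None  # "x86" alone is ambiguous, as observed with Chrome 88
    if arch == "arm" and bitness == "64":
        return "aarch64"
    return None  # anything else is left to the user to choose
```

With only `arch == "x86"` available, the function has to give up, which is the situation described above.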
Apps whose old versions for old systems are kept available for download for old operating systems generally work well enough by listing them all and making the user pick.
As with market share statistics, the visibility of this information could be deferred without harming the use case.
This may look like a good use case on the surface, but from the user perspective what this really means is potentially being denied access because of UA information. Consider this WebKit bug.
This use case is also terrible for minority browsers trying to break into the market or trying to stay on the market. Gnome Web is not big enough for Google to care about it, and Google’s attempts to restrict logins only to the latest major browsers risk shutting Gnome Web out of the most popular services on the Web.
Even if we believed that it’s a bad idea security-wise to use browsers that fork Firefox or Chromium and don’t properly keep taking security-relevant upstream changes, we should still resist accepting this use case as one we’d support, since it is directly against user choice.
This is nice to have, but concerns related to privacy and Web compat risk should take precedence.
It should be clear that we aren’t treating fingerprinting as a legitimate use case even if it has spam filtering and bot detection applications.
Well-behaved bots that intentionally make themselves blockable by Web sites may continue to use the existing User-Agent header to identify themselves. There doesn’t seem to be value in trying to make them do something different now.
Comments on the Fields
Sec-CH-UA
Harmful.
Since sites tend to look at the most popular “brand”, this field has all the potential of developing the same problems as User-Agent presently. If sites start to opt into receiving this field as a matter of routine, we haven’t really improved things.
On the bright side, this field is versatile enough to put something like "Chromium";v="88", "Google Chrome";v="88", "Firefox";v="86", ";Not A Brand";v="99" in there.
In general, this seems to just move the old problem to a new place.
Sec-CH-UA-Full-Version
Harmful.
It seems that more often than not the third and fourth component of a value like "88.0.4324.150" will only serve fingerprinting purposes, and the first component is the realistic level of granularity that sites care about for bug workarounds. Also, for this to be useful for workarounds, the workaround needs to be retrospective: I.e. it stops applying when the version number gets high enough. Is deploying retrospective workarounds worthwhile with evergreen browsers? OTOH, if the workaround is for current and future versions of the browser, what’s this field used for?
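A retrospective workaround of the kind described here could be sketched as follows. Only the first component of the version is consulted; `first_fixed_major` is a hypothetical site-side constant naming the first major version known not to need the workaround.

```python
def needs_workaround(full_version, first_fixed_major):
    """Return True if the browser's major version predates a known fix.

    Only the first component of a value like "88.0.4324.150" is used;
    the remaining components are ignored.
    """
    major = int(full_version.split(".")[0])
    return major < first_fixed_major
```

The check can only be written once the fixed version is known, which is why the workaround has to be retrospective.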
This field leaves no room for fooling sites that only look for Chrome version. That is, there’s no room to show a fake Chrome version and the real Firefox version if the two diverge.
Sec-CH-UA-Platform
Not harmful, but redundant with a part of the old User-Agent string that’s not going away.
Allows for adapting to platform conventions, and the bits of entropy are probably discoverable anyway.
FreeBSD may want to say Linux and iPadOS may want to say Mac OS X, though.
Sec-CH-UA-Platform-Version
Harmful.
The realistic main use cases that are not better served by making OS-version-dependent Web-exposed features feature-detectable seem to be denying access in the form of vulnerability filtering, and fingerprinting.
Sec-CH-UA-Arch
Harmful.
Adds a fingerprinting bit without actually addressing the use case of offering the right downloads due to not distinguishing between x86 and x86_64.
Sec-CH-UA-Model
Particularly harmful.
Lots of fingerprinting bits with dubious user-facing benefit.
Sec-CH-UA-Mobile
Not harmful, but redundant with a part of the old User-Agent string that’s not going away.
Makes one bit of information explicitly accessible, which is better than inferring it from other information.
However, it’s unlikely that we’d remove this information from the User-Agent header from which this is easier to extract than most other information there, so the value of this field as a new header is questionable.
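For comparison, here is a sketch of a server-side check that prefers the explicit hint and falls back to the old string. It assumes headers arrive as a plain dict with exactly these key spellings; the function name is hypothetical and the substring test for the Mobile token is deliberately naive.

```python
def is_mobile(headers):
    """Decide mobileness from request headers given as a dict.

    Prefers Sec-CH-UA-Mobile (a structured boolean, `?1` or `?0`) and falls
    back to looking for the "Mobile" token in the User-Agent string.
    """
    hint = headers.get("Sec-CH-UA-Mobile")
    if hint is not None:
        return hint.strip() == "?1"
    return "Mobile" in headers.get("User-Agent", "")
```

The fallback branch is a one-line check, which illustrates how little the new header adds over what the User-Agent string already exposes.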