WICG / ua-client-hints

Wouldn't it be nice if `User-Agent` was a (set of) client hints?
https://wicg.github.io/ua-client-hints/

GREASE is not well thought and will be circumvented due to necessity #156

Open proton-ab opened 3 years ago

proton-ab commented 3 years ago

GREASE itself seems like a good idea, but I strongly believe it will be swiftly circumvented out of necessity.

Imagine a case where I would like to show the user a list of active sessions, with information on what kind of browser each one is using. I can realistically show Chrome 86 / Chromium 86, but I cannot show the user '"Not\A;Brand / Chrome 86 / Chromium 86. This means that one of these two cases must take place:

Given these two scenarios, it seems obvious to me that GREASE does not achieve what it sets out to do.

amtunlimited commented 3 years ago

The purpose of GREASE is sort of two-fold. The first is to encourage clients to use a fully fleshed-out parser for the header instead of the half-baked, fragile parsers you see for the User-Agent string now. This doesn't seem to be a problem in your scenarios, but it is worth pointing out why the weird characters exist in the fake brand.
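To illustrate the parser point, here is a minimal Python sketch comparing a naive split-based parser with one that honours quoting and backslash escapes. It is not a complete Structured Header parser: it only handles a list of quoted strings with quoted-string parameters, and the sample header value and the names `parse_brands` and `naive_brands` are assumptions made up for this example.

```python
from typing import List, Optional, Tuple


def _read_string(text: str, i: int) -> Tuple[str, int]:
    """Read one double-quoted string starting at text[i]; return (value, next index)."""
    if i >= len(text) or text[i] != '"':
        raise ValueError(f"expected a double quote at offset {i}")
    i += 1
    out = []
    while i < len(text) and text[i] != '"':
        if text[i] == "\\":
            # Only \" and \\ are accepted as escapes in this sketch.
            i += 1
            if i >= len(text) or text[i] not in '"\\':
                raise ValueError(f"invalid escape at offset {i}")
        out.append(text[i])
        i += 1
    if i >= len(text):
        raise ValueError("unterminated string")
    return "".join(out), i + 1


def parse_brands(header: str) -> List[Tuple[str, Optional[str]]]:
    """Return (brand, version) pairs; version is the value of the `v` parameter."""
    brands = []
    i = 0
    while i < len(header):
        if header[i] in " \t,":  # skip whitespace and list separators
            i += 1
            continue
        brand, i = _read_string(header, i)
        version = None
        while i < len(header) and header[i] == ";":
            eq = header.find("=", i)
            if eq == -1:
                raise ValueError("parameter without a value")
            key = header[i + 1:eq].strip()
            value, i = _read_string(header, eq + 1)
            if key == "v":
                version = value
        brands.append((brand, version))
    return brands


def naive_brands(header: str) -> List[str]:
    """Fragile approach: split on ',' and ';' without looking at quoting."""
    return [part.split(";")[0].strip().strip('"') for part in header.split(",")]


# Assumed sample value; the middle entry is a GREASE-style brand.
SAMPLE = r'"Chromium";v="86", "\"Not\\A;Brand";v="99", "Google Chrome";v="86"'

print(parse_brands(SAMPLE))
print(naive_brands(SAMPLE))
```

The naive version cuts the fake brand at its embedded semicolon and leaves the escape characters in place, which is exactly the kind of breakage the odd characters are meant to surface early.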

The second is to discourage whitelisting (blocking or downgrading "unknown" user agent brands), which it goes about in two ways:

  1. Giving a common project name along with the most specific name (e.g. Chromium w/ Edge, Brave, Vivaldi, etc.) to let websites be as generic as possible (instead of knowing all of the Chromium derivatives, a site can just look for "Chromium" and roll from there). This does require website owners to be a little bit more understanding, but I think the trade-off of more traffic available is worth it.

  2. It's not explicitly stated that those characters have to be there, that's just the example implementation. There are other strategies, such as adding other established brands or making up a believable looking brand.

It's number two that we're hoping will stop the circumvention you're referring to.
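As a rough sketch of the "just look for Chromium" approach from point 1, the Python below scans already-parsed (brand, version) pairs for a generic engine name and ignores every entry it does not recognise. The function name `engine_version` and the sample brand lists are assumptions for illustration only.

```python
from typing import Iterable, Optional, Tuple


def engine_version(brands: Iterable[Tuple[str, str]], engine: str = "Chromium") -> Optional[str]:
    """Return the version advertised for `engine`, or None if it is absent.

    Unknown entries (derivative brands, GREASE values) are skipped rather than rejected.
    """
    for brand, version in brands:
        if brand == engine:
            return version
    return None


# Hypothetical brand lists as they might look after parsing the header.
edge_like = [("Chromium", "86"), ('"Not\\A;Brand', "99"), ("Microsoft Edge", "86")]
unknown_derivative = [("aCuteBrowser", "1"), ("Chromium", "86")]
not_chromium = [("Firefox", "82")]

print(engine_version(edge_like))
print(engine_version(unknown_derivative))
print(engine_version(not_chromium))
```

The design choice here is that the check never needs a list of derivative browsers; a new Chromium-based brand passes without any change on the server side.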

It's also the hope that these behaviours (ossifying lists of brands and databases mapping header values) will go away because they'll be unnecessary. In your analytics scenario, most analytics platforms either have an "other" already or simply drop unknown UA strings, which is bad for small browsers. With the proposed solution you can say

And I think customers will be able to tell what's up

proton-ab commented 3 years ago

I did not present an analytics scenario; I presented a scenario in which a website (for example Facebook) shows the user a list of their logged-in sessions so they can log them out: [screenshot: chrome_2020-11-04_22-03-28]

And unless such a site implements one of the scenarios I proposed, it's going to look like this: [screenshot: chrome_2020-11-04_22-04-00]

I think this is going to hit smaller sites. It's obvious that sites like Facebook will just implement a whitelist of recognized brand names and keep it updated, but smaller sites do this too, and they may not have the manpower to update it continuously.
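A minimal Python sketch of the whitelist approach for a session list might look like the following. The `KNOWN_BRANDS` set and the `session_label` helper are hypothetical; the point is only to show how a hand-maintained list filters out the GREASE entry and, as a side effect, any brand the site has not yet added.

```python
from typing import Iterable, Tuple

# Hypothetical allowlist; a real site would have to keep this up to date by hand.
KNOWN_BRANDS = {"Google Chrome", "Microsoft Edge", "Chromium"}


def session_label(brands: Iterable[Tuple[str, str]]) -> str:
    """Build a display label from recognised brands only, in the order received."""
    known = [f"{brand} {version}" for brand, version in brands if brand in KNOWN_BRANDS]
    return " / ".join(known) if known else "Unknown browser"


chrome = [("Chromium", "86"), ('"Not\\A;Brand', "99"), ("Google Chrome", "86")]
small_browser = [("Chromium", "86"), ('"Not\\A;Brand', "99"), ("aCuteBrowser", "1")]

print(session_label(chrome))
print(session_label(small_browser))
```

With this list, the GREASE brand never reaches the user, but the smaller browser is shown only as its Chromium base until someone updates `KNOWN_BRANDS`.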


Secondly, I fail to see how GREASE can stop the browser-blocking behavior you describe. Realistically, I can still look for `/chrome?(?:ium)?/i` and serve worse or broken content to browsers that don't match, or show a message encouraging the user to switch to Edge if `/edge/i` is not found. So, please explain: how does this solve anything?
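For concreteness, here is a small Python sketch of that kind of server-side check, using the two patterns quoted above translated to Python's `re` module. The `classify` function and its return labels are made up for this example; the input is assumed to be a list of already-extracted brand names.

```python
import re
from typing import Iterable

# The two patterns from the comment, with the /i flag expressed as re.IGNORECASE.
CHROME_LIKE = re.compile(r"chrome?(?:ium)?", re.IGNORECASE)
EDGE = re.compile(r"edge", re.IGNORECASE)


def classify(brand_names: Iterable[str]) -> str:
    """Return a hypothetical serving decision based only on brand-name matching."""
    names = list(brand_names)
    if not any(CHROME_LIKE.search(name) for name in names):
        return "degraded"  # serve worse or broken content
    if not any(EDGE.search(name) for name in names):
        return "nag"  # show a "switch to Edge" message
    return "full"


print(classify(["Chromium", '"Not\\A;Brand', "Google Chrome"]))
print(classify(["Firefox"]))
print(classify(["Chromium", "Microsoft Edge"]))
```

The extra GREASE entry has no effect on any of these decisions, since the check only asks whether some brand matches.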

> There are other strategies, such as adding other established brands or making up a believable looking brand.

How is that different from `Mozilla, (KHTML, like Gecko), Chrome, Safari`? Browsers will still be forced to lie about their brand by saying Chrome / aCuteBrowser / Chromium / Firefox / '"Not\A;Brand.

Also, if a browser advertises itself as Firefox 82 / Chromium 86 / Chrome 86 / '"Not\A;Brand, how are we supposed to know which browser is real (and why)? And if we can't, then what is the purpose of even including this header? Realistically, even on manual inspection it's impossible to say which browser it is, unless you also require the full version header and compare it to some mapping list, at which point GREASE is again defeated. And while a browser can technically refuse to send the full version header to a website, the website might in turn assume bad intentions and serve broken content or outright refuse to serve any content (analogous to anti-adblock).

proton-ab commented 3 years ago

> It's not explicitly stated that those characters have to be there

But it is, right here:

> When adding arbitrary values to brands, user agents MUST make sure that receivers of the header adhere to Structured Header parsing, by adding escaped double-quotes, commas and semi-colons to those values. The purpose of this is to make non-compliant server implementations immediately aware that their parsing code is inadequate.

quasicomputational commented 3 years ago

FWIW, on the analytics thing, I opened an issue with related thinking (#115) and now a PR (#197).