WICG / ua-client-hints

Wouldn't it be nice if `User-Agent` was a (set of) client hints?
https://wicg.github.io/ua-client-hints/

Overly prescriptive request header fields #284

Open ronancremin opened 2 years ago

ronancremin commented 2 years ago

As originally envisaged in HTTP 1.0, HTTP 1.1 etc., the User-Agent header "consists of one or more product identifiers, each followed by zero or more comments which together identify the user agent software and its significant subproducts."

This open-ended mechanism allowed user agents to decide how much detail to reveal about subproducts. The example cited in the RFCs illustrates this usage, where the user agent chooses to identify itself, its version and details of a constituent library:

User-Agent: CERN-LineMode/2.15 libwww/2.17b3
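
To make the open-ended structure concrete, here is a minimal sketch of splitting a User-Agent value into products and comments. The function name `parse_user_agent` is hypothetical, and the pattern is a simplification of the RFC 7231 grammar that assumes no nested comments or quoted pairs.

```python
import re

# Simplified take on the RFC 7231 grammar: product = token ["/" version],
# comment = "(" ... ")". Nested comments and quoted-pairs are not handled.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_PART = re.compile(
    rf"\((?P<comment>[^()]*)\)|(?P<name>{_TOKEN})(?:/(?P<version>{_TOKEN}))?"
)


def parse_user_agent(value):
    """Return a list of (name, version, comments) tuples, one per product."""
    products = []
    for match in _PART.finditer(value):
        if match.group("comment") is not None:
            if products:  # a comment attaches to the preceding product
                products[-1][2].append(match.group("comment").strip())
        else:
            products.append((match.group("name"), match.group("version"), []))
    return products


print(parse_user_agent("CERN-LineMode/2.15 libwww/2.17b3"))
```

Each product keeps its own name and version, so a constituent library such as libwww sits alongside the main product without needing a predefined field.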

The UA-CH proposal, by comparison, is tightly prescriptive about what user agents can choose to reveal about themselves. The Sec-CH-UA headers focus on a very narrow set of predefined fields, leaving no natural place to express details such as constituent libraries, components and their versions.

This appears to result in a reduction of usefulness. The User-Agent header is currently used constructively to communicate limitations and capabilities of the user agents. A proposed replacement should not remove functionality that user agents choose to avail of.

miketaylr commented 2 years ago

Hi @ronancremin.

The User-Agent field will still exist, and UAs (be they browsers, bots, scripts) are still free to send it - this spec makes no requirements on what it should look like.

The latest IETF RFC that defines User-Agent (https://datatracker.ietf.org/doc/html/rfc7231#section-5.5.3) has some useful SHOULD-level guidance on what to send and what not to send.

> The User-Agent header is currently used constructively to communicate limitations and capabilities of the user agents. A proposed replacement should not remove functionality that user agents choose to avail of.

Can you give an example of what you're describing here?

ronancremin commented 2 years ago

In descriptions of UA-CH it is generally pitched as a partial or full replacement for the current User-Agent header e.g.

"User agents SHOULD deprecate usage of the User-Agent header by reducing its information granularity or removing the header entirely, in favor of the Client Hints model described in this document. " - from https://wicg.github.io/ua-client-hints/#user-agent

"Eventually, the information in the User-Agent string will be reduced so it maintains the legacy format while only providing the same high-level browser and significant version information as per the default hints." - from https://web.dev/user-agent-client-hints/

But much current active usage of the User-Agent header includes granular information about libraries, subcomponents and so on (as specified by RFC7231) that doesn't fit into the list of headers defined by the UA-CH proposal:

If UA-CH really is to become the mechanism for controlling granular details exposed by user agents, it should better accommodate current use cases. If not, we will end up with a situation where both the standard RFC7231 User-Agent headers and the proposed UA-CH headers are in use in perpetuity, resulting in a significant increase in complexity for developers and publishers.

The intended scoping of this proposal is possibly germane. https://github.com/WICG/ua-client-hints/issues/219

miketaylr commented 2 years ago

Hi @ronancremin, thanks for the examples. AFAICT, all of these could be covered by the existing client hints, were these UAs interested in sending them. It seems like a lot of things could just end up in the brand list, or the full-version list.

Here's an example,

Whatever this is: QPExoPlayer/3.0.2.03152-4485c52179 (Linux;Android 5.1.1; AFTB) ExoPlayerLib/2.9.4, could be configured to send the following hints, if they thought it made sense (in addition to their User-Agent header).

Sec-CH-UA: "QPExoPlayer"; v="3", "ExoPlayerLib"; v="2"
Sec-CH-UA-Platform: "Android"
Sec-CH-UA-Platform-Version: "5.1.1"
Sec-CH-UA-Model: "AFTB"
Sec-CH-UA-Full-Version-List: "QPExoPlayer"; v="3.0.2.03152-4485c52179", "ExoPlayerLib"; v="2.9.4"

ronancremin commented 2 years ago

In my mind the spec's usage of the brand list was to allow a user agent to include several brands for itself e.g. "Google Chrome" and "Chromium" or "Microsoft Edge" and "Chromium" as currently seen in active use.

Using a brand list to communicate constituent components or libraries feels awkward, or a dissonantly named header at the very least. I don't think anyone looking at the ExoPlayerLib/2.9.4, libhttp/1.000 or libwww/2.17b3 tokens in a user-agent string thinks "brands" in the normal sense of the word. These are components (or "significant subproducts" to use the term from RFC 7231) more than they are a particular name or identity, and they map awkwardly into a "brand list" IMHO.

miketaylr commented 2 years ago

Agree - I wouldn't consider "libhttp" to be a brand. We do allow an "equivalence class" to be in the brand list, however. I don't see much of a difference between "libhttp/1.000" or "libwww/2.17b3" and "Gecko 100" or "Chromium 100".

ronancremin commented 2 years ago

The spec defines equivalence class as representing:

" …a group of browsers believed to be compatibile (sic) with each other. A shared rendering engine may form an equivalence class, for example."

The intent of this seems to be a very specific and narrow use case around compatibility and not at all semantically similar to communicating significant subproducts.

If the brands list idea was named differently e.g. "significant subproducts" and described more generally I think it would all make a lot more sense, and more fully incorporate usage patterns suggested in the original RFCs as well as those seen in the wild today.