mrajah1 commented 1 year ago

User-Agent strings currently contain Mobile to indicate a handheld mobile device. As we consider form factors like TVs, VR headsets, automotive and more, it would be very useful to know which form factor the customer is browsing a website on, so webapp owners can fine-tune the experience.

E.g., a VR experience might want to make things friendlier for hand gestures, while an automotive one might want larger touch targets.

Could we consider a new Client Hint for this purpose?

miketaylr commented 1 year ago

Thanks @mrajah1 - I think it's a proposal worth considering. I'd like to gather up some data on existing patterns and report back in the next few days - I know that various browsers have shipped some form of TV, VR and Tablet tokens at some point in different UAs.

miketaylr commented 1 year ago

miketaylr commented 2 weeks ago

I'd like to... report back in the next few days

😅

OK, anyways, here's some non-exhaustive data to motivate a use case:

Sources:

Cars/ Automobiles

Example token values: “QtCarBrowser”

Example UA strings:

Mozilla/5.0 (X11; GNU/Linux) AppleWebKit/601.1 (KHTML, like Gecko) Tesla QtCarBrowser Safari/601.1

It seems like people also just look at the model name, probably because there are so few browser-enabled cars right now (Tesla being a notable exception) - but that's not super scalable.

Television

Example token values: “tv”, “TV”, “SMART-TV”, “SmartTV”, “Smart TV”, “Smart_TV”, “HbbTV”

Non-exhaustive brand-specific token values that contain TV: “GoogleTV”, “webOSTV”

Example UA strings:

Mozilla/5.0 (SMART-TV; Linux; Tizen 2.4.0) AppleWebkit/538.1 (KHTML, like Gecko) SamsungBrowser/1.1 tv Safari/538.1
Mozilla/5.0 (SMART-TV; X11; Linux armv7l) AppleWebkit/537.42 (KHTML, like Gecko) Chromium/25.0.1349.2 Chrome/25.0.1349.2 Safari/537.42
Mozilla/5.0 (SMART-TV; LINUX; Tizen 4.0) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 TV Safari/537.36
Mozilla/5.0 (Web0S; Linux/SmartTV) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.34 Safari/537.36 DMOST/1.0.1 (; LGE; webOSTV; WEBOS4.1.0 04.10.45; W4_lm18a;)
Mozilla/5.0 (SMART-TV; Linux; Smart TV) AppleWebKit/537.36 (KHTML, like Gecko) Thano/3.0 Chrome/98.0.4758.102 TV Safari/537.36
Mozilla/5.0 (Linux; GoogleTV 3.2; NSZ-GS7/GX70 Build/MASTER) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.77 Safari/534.24
Mozilla/5.0 (Web0S; Linux/SmartTV) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.34 Safari/537.36 DMOST/1.0.1 (; LGE; webOSTV; WEBOS4.6.0 03.60.02; W4_m3r;)
Mozilla/5.0 (Linux; Android 7.0; Smart_TV Build/NRD90M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/51.0.2704.91 Safari/537.36

VR

Example token values: “VR”, “MobileVR”, “Mobile VR”

Example UA strings:

Mozilla/5.0 (Linux; Android 10; Quest 2) AppleWebKit/537.36 (KHTML, like Gecko) OculusBrowser/13.0.0.2.16.259832224 SamsungBrowser/4.0 Chrome/87.0.4280.66 VR Safari/537.36
Mozilla/5.0 (Linux; Android 9; SM-N960F) AppleWebKit/537.36 (KHTML, like Gecko) OculusBrowser/6.2.11.181027543 SamsungBrowser/4.0 Chrome/74.0.3729.182 Mobile VR Safari/537.36
Mozilla/5.0 (Mobile VR; rv: 63.0) Gecko/63.0 Firefox/63.0

Tablet

Example token values: “Tablet”

Note: Firefox for Android (from 11 to 68) had a “Tablet” token (but then removed it)

Mozilla/5.0 (Android 4.4; Tablet; rv:41.0) Gecko/41.0 Firefox/41.0
Mozilla/5.0 (Tablet; rv:41.0) Gecko/41.0 Firefox/41.0
Opera/9.80 (Windows NT 6.1; Opera Tablet/15165; U; en) Presto/2.8.149 Version/11.1
Evidence of a “RIM Tablet OS” token.

Not a lot of Tablets follow this pattern these days.

What else? Smart watches? Smart speakers? Not sure.
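As a rough illustration of the sniffing that sites do today with the tokens listed above, here is a minimal sketch. The patterns are copied from the examples in this comment; the category names and the `guess_form_factors` helper are hypothetical and not part of any specification.

```python
import re

# Token patterns taken from the example UA strings above. The category
# names on the left are illustrative labels, not standardized values.
FORM_FACTOR_PATTERNS = [
    ("Automotive", re.compile(r"QtCarBrowser")),
    ("VR", re.compile(r"\b(?:Mobile)?VR\b")),
    (
        "TV",
        re.compile(
            r"SMART-TV|SmartTV|Smart TV|Smart_TV|HbbTV|GoogleTV|webOSTV|\btv\b",
            re.IGNORECASE,
        ),
    ),
    ("Tablet", re.compile(r"\bTablet\b")),
]


def guess_form_factors(ua: str) -> list[str]:
    """Return every category whose token pattern appears in the UA string."""
    return [name for name, pattern in FORM_FACTOR_PATTERNS if pattern.search(ua)]
```

Every new device class needs another pattern here, which is the scalability problem a dedicated hint is meant to avoid.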

miketaylr commented 1 year ago

https://groups.google.com/a/chromium.org/g/blink-dev/c/0Bctfvd-Sg8/m/e9Mq_TrJBgAJ reminded me of "Web XR" browsers, i.e. https://hackmd.io/@XR/xrbrowsers

I don't have any of these Microsoft devices to test UA strings, but it seems like "XR" is sufficiently different than "VR" and would be included as a valid value.

So maybe if we decide to add this client hint, the list of bikesheddable values (to begin with) would be something like "Automobile", "Tablet", "TV", "VR", "XR". And we could add additional values as use cases popped up.

miketaylr commented 1 year ago

Not sure if we actually want "Tablet"... or if we should remove it.

mrajah1 commented 1 year ago

Automobile or Automotive?

miketaylr commented 1 year ago

@mrajah1 which do you think is better? I don't have any strong feelings - it's just a string to me.

mrajah1 commented 1 year ago

prefer Automotive.

woody-li commented 1 year ago

The "mobile" boolean field only covers phones, so I think "Tablet" is needed to detect tablet devices.

Sora2455 commented 1 year ago

Can anyone provide a use case for why you would need to know server-side that the device is a tablet etc? Wouldn't existing data hints like screen size and data saver cover those?

woody-li commented 1 year ago

I think it's more useful on the client side: detect the current device and render the matching view. Take "Tablet": there is no official standard API to detect it, and tablet screen resolutions have become larger than those of some desktop monitors.

Sora2455 commented 1 year ago

Why do you need to know client-side that the current device is a tablet? You know what the screen size is, what the screen resolution is - what more do you want to know?

miketaylr commented 1 year ago

Why do you need to know client-side that the current device is a tablet? You know what the screen size is, what the screen resolution is - what more do you want to know?

Yeah, I have this question as well, I'm not 100% sure why Tablet is as useful as Automotive or TV... and the fact that Firefox and Chrome (and iPad...) don't send it anymore suggests it's not so important. But I'm open to other use cases.

djmitche commented 1 year ago

This list feels like an invitation to bikeshed. Are phablets included? How should my Internet-enabled refrigerator identify itself? What about the browser embedded in my tractor - is that automotive or a tablet?

I think this could be headed off by clearly defining

What each of these strings means as a signal to the application (so maybe "Tablet" means "handheld, touch-oriented device that is larger than a mobile device and not typically carried constantly"?)
What the criteria are for inclusion in this list (to head off that new Internet-enabled air-fryer start-up from trying to get "Air-Fryer" added to the list, and the inevitable "but it's just an oven" debate)

arichiv commented 1 year ago

Maybe we want to go the other direction and embrace the chaos a bit. I think Sec-CH-UA-Form-Factor and Sec-CH-UA-Model should be an sf-list and not sf-string. My guess is that this field will be the best place to cram-down compatibility histories. For example:

To start, many websites want to support iPads so they want browsers to report the model iPad and the form tablet to know to render as such. Some sites have a generic tablet support but others added iPad specific features.

Then, Apple buys Nintendo and introduces the iPadDS with two screens that folds like a clamshell. Now its browser will send a model iPadDS, iPad and form factors tablet, tablet-split. Microsoft introduces a competitor, so many sites support the tablet-split form factor, but iPadDS support is more rare. The iPad and tablet indicators are still sent because support for either is better than nothing.

Finally, Apple introduces iPad3DS with 3D support and an AR mode. The browser now sends a model iPad3DS, iPadDS, iPad as well as the form factors tablet, tablet-split, AR. The rendered site now can pick which display modes it can accommodate best and whether or not to display that AR button.

Is this a little bit of a mess? Yes. If we don't provide a space for a mess will a mess just be made in an unexpected place (maybe via many client hints like Sec-CH-iPad3DS-ARv2)? Also yes.
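To make the list idea concrete, here is a minimal sketch of how a server might read such a header value. It assumes the value is a comma-separated list of quoted strings and deliberately ignores the rest of the structured-field syntax (parameters, inner lists, non-string items). The `parse_string_list` helper is hypothetical, and the values are the fictional ones from the example above.

```python
def parse_string_list(header_value: str) -> list[str]:
    """Parse a list of quoted strings such as '"tablet", "AR"'.

    Simplified on purpose: no parameters, inner lists, or non-string items.
    """
    items = []
    i, n = 0, len(header_value)
    while i < n:
        # Skip whitespace and separators between members.
        if header_value[i] in " \t,":
            i += 1
            continue
        if header_value[i] != '"':
            raise ValueError(f"expected a quoted string at offset {i}")
        i += 1  # consume the opening quote
        chars = []
        while i < n and header_value[i] != '"':
            if header_value[i] == "\\":
                i += 1  # take the escaped character literally
                if i >= n:
                    raise ValueError("dangling escape")
            chars.append(header_value[i])
            i += 1
        if i >= n:
            raise ValueError("unterminated string")
        i += 1  # consume the closing quote
        items.append("".join(chars))
    return items


# Fictional values from the iPad3DS example above.
factors = parse_string_list('"tablet", "tablet-split", "AR"')
show_ar_button = "AR" in factors
```

A site that only knows about tablet still finds a value it recognizes, while one that supports AR can check for that entry explicitly.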

djmitche commented 1 year ago

I agree - that seems the more common approach in the history of the web, and takes the spec out of the feedback loop of webdevs trying to generalize and device manufacturers trying to specialize.

cpeterso commented 1 year ago

Mozilla's Firefox Android team is debating whether Firefox on tablets should send a UA string with Mobile (like it does today), Android without Mobile (like Chrome), or a desktop UA string (like Safari on iPadOS). Firefox used to send Tablet on tablets, but we stopped because neither Chrome nor Safari do. If a site has conditional content for Tablet UA strings, then it would not be tested by Chrome or Safari users and would likely be a bad experience for Firefox tablet users.

Safari on iPadOS sends a Safari desktop UA string (e.g. Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15).

Chrome on tablets sends a UA string that includes the Android token, but not Mobile (e.g. Mozilla/5.0 (Linux; Android 6.0.1; Nexus 10 Build/MOB31T) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36). Sites checking for Mobile or Mob will send a desktop page layout because there was no Mobile, but sites can still check for Android if they want to promote their native app.
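The site-side check described above can be sketched roughly as follows. The function names are hypothetical, and the logic is only the substring test mentioned in this comment, not any particular site's implementation.

```python
def layout_for_ua(ua: str) -> str:
    """Pick a page layout the way a site checking for "Mobile" or "Mob" would."""
    # The check is case-sensitive, so a build ID like "MOB31T" does not match.
    return "mobile" if "Mob" in ua else "desktop"


def should_promote_android_app(ua: str) -> bool:
    """Sites can still look for the Android token to advertise a native app."""
    return "Android" in ua


chrome_tablet = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 10 Build/MOB31T) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36"
)
safari_ipad = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/13.1.1 Safari/605.1.15"
)
```

Both strings get the desktop layout, but only the Chrome one still reveals Android. With this check alone, the iPad string cannot be told apart from a Mac.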

anawhj commented 1 year ago

From a TV manufacturer point of view, the motivation and field naming look good to me. The important point would be the criteria for form factor types, but there seems to be no way to define them clearly.

The CSS WG tried to classify major types of devices (media types) such as tv and handheld, but these are rarely used now. https://www.w3.org/TR/mediaqueries/#media-types

In addition, the term 'automotive' would be appropriate, since the W3C has several automotive specs. https://www.w3.org/TR/viss2-core/

The values TV, VR, mobile and tablet would be acceptable, but I'm not sure about the XR case. XR devices can be diverse: glasses, a mobile device with a camera, or a dedicated device. We can add more values (types) later for influential upcoming devices (robot, watch, fridge, window).

miketaylr commented 1 year ago

@djmitche

I think this could be headed off by clearly defining

What each of these strings means as a signal to the application (so maybe "Tablet" means "handheld, touch-oriented device that is larger than a mobile device and not typically carried constantly"?)

Yeah, I think this is a good idea.

@arichiv

Maybe we want to go the other direction and embrace the chaos a bit. I think Sec-CH-UA-Form-Factor and Sec-CH-UA-Model should be an sf-list and not sf-string.

I think I'm convinced on changing the type to list for form-factor, but the ship has sailed on changing the type for model.

Why don't we keep the existing list, and try to add some language about adding new items (or something) as they become relevant: "Automobile", "Tablet", "TV", "VR", "XR"

Sora2455 commented 1 year ago

Still confused as to when you'd ever need to know that the current browser is an Automotive or a Tablet, and not just a screen of a given size. "VR" or "XR" I could kinda see, as you could eager-load WebXR Device API-based code on those environments, but that's not really much of an advantage, is it?

djmitche commented 1 year ago

I think @mrajah1's first comment addresses a bit of that: from the user's perspective, there's more difference between form-factors than just screen size, relating to how information is presented and how the user interacts with that information (gestures, inaccurate pokes while watching the road, etc.).

Sites currently need to "guess" how to behave based on UA string and other available data, and that guessing causes issues: new devices might not be recognized by important sites, so the device manufacturers "lie" in the UA string; and users have bad experiences when a site mis-characterizes a device and provides e.g., a desktop experience on an auto. Providing a clear signal related to the user's understanding of the device would reduce these issues.

As @arichiv suggested, we'll still need some room for "mess", and I suspect making the header a list, using "SHOULD", and including non-normative descriptions of each value will allow that room.

I think there's a valid question of whether it's necessary to include this information in the HTTP request headers. It can help for situations like eager-loading assets and code related to the form-factor. But probably most uses of this information will be client-side, via NavigatorUAData: getHighEntropyValues().

Speaking of which, this will be high-entropy, as some hint values will always be comparatively rare and thus carry significant information identifying the user.
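For the request-header path, a server first has to opt in to the hint. The sketch below assumes the header name used in this thread (the shipped spelling may differ) and uses plain dictionaries rather than any particular web framework. Both helper names are hypothetical.

```python
from typing import Optional

# Header name as used in this thread; treat the exact spelling as an assumption.
FORM_FACTOR_HINT = "Sec-CH-UA-Form-Factor"


def hint_request_headers() -> dict[str, str]:
    """Response headers asking the browser to send the hint on later requests."""
    return {
        "Accept-CH": FORM_FACTOR_HINT,
        # The response may now differ per form factor, so caches must key on it.
        "Vary": FORM_FACTOR_HINT,
    }


def raw_form_factor(request_headers: dict[str, str]) -> Optional[str]:
    """Return the raw hint value, or None if the browser did not send it."""
    for name, value in request_headers.items():
        if name.lower() == FORM_FACTOR_HINT.lower():
            return value
    return None
```

Because the hint only arrives after the opt-in, the first request from a new client usually carries no value, and a server needs a sensible default for that case.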

nielsbasjes commented 1 year ago

I had a look today at the current draft specification of the Sec-CH-UA-Form-Factor and noticed some things I found confusing and also some I found missing.

It currently says the header should have one of these values "Automotive", "Mobile", "Tablet", "TV", "VR", "XR", "Unknown" or the empty string (which implies "Desktop").

Most of these terms are obvious to me, except for the "VR" and "XR". When looking for some understanding on the difference I found this page which says

Extended Reality includes all its descriptive forms like the Augmented Reality (AR), Virtual Reality (VR), Mixed Reality (MR). In other words, XR can be defined as an umbrella, which brings all three Reality (AR, VR, MR) together under one term, leading to less public confusion.

If I understand this correctly then the builder of a browser running on a VR system would have 2 equally valid values to choose from ... which will lead to confusion. With my current understanding I would remove the 'VR' one.

In line with earlier remarks; What is this header intended for?

The current list of values is nudging towards what I call in my own software the DeviceClass. See https://yauaa.basjes.nl/expect/fieldvalues/#deviceclass

At this point I tend towards knowing what technical variant of the website should be provided to the visitor. In my experience that is about things like (similar to what @djmitche said):

Screensize (also resolution and aspect ratio): No screen, Watch, Phone, Tablet, Desktop, TV
Type of interaction: Mouse/Keyboard, Touch, Game Controller, Remote control (i.e. TVs), Voice, Gesture/Motion, ...
Type of usage: From "highly interactive" (game website) to "Pushing a button once an hour" (Netflix).
How "mobile" the device is: Fixed (TV hanging on the wall), Moving as part of a bigger thing (Screen "motionless" in a Car), Moving (Phone, Tablet).

So either the single "Form factor" is really something like the 4 attributes I just mentioned, or it should be simplified (i.e. fewer values), or the list of values should be a lot longer to describe all variations ... adding Watches, Home appliances, Smart Displays (Google Nest and such), Fixed Gaming Consoles (PS5, XBox), Mobile gaming consoles (Nintendo Switch), Headless systems (GoogleBot, Mastodon server-to-server), eReaders (very slow screens, no animations), etc.
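One way to picture the four-attribute alternative is as a small record type. This is only a sketch: the enum members paraphrase the bullet list above, and the type names and the `needs_large_touch_targets` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Screen(Enum):
    NONE = auto()
    WATCH = auto()
    PHONE = auto()
    TABLET = auto()
    DESKTOP = auto()
    TV = auto()


class Interaction(Enum):
    MOUSE_KEYBOARD = auto()
    TOUCH = auto()
    GAME_CONTROLLER = auto()
    REMOTE_CONTROL = auto()
    VOICE = auto()
    GESTURE = auto()


class Usage(Enum):
    HIGHLY_INTERACTIVE = auto()
    OCCASIONAL = auto()


class Mobility(Enum):
    FIXED = auto()
    PART_OF_VEHICLE = auto()
    MOVING = auto()


@dataclass(frozen=True)
class DeviceProfile:
    screen: Screen
    interactions: frozenset[Interaction]
    usage: Usage
    mobility: Mobility


def needs_large_touch_targets(profile: DeviceProfile) -> bool:
    """Touch input on a screen mounted in a vehicle suggests bigger targets."""
    return (
        Interaction.TOUCH in profile.interactions
        and profile.mobility is Mobility.PART_OF_VEHICLE
    )


car = DeviceProfile(
    Screen.TABLET, frozenset({Interaction.TOUCH}), Usage.OCCASIONAL, Mobility.PART_OF_VEHICLE
)
wall_tv = DeviceProfile(
    Screen.TV, frozenset({Interaction.REMOTE_CONTROL}), Usage.OCCASIONAL, Mobility.FIXED
)
```

A single form-factor string collapses all four fields into one label, which is the trade-off being questioned here.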

I'm really curious to hear what the original intended meaning is and how you view my points.

djmitche commented 1 year ago

#343 tries to address some of these points - I'd love your feedback!

patrickhlauke commented 1 year ago

Coming in late on this, but... did we (as a community of practice) not decide ages ago that we'd want to feature-detect based on more specific characteristics, rather than trying to lump things into very broad, ill-defined, and often overlapping "buckets"? Is this going to be a re-run of the various media types (@media tv, @media handheld, etc)?

djmitche commented 1 year ago

Hah, your timing is impeccable..

I can't speak to the past as I wasn't part of this community group at the time, but that is concerning. What were the issues with those media types?

In this case, we've allowed for "overlapping" by using a list. And perhaps leaning heavily into the "form-factor that users interact with in a meaningfully different way" will help them be better-defined. I think "broad" is a feature.

drwez commented 1 year ago

@djmitche As per @patrickhlauke 's comment, the "form-factor" model (e.g. Desktop vs Mobile) suffers from the bundling of various properties under a single term, that are actually orthogonal.

For example, before touch-screens became common on laptops, Mobile was interpreted as meaning "has touch input", while Desktop meant the device did not. But Mobile was also interpreted to mean "low bandwidth connectivity", which might also apply to Desktop, if the Desktop is actually a laptop. Mobile was also assumed to mean "has a small display", whereas Desktop form-factors would have big displays, and then tablets came along, and so on :)

jonesiscoding commented 3 months ago

Rather than form-factor, would hinting the pointer fulfill this use case, when used along with other hints (width, in particular)?

If this criterion were hinted, it could be used in server-side decisions. As this already exists in CSS, the logic could follow the CSS pointer: fine/coarse and any-pointer: fine/coarse media features.

From that logic (again, in combination with other hints such as Width) it would be relatively easy to determine a general form factor - tablet, phone, hybrid (tablet w/ mouse - like the Surface or iPad with Magic Keyboard), computer, TV, etc.
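The derivation suggested above might look roughly like this. It is a sketch under stated assumptions: the 600 px phone threshold, the mapping of pointer "none" to a TV, and the `guess_device` helper are all hypothetical, and no pointer hint exists today.

```python
# Hypothetical cut-off between phone and tablet widths, in CSS pixels.
PHONE_MAX_WIDTH = 600


def guess_device(pointer: str, any_pointer_fine: bool, width_px: int) -> str:
    """Guess a general form factor from pointer accuracy and viewport width.

    `pointer` mirrors the CSS `pointer` media feature: "none", "coarse" or "fine".
    `any_pointer_fine` mirrors `any-pointer: fine`.
    """
    if pointer not in ("none", "coarse", "fine"):
        raise ValueError(f"unknown pointer value: {pointer!r}")
    if pointer == "none":
        # Assumption: no pointing device at all suggests a remote-driven TV.
        return "tv"
    if pointer == "fine":
        return "computer"
    # Primary pointer is coarse (touch).
    if any_pointer_fine:
        # Touch device with a mouse or trackpad attached.
        return "hybrid"
    return "phone" if width_px < PHONE_MAX_WIDTH else "tablet"
```

The thresholds are where the guesswork moves to: a car display and a tablet can report the same pointer and width, which is the case the form-factor hint is trying to separate.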

djmitche commented 3 months ago

To @drwez's comments from 10 months ago (!): the bundling is intentional, as users perceive their device and user-agent not as a bundle of properties but as a cohesive whole. And browser vendors tend to make the same categories, with dramatically different UIs for different form-factors. So I think browsers wouldn't have to do a lot of detection.

And the ability to list multiple form-factors means that an exciting new form factor Z with the elevator pitch "like X but also Y" can list form-factors X and Y as well as Z. Rewinding history, maybe that means that the "phablet" form-factor would have been described in the hint as ["phone", "tablet", "phablet"] -- so a site unaware of phablets but with a dedicated tablet UI would use that UI on the new device, but a site aware of phablets could further specialize.

I suspect we'll want to get a bit more practical experience here before making further changes. The hint shipped in Chrome 124, but to my knowledge not in other browsers.
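The phablet fallback described above can be sketched as a site walking its own list of UIs in order of preference. The `pick_ui` helper and the "desktop" default are hypothetical. The sketch deliberately does not assume any ordering within the reported list.

```python
def pick_ui(reported: list[str], site_preferences: list[str], default: str = "desktop") -> str:
    """Return the site's most preferred UI that the device says it matches.

    `reported` is the list of form factors sent by the browser;
    `site_preferences` lists the UIs this site has, most specific first.
    """
    reported_set = {value.lower() for value in reported}
    for form_factor in site_preferences:
        if form_factor.lower() in reported_set:
            return form_factor
    return default


phablet = ["phone", "tablet", "phablet"]

# A site unaware of phablets, but with a dedicated tablet UI.
older_site_choice = pick_ui(phablet, ["tablet", "phone"])
# A site that has since added a phablet-specific UI.
newer_site_choice = pick_ui(phablet, ["phablet", "tablet", "phone"])
```

Here the older site falls back to its tablet UI and the newer one picks the phablet UI, without either needing to parse a device model.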

patrickhlauke commented 3 months ago

i admit i still remain unconvinced - would prefer granular hints over broad "buckets", as that's more in the spirit of feature/capability detection.

Sora2455 commented 2 months ago

@patrickhlauke Considering one of the main goals of this feature was to reduce the fingerprinting potential of the user-agent string, "more granularity" is actively counter-productive.

WICG / ua-client-hints

Hints to identify different form factors #333
