User Locale Preferences

romulocintra commented 1 year ago

Request for position on User Locale Preferences

WebKittens who can provide input: @Constellation

Information about the specification

Title: User Locale Preferences
GitHub repository: https://github.com/romulocintra/user-locale-client-hints
Slide Deck About User Locale Preferences use cases

Design reviews and vendor positions

TAG Design Review:
Mozilla standards-positions issue: https://github.com/mozilla/standards-positions/issues/751

Bugs tracking this feature

WebKit Bugzilla:
Radar:

Anything else we need to know

Introduction

"Find a reliable way to access user OS preferences to craft a better user experience, and improve accessibility over the web"

User preferences are often system-wide settings (such as in Android, macOS, or Windows). Operating systems allow the user to specify custom overrides for settings such as:

Hour cycle (24-hour or 12-hour time)
Calendar system
Measurement unit preferences (metric or imperial)
Date/time patterns
Number separators (comma or period)

However, there’s currently no reliable way to access this information from the Web Platform to help craft better user experiences. Allowing web developers to access this information would allow them to improve the accessibility and usability of their websites, and bring the user experience of web applications closer to that of native applications.
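
For illustration only: if a page already knows a user's preferences, the existing `Intl` APIs can honor several of the settings listed above through Unicode locale extension keys (`hc` for hour cycle, `ca` for calendar). The sketch below shows only that consuming side. It does not show how the proposal would deliver the preferences, and the helper names (`formatTime`, `formatNumber`, `calendarOf`) are hypothetical.

```typescript
// A fixed instant so the output does not depend on when the code runs.
const sample = new Date(Date.UTC(2024, 0, 15, 15, 30));

// Hour cycle: the "hc" extension key selects a 12-hour or 24-hour clock.
function formatTime(locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    hour: "2-digit",
    minute: "2-digit",
    timeZone: "UTC",
  }).format(sample);
}

// Number separators follow the locale's conventions.
function formatNumber(locale: string): string {
  return new Intl.NumberFormat(locale).format(1234567.89);
}

// Calendar system: the "ca" extension key selects the calendar.
function calendarOf(locale: string): string {
  return new Intl.DateTimeFormat(locale).resolvedOptions().calendar;
}

console.log(formatTime("en-US-u-hc-h12")); // 12-hour clock with an AM/PM marker
console.log(formatTime("en-US-u-hc-h23")); // "15:30"
console.log(formatNumber("en-US")); // "1,234,567.89"
console.log(formatNumber("de-DE")); // "1.234.567,89"
console.log(calendarOf("en-US-u-ca-buddhist")); // "buddhist"
```

Without access to the user's settings, a page has to guess these tags from the language alone, which is the gap described above.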

Feedback

I welcome feedback in this thread, but encourage you to file bugs against

hober commented 1 year ago

@johnwilander @litherum

hober commented 1 year ago

@hsivonen's comment on mozilla/standards-positions#751 seems relevant, as is his comment on the proposal itself.

litherum commented 1 year ago

tl;dr: we are hesitantly positive on this general direction, provided UAs must deliberately bucket users into large groups to avoid fingerprintability.

We have quite a few thoughts on this topic, some of which are fairly subtle.

Background

We recognize that exposing user preferences is a valuable problem to try to solve, in general. The problem goes beyond mere user preferences; some cases (e.g. writing list markers in the wrong script) actually make the content unintelligible for the user.
The main trade-off here is privacy. For a user with a few user preferences set, that user could be very recognizable to a fingerprinter. And setting these preferences is more common than one might expect.
The state of the art today is that each website that wants to react to user preferences asks the user to enter the relevant preferences directly into the website. For example, a weather website asks the user, in that site’s UI, whether they prefer Fahrenheit or Celsius, whereas a calendar website asks, in that site’s UI, which day of the week they prefer the week to start on. From a privacy perspective, this is actually pretty close to optimal:
1. Because there is an inherent cost to asking the user for some information, websites are already incentivized to ask for as little as possible.
2. By default, users don’t give up any information. All information they give up is opt-in by each user.
3. Each user can expose a different set of information to each website they visit. The user can choose to expose more information about themselves to a more reputable website.
However, there is a lot of content on the web that would casually benefit from these user preferences. For example, consider sites that aren’t a whole web-app dedicated to data processing, but instead just casually want to show a number with the separators that the user understands.
WebKit already has privacy mitigations regarding our use of navigator.language. We don’t naively expose the entire language list from System Settings; instead we have a set of buckets, and we partition every user into one of those buckets. We intentionally pick the number of buckets and the partitioning mechanism to try to create a deliberate balance between fidelity and privacy.

Position

We’d like to try to split the problem of user preferences in internationalization on the web into two pieces:

Making content intelligible (e.g. solving the “We used the wrong script to format numbers in, and the user literally cannot read the result” problem)
Individual personalization (e.g. solving the “this user just happens to prefer using Celsius for some reason” problem)

Conveniently, these two pieces naturally give rise to different solutions.

Intelligibility

For the "making content intelligible" problem, we are generally dealing with large groups of users, but groups for which locale alone is not sufficient to make content intelligible. For this problem, we’re willing to expose the additional information to make the content intelligible.
1. A requirement of this is that we can build on the existing privacy mitigations we already use for navigator.language: We’ll still be partitioning users into large buckets, and each bucket may now represent extra information in addition to just locale.
One fear with exposing user locale preferences is that some UAs may choose to forgo all fingerprinting mitigations and naively expose all user preferences directly to the web unfiltered. If doing so were in accordance with the spec, that would be a dealbreaker.
1. One mitigation to this problem that has had some success in another web standardization group is for the spec to place a literal limit on the number of distinct equivalence classes (“buckets”) that users may be placed in. This requires that UAs deliberately design their data exposure, to perform an intentional fidelity-vs-privacy tradeoff.
  1. After all, even if the UA didn’t intentionally design its buckets, there will be de facto buckets in practice. Adding a requirement for UAs to cap the maximum number of buckets serves to guarantee that the bucketing strategy in the UA is an intentional design, rather than happenstance.
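
To make the bucket cap concrete, here is a minimal sketch under stated assumptions. The cap value `MAX_BUCKETS`, the bucket table, and the `UserPreferences` shape are all hypothetical; nothing here comes from a spec or from WebKit. The point is only that a UA enumerates its buckets up front, checks the count against the cap, and maps every user into one of them, discarding preferences that are not needed for intelligibility.

```typescript
// Everything the OS knows about the user (hypothetical shape).
interface UserPreferences {
  language: string;
  numberingSystem: string;
  hourCycle: "h12" | "h23";
  measurementSystem: "metric" | "imperial";
}

// What the web is allowed to see: locale plus the extra information
// needed to keep content intelligible.
interface Bucket {
  language: string;
  numberingSystem: string;
}

// Hypothetical spec-mandated upper bound on distinct buckets.
const MAX_BUCKETS = 4;

const DEFAULT_BUCKET: Bucket = { language: "en", numberingSystem: "latn" };

// The UA's deliberately designed bucket table.
const BUCKETS: readonly Bucket[] = [
  DEFAULT_BUCKET,
  { language: "ar", numberingSystem: "arab" },
  { language: "ar", numberingSystem: "latn" },
  { language: "hi", numberingSystem: "deva" },
];

if (BUCKETS.length > MAX_BUCKETS) {
  throw new Error(`Bucket table has ${BUCKETS.length} entries; cap is ${MAX_BUCKETS}`);
}

// Map a user to a bucket: prefer a match on language and numbering system,
// then on language alone, then fall back to the default bucket.
function bucketFor(prefs: UserPreferences): Bucket {
  const sameLanguage = BUCKETS.filter((b) => b.language === prefs.language);
  const exact = sameLanguage.filter((b) => b.numberingSystem === prefs.numberingSystem);
  return exact[0] ?? sameLanguage[0] ?? DEFAULT_BUCKET;
}
```

Two users who differ only in hour cycle or measurement system land in the same bucket, so those preferences add nothing to a fingerprint.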

Individual Personalization

For the "individual personalization" problem, we don’t want to give up the near-optimal (from a privacy point of view) current practice of users deciding which information they are willing to give up to each individual website. So we actually don’t want to go beyond the current state of the art here.
If website maintainers feel that having users fill out their preferences in a form on the site is a poor experience, it would be reasonable for the browser to aid websites in collecting the necessary information, possibly via a mechanism like a new form control just for this purpose, or by hooking up support to the browser's form autofill facilities. Neither of these possibilities involves prompts, so these kinds of solutions provide a better user experience and better privacy than a prompt from a JavaScript API.
1. Such a solution would need to involve website maintainers who could participate in the design process, to make sure whatever we design would be sufficient for real-world use.

Thanks, Apple’s WebKit team

litherum commented 1 year ago

I've gotten a bunch of requests for information on how we perform our existing bucketing for navigator.language, so I'll provide some more context here.

The implementation is inside a system framework. This is just because its implementation needs to be shared by CFNetwork (for the Accept-Language: header) and WebKit (for navigator.language). Also, it was more convenient to place it in a system framework because Apple's language experts are more familiar with making changes there. (The reason doesn't have anything to do with secrecy or anything like that.)
Apple's language experts have deliberately crafted a (fairly large, actually) set of languages which may be exposed to the web. Each user will expose a single item from this set. We have intentionally crafted this set of languages to try to balance fidelity of experience with privacy. We want every bucket to be as big as possible, but no bigger.
We already have language selection infrastructure for native apps. Imagine a user running a native app, where the native app is localized in languages X, Y, and Z, and the user's system is set up to be in language W. We already have logic to pick which of X, Y, or Z the app should be shown in to the user. That's very similar to this problem on the web: we have a bunch of possible languages we can expose, and we need to pick one, so we need to pick the best one given whatever the user's languages are in System Preferences. Our bucketing for the web uses the same mechanism.
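
As a rough illustration of that kind of selection (not Apple's actual implementation, which lives in a system framework and is not shown in this thread), the sketch below picks one entry from a fixed set of exposable languages given the user's ordered language list. The set contents, the fallback, and the function names are hypothetical, and the matching rule is a deliberately simple assumption: exact tag first, then same primary language subtag.

```typescript
// Hypothetical set of languages that may be exposed to the web.
const EXPOSABLE_LANGUAGES: readonly string[] = ["en-US", "en-GB", "fr-FR", "de-DE", "ja-JP"];

// Primary language subtag of a BCP 47 tag, e.g. "fr" for "fr-CA".
function primarySubtag(tag: string): string {
  const [language = ""] = tag.toLowerCase().split("-");
  return language;
}

// Walk the user's languages in preference order and return the first
// exposable language that matches exactly or shares the primary subtag.
function pickExposedLanguage(
  userLanguages: readonly string[],
  exposable: readonly string[] = EXPOSABLE_LANGUAGES,
  fallback: string = "en-US",
): string {
  for (const preferred of userLanguages) {
    const exact = exposable.filter((c) => c.toLowerCase() === preferred.toLowerCase());
    if (exact.length > 0) return exact[0] ?? fallback;

    const sameLanguage = exposable.filter((c) => primarySubtag(c) === primarySubtag(preferred));
    if (sameLanguage.length > 0) return sameLanguage[0] ?? fallback;
  }
  return fallback;
}

console.log(pickExposedLanguage(["fr-CA", "en-GB"])); // "fr-FR"
console.log(pickExposedLanguage(["sw-KE", "ja-JP"])); // "ja-JP"
```

Whatever the real matching rules are, the output is always a single member of the fixed set, which is what keeps each bucket large.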

We implemented this stuff years ago. It works.

marcoscaceres commented 9 months ago

There is an updated proposal (which I've not looked at yet) https://github.com/ben-allen/locale-extensions

marcoscaceres commented 9 months ago

Ah, I see #227 was filed.

marcoscaceres commented 9 months ago

I'm closing this one in favor of that one.
