tc39 / ecma402

Status, process, and documents for ECMA 402
https://tc39.es/ecma402/

Allow for implementations to retrieve settings from host environment #109

Open zbraniecki opened 8 years ago

zbraniecki commented 8 years ago

Operating systems allow users to specify their Intl-related preferences. Currently, the only value that ECMA402 retrieves from the host environment when the caller does not define it is timeZone.

The list of options that OS may have values for DateTimeFormat:

for NumberFormat:

We can argue that in every possible scenario, if the developer doesn't specify an option, the user experience would be better if the Intl implementation followed user preferences over locale defaults.

On the other hand, I can see an argument that we should only follow user preferences if they were explicitly changed, so that in the 99% of cases where the user does not explicitly state any defaults, we follow CLDR, not OS defaults.

I think that such a balance would work well as a middle ground between following user preferences but staying consistent between browsers and OSes.

Such an approach also lets browser vendors give users a way to say that they do not want the browser to expose their preferences, to avoid fingerprinting. In that case, the browser would always report that the user did not specify anything, and the Intl API would follow CLDR.

We could also recommend that implementers not retrieve OS preferences if, in a given environment, there is no way to distinguish between manually set values and OS defaults.

In order to achieve that, we would need to modify the spec (internals only) to allow the implementation to attempt to retrieve host env. variables before looking into CLDR defaults, but we'd also need to work with OS vendors to make it possible to distinguish between a value the user set explicitly and a value that is just the default.
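A minimal sketch of the lookup described above, to make the "explicitly set" rule and the fingerprinting opt-out concrete. Everything here is hypothetical: `hostPrefs`, its `explicit` flag, `readHostPreference`, `resolveOption`, and the `exposePrefs` setting are illustrative names, not part of ECMA402 or any OS API.

```javascript
// Hypothetical snapshot of what a host could report. `explicit` marks
// values the user changed by hand rather than inherited OS defaults.
const hostPrefs = {
  hourCycle: { value: 'h23', explicit: true },
  numberingSystem: { value: 'latn', explicit: false },
};

// Hypothetical lookup: returns a host value only when the user set it
// explicitly and has not opted out of exposing preferences.
function readHostPreference(name, { exposePrefs = true } = {}) {
  const pref = hostPrefs[name];
  if (!exposePrefs || !pref || !pref.explicit) return undefined;
  return pref.value;
}

// Falls back to the locale (CLDR) default when the host reports nothing.
function resolveOption(name, localeDefault, settings) {
  return readHostPreference(name, settings) ?? localeDefault;
}

// An explicit host value wins over the locale default.
console.log(resolveOption('hourCycle', 'h12'));
// An untouched OS default is ignored, so the locale default is used.
console.log(resolveOption('numberingSystem', 'arab'));
// With exposure turned off, the host is never consulted.
console.log(resolveOption('hourCycle', 'h12', { exposePrefs: false }));
```

With exposure turned off, the result is indistinguishable from a user who never changed anything, which is the property the anti-fingerprinting argument relies on.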

Feedback?

zbraniecki commented 8 years ago

@bterlson does it sound like something that Microsoft would be interested in?

littledan commented 8 years ago

Many of these have BCP 47 items corresponding to them. Would it make sense to tie this into #106?

caridy commented 8 years ago

It is definitely tied, but I would like to clear up #106 first. I'm not even convinced yet that Intl.Locale is really needed.

srl295 commented 8 years ago

hourCycle

CLDR BCP47 -u-hc-h12 | h23 | h11 | h24 etc

numberingSystem

CLDR BCP47 -u-nu-taml etc
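For reference, a short sketch of how these two keys can be passed through a locale tag. The date, the `en` locales, and the variable names are arbitrary choices for illustration. The sketch uses `tamldec` (Tamil decimal digits) rather than `taml`, on the assumption that the engine only accepts non-algorithmic numbering systems.

```javascript
// A fixed instant so the output does not depend on when this runs.
const when = new Date(Date.UTC(2017, 0, 10, 15, 33, 17));

// -u-hc- selects the hour cycle; it only has an effect when an hour is shown.
const dtf = new Intl.DateTimeFormat('en-US-u-hc-h23', {
  hour: 'numeric',
  minute: '2-digit',
  timeZone: 'UTC',
});
console.log(dtf.resolvedOptions().hourCycle, dtf.format(when));

// -u-nu- selects the numbering system used for digits.
const nf = new Intl.NumberFormat('en-u-nu-tamldec');
console.log(nf.resolvedOptions().numberingSystem, nf.format(2017));
```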

zbraniecki commented 8 years ago

> Many of these have BCP 47 items corresponding to them. Would it make sense to tie this into #106?

So, after spending a couple weeks experimenting with it for our platform, I actually don't believe it to be necessarily tied.

I see two ways we can solve the problem I described here:

1) We can try to carry the bits from the host environment to ECMA402 via BCP47.
2) We can extend ECMA402 to allow implementations to reach for host environment options as a step in the fallback chain of resolving the options.

The former solution is tempting because it only requires us to extend the coverage of BCP47 Unicode extension keys in formatters (for example, make DateTimeFormat support the hourCycle extension key) and provide some API for retrieving those (for example, the proposed navigator.locales).

The problem with this approach is that it ties us to what extension keys exist and can carry. If we find a host environment preference that we identify as worth following, but not covered by bcp47, then we'll find ourselves in the very same place having to solve it again.

And it just so happens that, according to our data, the second most popular setting (after hourCycle), the date/time format, is not, and probably will not be, covered by BCP47.

All OSes allow users to customize their date/time formats. The simpler settings just let users select from a list of patterns ("mm/dd/yyyy" vs "dd/mm/yyyy" for example), but both Windows and macOS go much further, even allowing users to write their own patterns.

Trying to carry this data through bcp47 seems impossible, but allowing ECMA402 implementations to take the pattern from OS is entirely possible.

It also happens that both OSes allow users to select patterns for date and time in styles short, medium, long, and full.

If we allowed style in DateTimeFormat (#108) and optionally let implementations retrieve host environment settings, we could get a significantly improved UX that can be carried over to future APIs.
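A rough sketch of what per-style pattern pass-through could look like, assuming the host exposes one pattern per style. `hostDatePatterns` and `formatDate` are hypothetical names, the pattern handling covers only the `yyyy`, `MM`, and `dd` tokens, and the fallback uses the `dateStyle` option that current engines ship, which postdates this discussion.

```javascript
// Hypothetical per-style patterns a host could expose (LDML-like tokens).
const hostDatePatterns = { short: 'yyyy-MM-dd' };

function formatDate(date, locale, style, patterns = hostDatePatterns) {
  const pattern = patterns[style];
  if (pattern === undefined) {
    // No host pattern for this style: use the locale's own data.
    return new Intl.DateTimeFormat(locale, {
      dateStyle: style,
      timeZone: 'UTC',
    }).format(date);
  }
  // Pull zero-padded numeric fields from Intl, then fill in the pattern.
  const parts = new Intl.DateTimeFormat(locale, {
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    timeZone: 'UTC',
  }).formatToParts(date);
  const field = {};
  for (const { type, value } of parts) field[type] = value;
  const tokens = { yyyy: field.year, MM: field.month, dd: field.day };
  return pattern.replace(/yyyy|MM|dd/g, (token) => tokens[token]);
}

const day = new Date(Date.UTC(2017, 0, 10));
console.log(formatDate(day, 'en-US', 'short'));  // host pattern applies
console.log(formatDate(day, 'en-US', 'medium')); // falls back to locale data
```

A pattern like this has no natural encoding as a BCP47 extension value, which is the limitation the comment points at.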

For that reason I'd suggest we go with option (2) and implement it independently of Intl.Locale and BCP47 extension keys (which we should support as well imho, just not as a way to solve the host env. options).

littledan commented 8 years ago

@zbraniecki Do we need to expose all of the full OS options this way? If it's possible to do something that's analogous to the CLDR, rather than inventing our own abstractions, then the implementation burden will be a lot lighter. I'd wager that these post-CLDR/post-BCP47 internationalization options should be considered lower priority than the ones already included in it. And, when we want to add such exotic options, maybe we can add them to BCP47 as well. Any thoughts, @srl295 @jungshik?

zbraniecki commented 8 years ago

> Do we need to expose all of the full OS options this way? If it's possible to do something that's analogous to the CLDR, rather than inventing our own abstractions, then the implementation burden will be a lot lighter.

Not sure what you mean in the case of date/time formats.

> I'd wager that these post-CLDR/post-BCP47 internationalization options should be considered lower priority than the ones already included in it. And, when we want to add such exotic options, maybe we can add them to BCP47 as well.

Not sure how. It would be really hard to convey a full date/time pattern in a BCP47 Unicode extension key.

And I wouldn't say that date/time formats are exotic customizations; they're pretty exposed in all OSes and currently we just ignore them.

Also, it seems that ICU is interested in getting host environment preferences to be within ICU itself. So we'd get an ICU API that takes host env. preferences into account - it would be nice if ECMA402 API did the same and I feel like it's more complete if we allow implementations to just do this instead of trying to squeeze everything into BCP47 unicode extension key.

jungshik commented 8 years ago

> seems that ICU is interested in getting host environment preferences to be within ICU itself.

ICU bug number?

I wonder if this issue (retrieving host settings part) belongs to Email 402.

zbraniecki commented 8 years ago

> ICU bug number?

I don't think there's a bug yet, but I raised the question at the weekly call and the group expressed interest. I believe Microsoft wants to kickstart the effort so I'll wait for them a week before I file an issue myself.

> I wonder if this issue (retrieving host settings part) belongs to Email 402.

If you mean ECMA402, I believe so. The ECMA402 spec is not bound to ICU, so if we want a certain functionality we have to express it in the spec independently of ICU. If ICU gains this feature and ECMA402 does not change, implementers who rely on ICU would have to turn it off there in order to comply with the spec.

That's why I'd like to add wording to the spec that explicitly states that implementations may look into the host environment for preferences after checking options and Unicode extension keys and before reaching for the default settings for a given locale.
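The resolution order described above can be sketched for a single option. This is only an illustration under stated assumptions: `resolveHourCycle` and its `hostHourCycle` parameter are hypothetical, the sketch ignores the `hour12` option, and it reads the locale default back from the engine instead of from CLDR directly.

```javascript
// Hypothetical resolution order for hourCycle. `hostHourCycle` stands in
// for whatever the host environment reports (undefined if nothing).
function resolveHourCycle(tag, options = {}, hostHourCycle) {
  const locale = new Intl.Locale(tag);
  // 1. Explicit option from the caller.
  if (options.hourCycle !== undefined) return options.hourCycle;
  // 2. Unicode extension key in the locale tag (-u-hc-).
  if (locale.hourCycle !== undefined) return locale.hourCycle;
  // 3. Host environment preference (the step proposed here).
  if (hostHourCycle !== undefined) return hostHourCycle;
  // 4. Locale default, read back from the engine's own data.
  return new Intl.DateTimeFormat(locale.baseName, { hour: 'numeric' })
    .resolvedOptions().hourCycle;
}

console.log(resolveHourCycle('en-US', { hourCycle: 'h11' }, 'h23')); // option wins
console.log(resolveHourCycle('en-US-u-hc-h24', {}, 'h23'));          // extension key wins
console.log(resolveHourCycle('en-US', {}, 'h23'));                   // host preference wins
console.log(resolveHourCycle('en-US', {}, undefined));               // locale default
```

Because the host step sits below both caller-controlled steps, a developer who passes an option or an extension key never sees the host value.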

littledan commented 8 years ago

@zbraniecki If we do this, we need to be sure to mark it as a fingerprinting concern (cf https://github.com/tc39/ecma402/issues/110). It would be great to have a ruling from the W3C TAG on whether the addition would be permissible, cc @slightlyoff .

rxaviers commented 7 years ago

Hi @zbraniecki,

Some questions/concerns follow below...

If I understood it right, you are suggesting that {style: 'medium'} (and its other analogous forms, when #108 is implemented) should return a different output from OS to OS, because it should use the OS definitions of such format instead of a consistent source like CLDR. Is that correct?

Using the OS preference as a default value for the timezone seems very straightforward and beneficial to me, although it isn't very clear to me what the benefit is of using OS preferences as default values for those other data sources.

Exemplifying…

For the timezone case… If a timezone is X, regardless of whether it was retrieved from the OS or passed in via API (either via locale u-tz-X or via options {timeZone: 'X'}), the formatted output is always going to be the same for that timezone (for a certain locale) on any OS.
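A small sketch of the timezone point, showing only the host default and the options path (not the locale tag path). The instant, the field selection, and the variable names are arbitrary; the host zone printed will differ from machine to machine.

```javascript
// A fixed instant, so the comparison does not depend on when this runs.
const instant = new Date(Date.UTC(2017, 0, 10, 15, 33, 17));

// The zone the implementation picked up from the host environment.
const hostZone = new Intl.DateTimeFormat().resolvedOptions().timeZone;

const timeFields = { hour: 'numeric', minute: '2-digit', timeZoneName: 'short' };

// Case 1: time zone taken implicitly from the host environment.
const implicitZone = new Intl.DateTimeFormat('en-US', timeFields).format(instant);

// Case 2: the same zone passed explicitly through the options bag.
const explicitZone = new Intl.DateTimeFormat('en-US', {
  ...timeFields,
  timeZone: hostZone,
}).format(instant);

// Same zone and same locale give the same string, whatever the source.
console.log(hostZone, implicitZone === explicitZone);
```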

For the date format case (or time format)… If using {style: 'medium'}, the formatted output is going to be inconsistent (on purpose) from OS to OS. There would be no way of making it consistent other than not using such styles (and using the options bag, i.e. skeletons, instead). It isn't clear to me what the use case / benefit of it is.

For the hourCycle… What would take precedence, the OS preference or the locale preference? If the OS preference... that could prevent an app from correctly using a locale other than that of the local environment. For example, using Intl.DateTimeFormat('en-US') in an en-GB environment would return '15:33:17' instead of '3:33:17 PM', i.e., it would be impossible for an app to set an arbitrary locale and have it behave as it should, because the OS preferences would interfere in a way that cannot be controlled.
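For comparison, a sketch of the behavior being defended here, where the requested locale alone decides the hour cycle unless the caller overrides it. The instant and variable names are arbitrary; the exact whitespace before the day period may vary between engine versions.

```javascript
// A fixed instant: 15:33:17 UTC.
const t = new Date(Date.UTC(2017, 0, 10, 15, 33, 17));
const fields = {
  hour: 'numeric',
  minute: '2-digit',
  second: '2-digit',
  timeZone: 'UTC',
};

// The locale's own default decides between 12-hour and 24-hour output.
const us = new Intl.DateTimeFormat('en-US', fields).format(t);
const gb = new Intl.DateTimeFormat('en-GB', fields).format(t);

// An explicit option in the options bag overrides the locale default.
const usForced = new Intl.DateTimeFormat('en-US', {
  ...fields,
  hourCycle: 'h23',
}).format(t);

console.log(us);       // 12-hour clock with a day period
console.log(gb);       // 24-hour clock
console.log(usForced); // 24-hour clock despite the en-US locale
```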

In general, I'm confused about the inconsistencies this could generate.

zbraniecki commented 7 years ago

> If I understood it right, you are suggesting that {style: 'medium'} (and its other analogous forms, when #108 is implemented) should return a different output from OS to OS, because it should use the OS definitions of such format instead of a consistent source like CLDR. Is that correct?

No, no. I don't know if it should.

I'm saying that it seems like there's interest in figuring out how to display medium style date formatted according to user preferences in the OS.

> It isn't clear to me what the use case / benefit of it is.

I see your point about the difference from OS to OS and that there wouldn't be a way to make it consistent. I think it's solvable, but first I want to respond to this.

While being able to display the same pattern everywhere is tempting for us from the consistency/dev standpoint, it's much less clear that that's what the user expects. We develop for many platforms; the user uses just one: theirs. And from our experience at Mozilla, people want the format they chose. Especially if they manually modified it (be it setting the hourCycle, or even the whole date/time pattern), it means they really want it.

Now, when they open an app in their OS, it will display the date/time according to their preferences. But when they open a web-app, it will display some different date/time format, standardized for their locale. That's precisely what they made the effort to switch away from. That's a very unhappy user.

It contributes to the idea of death by a thousand papercuts. This by itself will likely not make the user switch away from web-apps, but it'll be one of the things that together add up to the perception that "native apps are better" because they integrate with the OS better.

So, I believe there's a real tangible UX value in looking for ways to address it.

> There would be no way of making it consistent

One way to make it possible to make it consistent is an option:

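// Both options are proposals, not part of ECMA402 today:
// `style` is the subject of #108, `useOSStyle` is hypothetical.
let dtf = new Intl.DateTimeFormat('en-US', {
  style: 'medium',
  useOSStyle: false // opt out of OS-defined patterns
});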

I don't know if it's the best solution, but it would solve the particular problem while leaving what I believe most developers who're not very deeply familiar with the issue want (show the user the date in the format they want to see).

> In general, I'm confused about the inconsistencies this could generate.

I think it's an intrinsic tension between vertical and horizontal consistency. In this analogy, vertical is "My web app displays the date 2017/01/10". Horizontal is "My OS uses yyyy-MM-dd date format and all dates in all apps are displayed using it".

Since ECMA402 takes a "best effort" approach, I believe the most common scenario for us is that developers want to communicate the date in the way that is closest to what the user wants to see. And that may be specified in their OS preferences.

rxaviers commented 7 years ago

Thanks for the clarification. It's a very nice goal. The only precaution I think we need to take is to give developers control over overriding each of these defaults if they wish. The useOSStyle: false could be one way. I don't have any suggestion at the moment.

caridy commented 7 years ago

Related to Intl.Locale and navigator.locale. More investigation and discussion needed before we can make a decision about this feature.

littledan commented 7 years ago

In particular, the undecided question is whether we should put all the data in navigator.locale or whether that will be insufficient, and it would be better to support pass-through of OS-specific styles in Intl constructors.