Enumeration of time zones

ptomato commented 4 years ago

In the Temporal proposal we currently have an API for enumerating all named time zones known to the system. We are currently discussing removing this time zone enumeration API from the proposal as it's not clearly related to Temporal and seems like it is an orthogonal project.

It was originally added in response to this use case: the list of named time zones is useful for implementing a time zone picker in UI. (Although, to be useful for UI, there would also have to be a way to get human-friendly display names for time zones rather than IANA names; #31?)
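As a rough illustration of the display-name gap mentioned above, a picker can already ask `Intl.DateTimeFormat` for a localized time zone name. This is only a sketch: the helper name `timeZoneDisplayName` is hypothetical, and the choice of `timeZoneName: "long"` as the display form is an assumption. The strings returned depend on the engine's locale data and on the date passed in.

```javascript
// Hypothetical helper: returns a localized name for an IANA time zone.
// Assumption: the "long" style is what a picker would want to show.
function timeZoneDisplayName(timeZone, locale = "en-US", date = new Date()) {
  const parts = new Intl.DateTimeFormat(locale, {
    timeZone,
    timeZoneName: "long",
  }).formatToParts(date);
  const part = parts.find((p) => p.type === "timeZoneName");
  // Fall back to the raw identifier if no name part was produced.
  return part ? part.value : timeZone;
}

console.log(timeZoneDisplayName("America/Los_Angeles"));
console.log(timeZoneDisplayName("Europe/Paris", "fr-FR"));
```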

ECMA-402 seems like a good place to investigate this.

leobalter commented 4 years ago

It seems interesting to discuss.

sffc commented 4 years ago

Agreed; this seems roughly related to Intl. We need to put the enumeration API somewhere (could be Intl, could be elsewhere), and then we need to add time zone names to Intl.DisplayNames.

@FrankYFTang

littledan commented 4 years ago

I think enumeration is a broader problem than just timezones. Some other things that you might want to enumerate:

Locales (all of them, not just from a subset of a list that the user provides as we have with supportedLocalesOf)
Regions
Currencies
Calendars
Numbering systems
Months of the year/days of the week
IIRC someone asked that the allowed hourCycle settings be listed, even though that's a fixed set... There are many other options that might be included this way

I don't think we need to ship all these together, but it'd be nice to come up with an API pattern that would potentially capture all of them. That's why I was a bit skeptical of tying Timezone enumeration to Temporal: there's no clear way to extend this to regions, locales or currencies, but those seem quite important as pickers.

Let's watch out for data size issues in Intl.DisplayNames with timezones, but it could definitely be useful if someone wants to make a timezone picker.

leobalter commented 4 years ago

My only concern is that Intl is optional and not available in all JS platforms (e.g. Moddable XS). If we identify use cases for some of these enumerations out of the Intl API, there is a chance we might want this enumeration elsewhere.

Still, Intl might be a good place to provide this enumeration, and I wouldn't complain. This also doesn't seem to be a heavy weight addition.

littledan commented 4 years ago

I don't know whether this should be in Intl or not; my main concern is that we find an API design that will eventually work for all sorts of enumerations.

The only use case I've heard about for enumeration of timezones was for a sort of timezone picker. The consensus among internationalization experts is that this only really makes sense if the timezones are localized (even if some application developers are OK with displaying IANA names, we don't want to encourage this). If there's a non-picker use case, then that might change the calculus.

Overall, it's a bit hard for me to think about the space of environments where some parts of JS aren't present. I imagine some of them have other restrictions or allowances that aren't specifically sanctioned by the specification; we may or may not want to permit these in the standard. If we end up making major decisions on the basis of this kind of optionality/grouping, I wonder if we might weaken/unconstrain things to permit engines to let them support enumeration but not other parts.

ljharb commented 4 years ago

I can localize my own time zone names, i don’t need Intl to do it for me (and because Intl isn’t everywhere, i can’t rely on it anyways). All i need is the data, in 262.

littledan commented 4 years ago

@ljharb Which data do you need, and why do you need it?

ljharb commented 4 years ago

The list of valid time zone identifiers. Otherwise, i have to maintain my own list, and laboriously validate each one at runtime.

The goal is to know what timezones an engine supports, so i can, among many other things, set up backend validation, and notify myself when i need to support a new timezone.
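The runtime validation described here can be sketched as follows. The helper name `isSupportedTimeZone` and the candidate list are illustrative assumptions; the only engine behavior relied on is that `Intl.DateTimeFormat` throws a `RangeError` for a time zone it does not accept.

```javascript
// Hypothetical helper: feature-tests one identifier against the engine.
function isSupportedTimeZone(timeZone) {
  if (typeof timeZone !== "string") return false;
  try {
    new Intl.DateTimeFormat("en-US", { timeZone });
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false; // rejected by the engine
    throw err; // anything else is unexpected
  }
}

// A hand-maintained list has to be checked entry by entry.
const handMaintained = ["Europe/Paris", "Asia/Tokyo", "Mars/Olympus_Mons"];
const usable = handMaintained.filter(isSupportedTimeZone);
console.log(usable);
```

An enumeration API would replace the hand-maintained list and the loop with a single call.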

anba commented 4 years ago

A list of things which directly pop into my head when thinking about time zones:

Should this possible API be restricted to IANA time zone names or do we also want to allow CLDR or ICU time zone names? (I'm not sure if https://github.com/unicode-org/icu/blob/master/icu4c/source/tools/tzcode/icuzones is covered by CLDR, which would make the time zones listed there ICU-only.)
- SpiderMonkey tries to restrict the possible time zones to only match what's in IANA: https://searchfox.org/mozilla-central/source/js/src/builtin/intl/TimeZoneDataGenerated.h
- V8's time zone parser doesn't accept most of these ICU time zones.
- JSC just passes everything directly to ICU.
- The differences are visible for time zone names like "PST" or "Canada/East-Saskatchewan".
Any possible time zone name which is accepted by Intl.DateTimeFormat, or just the canonical names?
Any thoughts about the time zones in backward, pacificnew, or systemv?
"timezones an engine supports" is tricky, because sometimes engines are lying a bit: #272
When displaying time zones to the user, directly showing IANA time zone names should probably be avoided, because of issues like "Kiev" vs. "Kyiv". (See the numerous threads about this topic on the tz mailing list.)
- CLDR provides "exemplar cities" for displaying purposes.
- But engines are currently stripping these exemplar city names from the ICU data file, so there are data size issues we should be aware of.

ljharb commented 4 years ago

Perhaps an object, whose keys are the canonical names, and whose values are all the valid aliases.
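A minimal sketch of that object shape, built in userland from a list of candidate names. The function names and the sample identifiers are assumptions for illustration. The grouping key is whatever `resolvedOptions().timeZone` reports, and engines differ in whether they resolve an alias to another name, so the resulting keys are not guaranteed to match across engines.

```javascript
// Returns the engine's resolved name for an identifier, or null if rejected.
function resolveTimeZone(id) {
  try {
    return new Intl.DateTimeFormat("en-US", { timeZone: id })
      .resolvedOptions().timeZone;
  } catch (err) {
    if (err instanceof RangeError) return null;
    throw err;
  }
}

// Builds { resolvedName: [accepted input names...] }.
function groupByResolvedName(candidates) {
  const groups = {};
  for (const id of candidates) {
    const resolved = resolveTimeZone(id);
    if (resolved === null) continue; // not supported by this engine
    if (!groups[resolved]) groups[resolved] = [];
    groups[resolved].push(id);
  }
  return groups;
}

const zoneGroups = groupByResolvedName([
  "Asia/Kolkata", "Asia/Calcutta", "Europe/Kiev", "Europe/Kyiv", "Not/AZone",
]);
console.log(zoneGroups);
console.log(Object.keys(zoneGroups)); // the resolved names only
```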

littledan commented 4 years ago

@ljharb Because the list of timezones is so long, it's a common pattern in applications' timezone pickers to show a subset, so it's unclear to me what signal application developers should take based on just the existence of a timezone in a browser's tzdb. Does this need occur in the frontend, or is it more of a development-time need, or some other context?

ljharb commented 4 years ago

The object form I suggested, run through Object.keys, seems like it'd be a subset?

My use case is for both the frontend (rerendering in the client), and the backend that generates the initial HTML hydrated in the frontend.

FrankYFTang commented 4 years ago

ICU's API provides three styles of enumeration call that we could surface to JS:

1. return the whole list by calling icu::TimeZone::createEnumeration()
2. return the list of time zones for a specific country/region by calling icu::TimeZone::createEnumeration( region code )
3. return the list of time zones at a given offset

I think 1 and 2 above are useful, and I have some doubts about 3.

FrankYFTang commented 4 years ago

How about

Intl.DateTimeFormat.getSupportedCalendars()
Intl.DateTimeFormat.getSupportedTimeZones()
Intl.NumberFormat.getSupportedNumberingSystems()
Intl.NumberFormat.getSupportedCurrencies()
Intl.NumberFormat.getSupportedUnits()

and later we may let each of above take optional argument to restrict the return list

sffc commented 4 years ago

These enumerations are not really specific to specific Intl formatters. If we were to add methods like this, I think it makes more sense to put them on the top Intl namespace:

Intl.getSupportedCalendars()
Intl.getSupportedTimeZones()
Intl.getSupportedNumberingSystems()
Intl.getSupportedCurrencies()
Intl.getSupportedUnits()
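The method names above were proposals at the time of this thread. For comparison, and not taken from the thread itself, the enumeration API available in current engines is a single method on the `Intl` namespace, `Intl.supportedValuesOf(key)`. The wrapper name `listSupported` below is hypothetical, and the sketch feature-detects the method because older runtimes lack it.

```javascript
// Hypothetical wrapper: returns the engine's list for a key, or null when
// the enumeration method is not available in this runtime.
function listSupported(key) {
  if (typeof Intl.supportedValuesOf !== "function") return null;
  // Accepted keys include "calendar", "currency", "numberingSystem",
  // "timeZone", and "unit"; an unknown key throws a RangeError.
  return Intl.supportedValuesOf(key);
}

const enumeratedZones = listSupported("timeZone");
console.log(
  enumeratedZones === null
    ? "enumeration not available in this runtime"
    : `${enumeratedZones.length} time zones reported`
);
```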

FrankYFTang commented 4 years ago

OK, I will start championing an "Intl Enumeration API Specification" to address this. I am starting to draft it at https://github.com/FrankYFTang/proposal-intl-enumeration/blob/master/README.md now.

zbraniecki commented 4 years ago

Btw. This is a major fingerprinting increase as basically the API does nothing else but add identifiable bits.

ljharb commented 4 years ago

@zbraniecki seems more like it collects existing bits into a single list, as opposed to me having to maintain the list myself and laboriously feature-test against the runtime? iow, not a new capability, just happens to make it easier?

zbraniecki commented 4 years ago

I'm not a security/privacy expert, so please, take my read with a grain of salt, but my understanding is that the race for privacy vs fingerprinting is composed of two pieces:

1) Number of APIs that give me the highest amount of uniquely identifiable information
2) Selection of APIs that give me the highest amount of uniquely identifiable information at the lowest CPU/time cost

Number (1) is important because all/any anti-fingerprinting attempts will have to mask all those APIs to return some jammed responses that are generic and unidentifiable.

Number (2) is important because if my tracker needs to take 10 seconds of your CPU to get a fingerprint, it's hard to hide. If my tracker can get it in 16ms, I'm good.

Now, if I understand correctly, Intl API originally was designed to force the fingerprint script to cycle through API calls attempting to ask for various bits and checking the output in hope to collect a bit. That's time consuming and CPU costly. On the other hand getting a white-hat API use was easy - just ask for a date, tell me your calendar of choice, and accept the result (which may be suboptimal).

The "give me all available/supported X" type of API is making it trivial to ask for all fonts, all calendars, all languages, all numerical systems.

The common driver for such requests are "pickers", and I recognize the value for a picker to know what's available. I'm not sure how to resolve that tradeoff and I would love to get some privacy/security experts involved in guidelines for API design to strike the right tradeoffs.

Otherwise, non privacy experts will keep adding fingerprinting APIs as the API surface grows, and then privacy engineers will struggle to add "anti-fingerprinting" masking mode to each and every one of them. That seems suboptimal.

FrankYFTang commented 4 years ago

Why doesn't the hacker just read the user agent string instead? That would cost less CPU power, right? How would this API provide more fingerprint information than the user agent string?

zbraniecki commented 4 years ago

Why doesn't the hacker just read the user agent string instead?

Funny you should ask: https://www.zdnet.com/article/google-to-phase-out-user-agent-strings-in-chrome/

And anti-fingerprinting is always masking your UA string anyway.

How would this API provide more fingerprint information than the user agent string?

UA string adds some bits of entropy, your screen dimensions, color depth, add more, your installed plugins, refresh rate (vsync) even more, and so on.

You can see an example of such fingerprinting on https://panopticlick.eff.org/ if you click "Test me" and then "Show full results for fingerprinting". If you use Tor browser, or turn on anti-fingerprinting bits in Safari, Firefox, Tor, Brave etc. you'll see how they fake many of those API results to make them less unique. As we increase the surface, Intl bits are becoming part of the "game". The question is what design we should use to make it a costly process for the fingerprinters, or how we can make our APIs easy to mask for anti-fingerprinting techniques.

As I said, I'm not an expert, I just know that I often end up reviewing patches for Gecko/SM that add the masking and I remember the reasoning behind "supportedLocalesOf" rather than "getSupportedLocales".

FrankYFTang commented 4 years ago

Sorry, I accidentally clicked the wrong button.

FrankYFTang commented 4 years ago

you'll see how they fake many of those API results to make them less unique.

then why can't they fake the Intl API results to make them less unique ALSO?

FrankYFTang commented 4 years ago

And anti-fingerprinting is always masking your UA string anyway.

Then anti-fingerprinting can also mask this API, right?

ljharb commented 4 years ago

As for cpu time, once you’ve fingerprinted enough engines, you don’t have to do every check - only the differentiating ones.

sffc commented 4 years ago

We've established that fingerprinting is a theoretical concern. How do we go about getting a definitive answer on whether or not this blocks the proposal? I would lean toward presenting the proposal, raising this concern to plenary, and giving people an opportunity to object. Then we can talk to them and figure out how to make this proposal less susceptible to fingerprinting.

zbraniecki commented 4 years ago

then why can't they fake the Intl API results to make them less unique ALSO?

They can! And they will! I think there are ways we can design our APIs that make it easier for them to do so, or harder for the fingerprinter to use. I just don't know what those approaches are.

What I'm advocating against is just slapping on more and more API surface for "pickers" without considering fingerprinting and without advice from the people who will later have to anti-fingerprint those APIs.

zbraniecki commented 4 years ago

How do we go about getting a definitive answer on whether or not this blocks the proposal?

I'd like to get Tor people involved in this discussion. I don't know any, but I can ask around. I'd like them to look at the existing Intl APIs and the current proposals (including this one), and ask what we can do to make their lives easier, along with any general advice for our future work.

zbraniecki commented 4 years ago

I'd also like to dive deeper into usability of that.

What does getSupportedTimeZones give me? Is it possible my system only supports several timezones? What happens if the user is not in one of them? Same for currencies, or numbering systems... Will apps realistically have pickers for numbering systems?

zbraniecki commented 4 years ago

The only picker I can realistically see being commonly used is the unit picker, and it's still not a generic "what units do you support overall" picker, but rather a "do you support celsius, kelvin and fahrenheit" kind of picker.

sffc commented 4 years ago

I'd like to get Tor people involved in this discussion. I don't know any, but I can ask around.

@zbraniecki Please do. Thanks!

jswalden commented 4 years ago

Seems to me you can determine whether any individual time zone is supported by just using it and seeing if it shows up as a resolved option. For purposes of revealing differentiations across UAs and across their successive versions, an enumeration API does not expose new information. It makes it easier to query in bulk, but if the differentiations pertain to specific time zone strings, an attempt to fingerprint could just check behavior of those specific time zone strings.

Or is this a more theoretical concern, about user agents that the would-be fingerprinter hasn't taken the time to individually figure out the distinctions of? Because I guess an enumeration API does mean the fingerprinter doesn't have the ongoing maintenance burden of figuring out which time zones are differentiably supported by distinct UAs and UA versions -- it could just grab the whole list and generate a hash from it, for fingerprinting purposes.
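To make the two paragraphs above concrete, here is a sketch of the probing approach: test a handful of specific identifiers and reduce the results to a short digest. Everything here is illustrative. The probe list is an arbitrary assumption, the function names are hypothetical, and the digest is a simple FNV-1a-style hash over UTF-16 code units, chosen only because it fits in a few lines.

```javascript
// Feature-test one identifier, as described in the comment above.
function canUseTimeZone(timeZone) {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone });
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false;
    throw err;
  }
}

// One character per probe: "1" if accepted, "0" if rejected.
function supportBits(zones) {
  return zones.map((zone) => (canUseTimeZone(zone) ? "1" : "0")).join("");
}

// FNV-1a-style 32-bit hash, returned as 8 hex digits.
function fnv1aHex(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Assumed probe list: names that engines or data versions may treat differently.
const probeZones = ["Europe/Kyiv", "Asia/Kolkata", "PST", "Canada/East-Saskatchewan"];
const probeBits = supportBits(probeZones);
console.log(probeBits, fnv1aHex(probeBits));
```

With an enumeration API, the probe list and its maintenance disappear: the whole returned list can be joined and hashed directly.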

ljharb commented 4 years ago

It seems to me like not exposing the list is just security by obscurity.

zbraniecki commented 4 years ago

@zbraniecki Please do. Thanks!

Hi all. I hear your feedback. I understand that it's hard for me to explain the concerns around the privacy area, and since I'm not an expert in the area, I may not even be able to.

I want to assure you, though, that it is not "security by obscurity" - the idea is not just to hide the information or make it harder to retrieve in an attempt to discourage the fingerprinting. If I made you feel this way, it is just the shortcoming of my ability to explain my position.

I reached out to several people working on the Tor browser at Mozilla and I'll try to get them to help us make decisions around this area.

The topic is complex and several dynamics are intertwined, making it harder to design clear guidelines as multiple tradeoffs are in play. What concerns me personally is that the ECMA402 group currently seems not to see this as any form of tradeoff, but rather as a clear "someone requested, let's add it" kind of situation. It further indicates to me that I either don't understand the problem scope or I failed to explain my concerns.

I'll try to get back to this thread within the next week or so with more feedback on how to design such APIs in a privacy-friendly manner.

sffc commented 4 years ago

Wearing my hat as ECMA-402 chair: the contents of this post don't necessarily reflect my personal opinion.

The consensus from the ECMA-402 meeting today is that we think this proposal has solid use cases, but acknowledge the potential fingerprint concerns. We plan to present it for Stage 1 at the upcoming TC39 meeting, and continue investigating the privacy and security implications before it reaches Stage 2.

zbraniecki commented 4 years ago

I spun off #442 for the generic conversation about the scope of ECMA402, which I believe is important for assessment of this API.

I'd also like to say that I did not see the "solid use cases" list beyond "pickers" and did not receive an answer to my question about them.

Putting privacy aside, and putting the generic "How far should ECMA402 go" aside as well, I'd like to better understand what makes this a "solid use case". In particular, I'd like to understand how the group sees the difference between "any use case" and "solid use case". Basically every API ever added to any library ended up there because there was a use case. I am not worried about whether a request to ECMA402 can bring some use case.

But I struggle to evaluate whether the use case is solid. For example, when I started with ECMA402, one of the strategies we used was to see what JS libraries get developed, on the assumption that if the use case is important enough, people will develop userland libraries. From that we can collect in-field experience, validate that a use case is high profile and common enough to gather momentum around a library or libraries, and extract a low-level API that can make such libraries easier or even unnecessary.

I don't know if this is applicable to today's velocity and dynamic behind ECMA402, but I don't think I've seen userland libraries around time zone names, numbering system names and calendar names.

I also don't know if "pickers" is the right use case - should the "pickers" be hand-written, or part of HTML? What is the use of "pickers" outside of a Web environment? (They don't help Node.js much, right?) Are there other uses than "pickers"?

I understand the Stage 1 and I hope to see motivation for the API in the Stage 1 proposal that can be verified against the outcome of #442.

ljharb commented 4 years ago

@zbraniecki anything the browser needs, is helpful in node, because generating HTML on the server is a very important a11y/performance/robustness practice.

zbraniecki commented 4 years ago

anything the browser needs, is helpful in node, because generating HTML on the server is a very important a11y/performance/robustness practice.

I'm not sure if I understand. You can generate <input type="date"/> on server side. Or you can write your own date picker. Those two have very different API surface requirements.

ljharb commented 4 years ago

Sure - but assuming there's no native HTML control for a timezone picker, eg, you'd need every possible timezone to be present in the serverside HTML, generated in node. I agree that if a native form control existed, that was sufficiently styleable and hookable in the browser, then there'd likely be no need to expose the data as a list.
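A sketch of the server-side case described here: rendering a `<select>` with one option per time zone. The function names, the `name="timezone"` attribute, and the short sample list are assumptions for illustration; in practice the list would come from whatever source of identifiers the server has.

```javascript
// Minimal HTML escaping for text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer: one <option> per identifier, with an optional
// preselected value.
function renderTimeZoneSelect(zones, selected) {
  const options = zones.map((zone) => {
    const value = escapeHtml(zone);
    const attr = zone === selected ? " selected" : "";
    return `  <option value="${value}"${attr}>${value}</option>`;
  });
  return ['<select name="timezone">', ...options, "</select>"].join("\n");
}

const pickerHtml = renderTimeZoneSelect(
  ["America/New_York", "Europe/Paris", "Asia/Tokyo"],
  "Europe/Paris"
);
console.log(pickerHtml);
```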

litherum commented 4 years ago

(I didn't realize this discussion was happening here, and opened https://github.com/FrankYFTang/proposal-intl-enumeration/issues/1 about it)

zbraniecki commented 4 years ago

I agree that if a native form control existed, that was sufficiently styleable and hookable in the browser, then there'd likely be no need to expose the data as a list.

I see. Thank you for your patience!

I think my question is then: should we first evaluate the native form control path for pickers, rather than API scope extension, as the approach that is more privacy-friendly, easier to get internationalization right with, and lower overhead for the user?

It may be related to #442 and #443

sffc commented 4 years ago

My personal opinion on this feature request:

We have heard from multiple Temporal stakeholders that exposing this API covers their use case of making a time zone picker. This information is already available via Intl APIs, but less efficiently. In Temporal, where time zones and calendars are first-class objects, one can also imagine use cases expanding beyond only pickers.

If you are building a client-side app and want to let people select their time zone, calendar system, etc., right now you would hard-code an expected list, even if a browser engine is capable of doing more. I think it is better for the JavaScript engine to provide a list of what it can support than making the programmer start with their own list and essentially take the intersection of that list with the browser's list by feature-testing each entry.
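The "intersection by feature-testing" pattern described above can be sketched for calendars as follows. The function names and the hard-coded list are assumptions. The test relies on `resolvedOptions().calendar` echoing back a calendar requested through the `-u-ca-` locale extension when the engine supports it; identifiers that the engine resolves to a different spelling would be reported as unsupported by this simple check.

```javascript
// Hypothetical helper: true if the engine resolves the requested calendar
// to the same identifier.
function isSupportedCalendar(calendar) {
  try {
    const resolved = new Intl.DateTimeFormat(`en-u-ca-${calendar}`)
      .resolvedOptions().calendar;
    return resolved === calendar;
  } catch (err) {
    if (err instanceof RangeError) return false; // malformed locale tag
    throw err;
  }
}

// The application starts from its own list and keeps what the engine has.
function filterSupportedCalendars(expected) {
  return expected.filter(isSupportedCalendar);
}

const hardCoded = ["gregory", "hebrew", "japanese", "notacal"];
console.log(filterSupportedCalendars(hardCoded));
```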

Although we should consider also supporting this in HTML, I still think this proposal has merits in JavaScript. The ecosystem is likely to never reach a point in which all web sites can use only the W3C pickers. Although I like using them in personal projects, I can't remember the last time I've visited a web site that has used a native HTML date picker in production, for example. I would hope that we can at least agree that JavaScript-based pickers are a legitimate use case.

ljharb commented 4 years ago

Also, if the effort of creating the list is acceptable for the majority of devs, the good users, why wouldn’t the minority that are malicious just do the same? Someone will probably make a library for it nigh immediately anyways, so any effort barrier vanishes.

aphillips commented 4 years ago

The thing that has me confused about this thread is that the list of time zones is finite (if rather larger than the logical minimum and unstable to boot) and, ignoring Etc/offset "private-use" values, reasonably well-defined. So it's not exactly "fingerprintable" to get a list of available time zones.

Making a time zone picker is a little more complicated, since many time zones need to be "rolled up" into a representative zone and a bunch of zones are obsolete. Hence all the "metazone" gunk in ICU (or, in my case, a bunch of utility classes).
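One rough way to approximate this "roll up" in userland, without access to ICU's metazone data, is to group zones that share the same localized long name on a given date. This is only a proxy and an assumption on my part, not the mechanism ICU uses internally; the function name, the sample zones, and the fixed reference date are all illustrative.

```javascript
// Groups identifiers by their localized long time zone name on one date.
function groupByDisplayName(zones, locale = "en-US", date = new Date(Date.UTC(2020, 0, 15))) {
  const groups = new Map();
  for (const zone of zones) {
    const parts = new Intl.DateTimeFormat(locale, {
      timeZone: zone,
      timeZoneName: "long",
    }).formatToParts(date);
    const part = parts.find((p) => p.type === "timeZoneName");
    const name = part ? part.value : zone; // fall back to the identifier
    if (!groups.has(name)) groups.set(name, []);
    groups.get(name).push(zone);
  }
  return groups;
}

const sampleZones = [
  "America/New_York", "America/Detroit", "America/Chicago",
  "Europe/Paris", "Europe/Berlin",
];
const rolledUp = groupByDisplayName(sampleZones);
for (const [name, members] of rolledUp) {
  console.log(`${name}: ${members.join(", ")}`);
}
```

Zones that share a name on one date can still differ historically or in their transition rules, so a real picker would need more than this grouping.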

@sffc I agree that this should be supported in HTML--in fact W3C I18N has asked for first-class time zone support in HTML going back a ways and I should probably follow up on that with WHATWG in the near future--but I also agree that JS APIs should provide access also.

sffc commented 4 years ago

Good point about metazones and containment. I guess we should consider exposing that additional information in an enumeration API? A flat list of time zones will get a lot of obsolete junk.

ljharb commented 4 years ago

@sffc https://github.com/tc39/ecma402/issues/435#issuecomment-628166399
