Closed jungshik closed 1 month ago
We went with using the 'backzone' ids in Firefox to avoid the risk of upsetting users in the affected time zones. For example, canonicalizing 'Europe/Ljubljana', 'Europe/Podgorica', 'Europe/Sarajevo', 'Europe/Skopje', and 'Europe/Zagreb' to 'Europe/Belgrade' (which would be the case when not applying 'backzone') may have negative cultural/political effects.
Relevant comments in the Firefox bug tracker: https://bugzilla.mozilla.org/show_bug.cgi?id=1303091#c3, https://bugzilla.mozilla.org/show_bug.cgi?id=1303091#c9, https://bugzilla.mozilla.org/show_bug.cgi?id=1303091#c11
Unfortunately just using CLDR instead of IANA data can also lead to wrong canonicalizations, cf. https://unicode-org.atlassian.net/browse/ICU-12044 and http://unicode.org/cldr/trac/ticket/9892. In the CLDR bug, jungshik has also given an example where CLDR didn't update the mapping despite being outdated since 1993.
And the related (unresolved) bugs.ecmascript.org bug which also mentions the complications between selecting which tz links are safe to apply and which ones are more contentious: https://tc39.github.io/archives/bugzilla/1892/
And more related threads from the tzdata mailing list (mostly from 2013-2014 when many zones were moved to the backzone file): http://mm.icann.org/pipermail/tz/2014-July/021170.html, https://mm.icann.org/pipermail/tz/2013-September/019821.html, https://mm.icann.org/pipermail/tz/2014-November/021888.html.
'Europe/Ljubljana', 'Europe/Podgorica', 'Europe/Sarajevo', 'Europe/Skopje', and 'Europe/Zagreb' to 'Europe/Belgrade'
Thanks a lot for alerting me about those entries and references to TZ mailing list threads on the topic. A similar sentiment may exist about canonicalizing Asia/Phnom_Penh and Asia/Vientiane to Asia/Bangkok.
Unfortunately just using CLDR instead of IANA data can also lead to wrong canonicalizations,
Yup, you're right. I'm aware of the issue because CLDR sticks to pretty old IDs that had been deprecated well before the CLDR project started (Calcutta vs Kolkata, Saigon vs Ho_Chi_Minh, Katmandu vs Kathmandu, and many others).
@yumaoka
@sffc I'm unsure what needs to be done here. Could you tell me what the web reality is? IIUC, Firefox now uses ICU too, but did ICU ever end up taking this into account and start using the backzone file?
@anba What is Firefox doing these days? Is it still necessary to put in the exception to allow backzone to be used for time zone names?
Is there a snippet of code that can reproduce the Firefox/Chrome discrepancy? It appears that Asia/Chongqing and Asia/Shanghai are equivalent in modern times, but may have differed at some time in the past, perhaps before China decided to unify under one time zone.
I wrote the following code in my best attempt to reproduce the difference, but was unsuccessful in finding a difference:
new Date(1945, 0, 1).toLocaleString("en", { timeZone: "Asia/Chongqing", timeZoneName: "long" })
// "1/1/1945, 2:00:00 PM GMT+09:00"
new Date(1945, 0, 1).toLocaleString("en", { timeZone: "Asia/Shanghai", timeZoneName: "long" })
// "1/1/1945, 2:00:00 PM GMT+09:00"
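The comparison above can be widened to several instants at once. The following is a minimal sketch, not code from any browser: the helper name `differingInstants` and the sample instants are assumptions for illustration. The instants carry a trailing `Z` so the result does not depend on the machine's local time zone.

```javascript
// Return the instants (as ISO strings) at which two zones format differently.
// An empty array means the engine's data is identical at the sampled instants.
function differingInstants(zoneA, zoneB, isoInstants) {
  const opts = {
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit", hour12: false,
  };
  const fmtA = new Intl.DateTimeFormat("en", { ...opts, timeZone: zoneA });
  const fmtB = new Intl.DateTimeFormat("en", { ...opts, timeZone: zoneB });
  return isoInstants.filter((iso) => {
    const d = new Date(iso);
    return fmtA.format(d) !== fmtB.format(d);
  });
}

// Hypothetical sample instants; pick whichever years matter for the zones compared.
const samples = ["1850-01-01T00:00:00Z", "1945-01-01T00:00:00Z", "2000-01-01T00:00:00Z"];
console.log(differingInstants("Asia/Chongqing", "Asia/Shanghai", samples));
```

An empty result only says that the engine's data agrees at the sampled instants; it says nothing about which ID the engine treats as canonical.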
IIUC, Firefox now uses ICU too, but did ICU ever end up taking this into account and start using the backzone file?
I think Firefox still uses a rather large override map (to take care of cases mentioned in this issue) on top of ICU. Firefox already used ICU when this issue was filed, btw. :-)
CLDR has a policy on ID stability and it's a bit hard to change that, I'm afraid. Given this, I was thinking of doing in v8 what Firefox does to handle 'Saigon => Ho_Chi_Minh', 'Calcutta => Kolkata', etc., but held off because I wanted it to be resolved in CLDR so that v8 does not need a local override map [1]. My (dim) hope for a possible CLDR change was based on my 'findings' that turned out to be false. See below.
As for using 'backzone' (this issue), it's related but a bit different.
Unfortunately just using CLDR instead of IANA data can also lead to wrong canonicalizations, cf. https://unicode-org.atlassian.net/browse/ICU-12044 and http://unicode.org/cldr/trac/ticket/9892. In the CLDR bug, jungshik has also given an example where CLDR didn't update the mapping despite being outdated since 1993.
And, unfortunately, my claim turned out to be false. I thought 'Asia/Calcutta' had been changed to 'Asia/Kolkata' well before the CLDR project started. In https://unicode-org.atlassian.net/browse/CLDR-9892, @yumaoka dug up the historic IANA timezone files and found that as late as 2008 (well after the CLDR project started) they still had 'Asia/Calcutta' instead of 'Asia/Kolkata'. He suspected that the same was true of 'Saigon vs Ho_Chi_Minh' and 'Katmandu vs Kathmandu'.
[1] To make things complicated, there's a possibility that the override map needs to be duplicated for Chrome OS, which was yet another reason I wanted it to be resolved in CLDR. An alternative of changing the ICU data locally for Chromium was not desirable either, because that'd make the TZ db update process more complicated (although it may not be that bad).
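For readers unfamiliar with the idea, a local override map layered on top of the engine's answer could look roughly like the sketch below. This is an illustration only: the map contents are limited to the three pairs named in this thread, and `OVERRIDES` and `canonicalizeWithOverrides` are hypothetical names, not Firefox's or v8's actual code.

```javascript
// Hypothetical override map: only the three pairs named in this thread.
// A real map (such as Firefox's) is much larger and is not reproduced here.
const OVERRIDES = new Map([
  ["Asia/Calcutta", "Asia/Kolkata"],
  ["Asia/Saigon", "Asia/Ho_Chi_Minh"],
  ["Asia/Katmandu", "Asia/Kathmandu"],
]);

// Ask the engine for its canonical ID, then apply the local override, if any.
function canonicalizeWithOverrides(timeZone) {
  const engineId = new Intl.DateTimeFormat("en", { timeZone }).resolvedOptions().timeZone;
  return OVERRIDES.get(engineId) ?? engineId;
}

// Yields "Asia/Kolkata" whether the engine itself prefers Calcutta or Kolkata.
console.log(canonicalizeWithOverrides("Asia/Calcutta"));
```

The drawback mentioned above is visible here: every product that wants the same answer has to carry and maintain its own copy of the map.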
The repro step is as follows:
new Intl.DateTimeFormat("en", {timeZone:"Asia/Chongqing"}).resolvedOptions().timeZone
"Asia/Chongqing" : Firefox
"Asia/Shanghai" : Chrome
Without underlying zoneinfo files supporting the historical difference between Asia/Chongqing and Asia/Shanghai, I think it's all but pointless to treat them as separate zones.
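The same repro can be run over more of the IDs discussed in this thread. The sketch below is only a convenience wrapper (the name `canonicalIds` is an assumption); results depend on the engine and its version, so no expected values are shown.

```javascript
// Report how the current engine canonicalizes each ID.
function canonicalIds(ids) {
  const result = {};
  for (const id of ids) {
    try {
      result[id] = new Intl.DateTimeFormat("en", { timeZone: id }).resolvedOptions().timeZone;
    } catch (e) {
      // A RangeError means the engine rejects this ID outright.
      result[id] = `rejected (${e.name})`;
    }
  }
  return result;
}

console.log(canonicalIds([
  "Asia/Chongqing", "Asia/Chungking", "Asia/Shanghai",
  "Asia/Phnom_Penh", "Asia/Vientiane", "Asia/Bangkok",
  "Europe/Sarajevo", "Europe/Belgrade",
]));
```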
Below is what Firefox does with my computer timezone set to America/Los_Angeles. Note that Asia/Chongqing and Asia/Shanghai had different local mean times (they have different longitudes), but the result is the same. The same holds for Asia/Bangkok vs Asia/Phnom_Penh.
new Date(1850,0,1).getTimezoneOffset()
472.96666666666664. # In 1850, LMT was used everywhere including America/Los_Angeles
new Date(1850,0,1).toLocaleString("en")
"1/1/1850, 12:00:00 AM"
new Date(1850,0,1).toLocaleString("en", {timeZone: "UTC"})
"1/1/1850, 7:52:58 AM"
new Date(1850,0,1).toLocaleString("en", {timeZone: "Asia/Shanghai"})
"1/1/1850, 3:58:41 PM"
new Date(1850,0,1).toLocaleString("en", {timeZone: "Asia/Chongqing"})
"1/1/1850, 3:58:41 PM"
new Date(1850,0,1).toLocaleString("en", {timeZone: "Asia/Phnom_Penh"})
"1/1/1850, 2:35:02 PM"
new Date(1850,0,1).toLocaleString("en", {timeZone: "Asia/Bangkok"})
"1/1/1850, 2:35:02 PM"
There are multiple issues, some overlapping, which lead to differences between browsers when handling time zones:
Let's start with accepted time zone strings, because any difference here may have side-effects later on.
ACT or previous IANA names like Canada/East-Saskatchewan). Also disallows SystemV time zones, which are disabled by default in tzdata.
Canada/East-Saskatchewan, even though that one is no longer valid per IANA (but still valid in CLDR!). Recently an extra mapping was added to handle more cases. The parser also rejects SystemV time zones, but it's not clear to me if that's intentional or just a happy coincidence.
SystemV time zones.
Canonicalisation differences between IANA and CLDR for same time zones:
Now let's go over to the backzone file. First, as a quick reminder, ICU doesn't contain any data for backzone time zones!
backzone time zones which ICU claims to support, because they're CLDR time zones:
js> var date = new Date("1800-01-01T00:00:00Z")
js> var dtf = new Intl.DateTimeFormat("en", {timeZone:"Europe/Belgrade", hour:"2-digit", minute:"2-digit"})
js> dtf.format(date)
"1:22 AM"
js> dtf.resolvedOptions().timeZone
"Europe/Belgrade"
js> var dtf = new Intl.DateTimeFormat("en", {timeZone:"Europe/Sarajevo", hour:"2-digit", minute:"2-digit"})
js> dtf.format(date)
"1:22 AM"
js> dtf.resolvedOptions().timeZone
"Europe/Sarajevo"
Rules for "Europe/Belgrade" and "Europe/Sarajevo"
# Zone NAME STDOFF RULES FORMAT [UNTIL]
Zone Europe/Belgrade 1:22:00 - LMT 1884
Zone Europe/Sarajevo 1:13:40 - LMT 1884
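To make the mismatch concrete, the two STDOFF values above can be turned into wall-clock times for the instant used in the session. This is a small arithmetic sketch; `stdoffToSeconds` and `wallClock` are hypothetical helpers, and only the non-negative h:mm:ss form shown in these two rules is handled.

```javascript
// Convert a tzdata STDOFF value such as "1:22:00" to seconds east of UTC.
function stdoffToSeconds(stdoff) {
  const [h, m = 0, s = 0] = stdoff.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Wall-clock time at a UTC instant for a fixed offset, as an ISO-like string.
function wallClock(isoUtc, offsetSeconds) {
  const shifted = new Date(new Date(isoUtc).getTime() + offsetSeconds * 1000);
  return shifted.toISOString().slice(0, 19);
}

const belgrade = stdoffToSeconds("1:22:00"); // 4920
const sarajevo = stdoffToSeconds("1:13:40"); // 4420
console.log(wallClock("1800-01-01T00:00:00Z", belgrade)); // "1800-01-01T01:22:00"
console.log(wallClock("1800-01-01T00:00:00Z", sarajevo)); // "1800-01-01T01:13:40"
console.log(belgrade - sarajevo); // 500 seconds, i.e. 8 minutes 20 seconds
```

Under the backzone rule, Europe/Sarajevo would therefore format as 1:13 AM rather than the 1:22 AM shown in the session above.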
CLDR lists "Europe/Sarajevo" as a time zone, not a link:
<type name="basjj" description="Sarajevo, Bosnia and Herzegovina" alias="Europe/Sarajevo"/>
backzone zones which are links in CLDR.
backzone zones which are different links in CLDR.
backzone
backzone.
@anba What is Firefox doing these days? Is it still necessary to put in the exception to allow backzone to be used for time zone names?
We're basically still in the same position as when we originally implemented these overrides for backzone. Objectively speaking, we're returning the wrong data for pre-1970 time stamps for backzone zones, but are reluctant to canonicalise to the zones whose data is used, for the reasons outlined in https://github.com/tc39/ecma402/issues/272#issuecomment-423928522.
@anba, thank you for the summary as well as the reminder about 'Z' in Date ctor that I forgot.
I also forgot what I wrote about {Asia/Phnom_Penh and Asia/Vientiane} vs Asia/Bangkok. They have the same issue as Europe/Sarajevo and Europe/Belgrade. That is, the third issue in @anba's comment.
The parser also rejects SystemV time zones, but it's not clear to me if that's intentional or just a happy coincidence.
That's an intentional omission. The V8 tzname parser rejects everything that is not explicitly handled. SystemV zone name handling is omitted on purpose because it's disallowed.
FYI, @FrankYFTang
@justingrant Thoughts on this issue?
AFAIK, the current plan is for CLDR and ICU to resolve the issues discussed in this thread:
iana attribute has been added to https://github.com/unicode-org/cldr/blob/main/common/bcp47/timezone.xml, which (after https://unicode-org.atlassian.net/browse/ICU-22452 is implemented soon) will allow ICU clients to fetch the latest canonical ID for IDs like Asia/Calcutta (canonical ID is Asia/Kolkata) and Europe/Kiev (canonical ID is Europe/Kyiv).
backzone, and which should be very close (modulo a few corner cases and judgement calls) to the results you'd get when building TZDB with PACKRATDATA=backzone PACKRATLIST=zone.tab.
Once this CLDR and ICU work is completed and released, we have a choice to make:
A while ago I filed #825 to encourage a decision on the choice of (A), (B), or (C).
Unless there are objections, I think that this issue can be closed as a dupe of that one?
Does #877 fix this issue?
Does #877 fix this issue?
Yes, it resolves the questions raised by this issue.
Closing as resolved by #877
The current spec keeps referring to 'Zone and Link names', but that's not sufficient and leads to a divergence between implementations.
The main question is whether or not to take into account the 'backzone' file in the IANA timezone database.
Firefox uses zone and link names in the 'backzone' file of the IANA tz db, but some links in the 'backzone' file contradict what's in other files.
backward file has the following:
backzone file has the following:
Note that backzone file has the following comment at the top:
Because Firefox takes into account the 'backzone' file, 'Asia/Chungking' is canonicalized to 'Asia/Chongqing' instead of 'Asia/Shanghai'.
ICU/CLDR (as used by v8) ignores the 'backzone' file, and both 'Asia/Chungking' and 'Asia/Chongqing' are canonicalized to 'Asia/Shanghai' per the 'backward' file.
CLDR/ICU, however, do not canonicalize 'Asia/Phnom_Penh' and 'Asia/Vientiane' to 'Asia/Bangkok' despite the following in 'asia' file:
That's because the IANA timezone DB relegated the two zone names to links rather recently (2014-2015) and CLDR/ICU do not want to destabilize the tz ID space. So, they kept them as canonical zone IDs.