Open hsivonen opened 1 year ago
Maybe `hourCycle` is available via `-u-hc-`? That still leaves `hour12` (which happens to be what appears in various preference UIs).
`hourCycle` is in `icu_datetime::options::components::Bag`. `-u-hc-` works.
I don't think we have `hour12`, except that you can mostly get that via `-u-hc-`, but you need to choose between h11/h12 or h23/h24 instead of letting the locale choose.
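As a sketch of the workaround described here, a caller can pin the cycle by appending the `hc` keyword to the locale identifier before constructing the formatter. The `HourCycle` enum and the `with_hour_cycle` helper below are hypothetical names (not ICU4X API). The helper only builds the string, and it assumes the identifier has no existing `-u-` extension or private-use section.

```rust
/// The four values of the `hc` Unicode extension keyword.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HourCycle {
    H11,
    H12,
    H23,
    H24,
}

impl HourCycle {
    /// Keyword value as it appears after `-u-hc-`.
    pub fn as_str(self) -> &'static str {
        match self {
            HourCycle::H11 => "h11",
            HourCycle::H12 => "h12",
            HourCycle::H23 => "h23",
            HourCycle::H24 => "h24",
        }
    }
}

/// Hypothetical helper: appends `-u-hc-<value>` to a locale identifier.
/// Returns `None` when the identifier already has a `-u-` extension or a
/// private-use (`-x-`) section, because merging is out of scope for this sketch.
pub fn with_hour_cycle(locale: &str, hc: HourCycle) -> Option<String> {
    let lower = locale.to_ascii_lowercase();
    if lower.contains("-u-") || lower.contains("-x-") {
        return None;
    }
    Some(format!("{}-u-hc-{}", locale, hc.as_str()))
}
```

Note that the caller still has to pick `H23` or `H24` (or `H11` or `H12`) itself, which is the limitation described above.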
Letting the locale affect the expansion of `hour12` to `hourCycle` seems required for ECMA-402 compliance.
AFAICT, by the time the `[Date]TimeFormatter` constructor has done the provider-level resolution of the locale data, there doesn't appear to be a way for application code to modify the instance and perform the `hour12` resolution itself.
macOS Ventura and GNOME expose a boolean system pref for this topic, so supporting `hour12` might be relevant for honoring system preferences. I don't have test code that I'd know would behave according to the system boolean pref semantics. However, from reading the ECMA-402 spec, I'm a bit surprised at how the ECMA-402 semantics are supposed to work. AFAICT, `hour12=false` for `en-US` would resolve to `h24`. Do users actually want that result instead of `h23`? (As a user of the en-US locale for untranslated strings with 24-hour clock enabled, I don't want `h24`. I guess I'll need to make a TODO item of observing the system clock at midnight.)
Looking at https://github.com/unicode-org/cldr-json/blob/80a94b0f6c3a34d6e2dc0dca8639a54babc87f94/cldr-json/cldr-core/supplemental/timeData.json#L4 , I observe:
1. The preferred cycle is `h` or `H`, i.e. `h12` or `h23`.
2. `k` and `K` do not appear in allowed cycles. `hb` and `hB` do, but I don't find a spec explaining what they mean.

Given that the preferred cycle for each locale is either `h12` or `h23`, it's unclear to me what problem `h11` and `h24`, or `hour12` expanding to `h11` or `h24`, solve.
> However, from reading the ECMA-402 spec, I'm a bit surprised at how the ECMA-402 semantics are supposed to work.

The current spec is incorrect. There's a PR to fix this → https://github.com/tc39/ecma402/pull/758.
Thanks. Given that change and the non-existence of `k`-default and `K`-default locales, I guess one option would be to have ICU4X ECMA-402 wrapper code hard-code `hour12=true` to `h12` and `hour12=false` to `h23`, and close this API request as WONTFIX.
> 2. `k` and `K` do not appear in allowed cycles. `hb` and `hB` do, but I don't find a spec explaining what they mean.

`b` and `B` are day period markers: https://unicode.org/reports/tr35/tr35-dates.html#dfst-period
> `k` and `K` do not appear in allowed cycles. `hb` and `hB` do, but I don't find a spec explaining what they mean.

Oops. I missed that. So: `K` is allowed in one locale but isn't the default anywhere, and `k` is specced for completeness and isn't in use anywhere?
> `b` and `B` are day period markers: https://unicode.org/reports/tr35/tr35-dates.html#dfst-period

Thanks.
> Oops. I missed that. So: `K` is allowed in one locale but isn't the default anywhere, and `k` is specced for completeness and isn't in use anywhere?

Yes. `K` isn't the default hour-cycle for Japan per `<timeData>/<hours>`, but when selecting `{hour: "numeric", hour12: true}`, the resolved pattern will contain `K`, see here. That also means it's not possible to replace `hour12=true` with `hourCycle=h12`.
For example, `new Intl.DateTimeFormat("en", {hour:"numeric"})` can be customised as follows:
Options | Skeleton | Resolved Pattern | Final Pattern |
---|---|---|---|
{hour:"numeric"} | j | h a | h a |
{hour:"numeric", hour12: true} | h | h a | h a |
{hour:"numeric", hour12: false} | H | HH | HH |
{hour:"numeric", hourCycle: "h11"} | h | h a | K a |
{hour:"numeric", hourCycle: "h12"} | h | h a | h a |
{hour:"numeric", hourCycle: "h23"} | H | HH | HH |
{hour:"numeric", hourCycle: "h24"} | H | HH | kk |
And `new Intl.DateTimeFormat("ja", {hour:"numeric"})` can be customised as follows:
Options | Skeleton | Resolved Pattern | Final Pattern |
---|---|---|---|
{hour:"numeric"} | j | H時 | H時 |
{hour:"numeric", hour12: true} | h | aK時 | aK時 |
{hour:"numeric", hour12: false} | H | H時 | H時 |
{hour:"numeric", hourCycle: "h11"} | h | aK時 | aK時 |
{hour:"numeric", hourCycle: "h12"} | h | aK時 | ah時 |
{hour:"numeric", hourCycle: "h23"} | H | H時 | H時 |
{hour:"numeric", hourCycle: "h24"} | H | H時 | k時 |
In an input skeleton, `h` is automatically matched to either `h` or `K` in the resolved pattern. Similarly, `H` is matched to either `H` or `k`.
Spec:
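The step from "Resolved Pattern" to "Final Pattern" in the tables above can be sketched as a symbol substitution: once the skeleton has picked a 12-hour or 24-hour pattern, an explicit `hourCycle` replaces the hour symbol. This is a minimal Rust sketch of that idea only. `HourCycle` and `apply_hour_cycle` are hypothetical names, and the apostrophe handling is a simplified reading of pattern quoting, not ICU code.

```rust
/// The four ECMA-402 hour-cycle values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HourCycle {
    H11,
    H12,
    H23,
    H24,
}

/// Hypothetical helper: rewrites every hour symbol (`h`, `H`, `k`, `K`) in a
/// resolved pattern to the symbol of the requested hour cycle. Text between
/// apostrophes is treated as a literal and left untouched.
pub fn apply_hour_cycle(resolved: &str, hc: HourCycle) -> String {
    let target = match hc {
        HourCycle::H11 => 'K',
        HourCycle::H12 => 'h',
        HourCycle::H23 => 'H',
        HourCycle::H24 => 'k',
    };
    let mut in_literal = false;
    resolved
        .chars()
        .map(|c| {
            if c == '\'' {
                // Toggle literal mode on every apostrophe.
                in_literal = !in_literal;
                c
            } else if !in_literal && matches!(c, 'h' | 'H' | 'k' | 'K') {
                target
            } else {
                c
            }
        })
        .collect()
}
```

With the rows from the tables above, `apply_hour_cycle("h a", HourCycle::H11)` gives `K a` and `apply_hour_cycle("aK時", HourCycle::H12)` gives `ah時`.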
The `allowed` strings in `<timeData>/<hours>` are mostly relevant for the `C` skeleton, so they're not yet relevant to ECMA-402 date-time formatting. (Spec: https://unicode.org/reports/tr35/tr35-dates.html#availableFormats_appendItems)
They're possibly relevant for the stage-3 "Intl Locale Info" proposal. There's a `HourCyclesOfLocale` operation, which is spec'ed to return the hour-cycle formats which are in "common use for date and time formatting". So this operation could return the `allowed` values from `<timeData>/<hours>`.
ICU4C doesn't have a public API to retrieve the `allowed` values, though. Instead it's necessary to manually read the resource data, cf. `DateTimeFormat::GetAllowedHourCycles`.
> Yes. `K` isn't the default hour-cycle for Japan per `<timeData>/<hours>`, but when selecting `{hour: "numeric", hour12: true}`, the resolved pattern will contain `K`, see here. That also means it's not possible to replace `hour12=true` with `hourCycle=h12`.

Thanks. So ICU4X is currently missing a way to handle `hour12` in a data-driven way.
Just so that I understand the feasibility of hard-coded special cases if this issue isn't addressed in ICU4X itself: It would be possible for ECMA-402 implementation glue code to get correct results (within the scope of what's known about what is in CLDR) by expanding the boolean `hour12` and the boolean "region is JP" to `hourCycle`, right?
That is:
if hour12 {
    if region_of_locale_is_JP {
        h11
    } else {
        h12
    }
} else {
    h23
}
> They're possibly relevant for the stage-3 "Intl Locale Info" proposal. There's a `HourCyclesOfLocale` operation, which is spec'ed to return the hour-cycle formats which are in "common use for date and time formatting". So this operation could return the `allowed` values from `<timeData>/<hours>`.

The rendered spec that you linked to has `HourCyclesOfLocale`, but the README claims "Hour Cycle DROPPED by Champion". @FrankYFTang, is the current intention to include or exclude `HourCyclesOfLocale`?
> ICU4C doesn't have a public API to retrieve the `allowed` values, though. Instead it's necessary to manually read the resource data, cf. `DateTimeFormat::GetAllowedHourCycles`.

I don't see any non-test callers for that method. What am I missing?
> Just so that I understand the feasibility of hard-coded special cases if this issue isn't addressed in ICU4X itself: It would be possible for ECMA-402 implementation glue code to get correct results (with the scope of what's known about what is in CLDR) by expanding the boolean `hour12` and the boolean "region is JP" to `hourCycle`, right?

It needs to be hard-coded on the language, not the region, because the date-time patterns are in https://github.com/unicode-org/cldr/blob/main/common/main/ja.xml.
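A minimal Rust sketch of that language-keyed variant, assuming (as discussed in this thread) that `ja` is the only language whose 12-hour patterns use `K`. `HourCycle` and `hour_cycle_for_hour12` are hypothetical names for glue code, not ICU4X API, and the `language` argument is assumed to be the already-extracted primary language subtag.

```rust
/// The four ECMA-402 hour-cycle values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HourCycle {
    H11,
    H12,
    H23,
    H24,
}

/// Hypothetical glue-code helper: expands the boolean `hour12` option to an
/// hour cycle, special-casing the language (not the region).
/// Assumption: only `ja` resolves `hour12 = true` to a `K` (h11) pattern.
pub fn hour_cycle_for_hour12(language: &str, hour12: bool) -> HourCycle {
    match (hour12, language) {
        (true, "ja") => HourCycle::H11,
        (true, _) => HourCycle::H12,
        (false, _) => HourCycle::H23,
    }
}
```

The hard-coded list would have to be revisited whenever CLDR patterns change, which is the main drawback compared with a data-driven option.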
> I don't see any non-test callers for that method. What am I missing?

Only the parts relevant for the "Unified Intl API" work (bug 1686965) have been committed in bug 1693576. The rest will be put up for review when the open issues in the proposal have been resolved.
There is some interesting code to handle some of this resolution logic in `components/datetime/src/pattern/hour_cycle.rs`.
I also observe that we already have the preferred hour cycle (h11h12 or h23h24) in ICU4X data: https://github.com/unicode-org/icu4x/blob/main/provider/datagen/tests/data/json/datetime/timelengths%401/en.json
So I think everything is here to support `hour12` if we were to add it to an options bag somewhere.
Just to make this clearer for those playing along here: Japan is the only country whose data allows the `K` value for hours.
<hours preferred="H" allowed="H K h" regions="JP"/>
The options are `h`, `H`, `K`, and `k`, and are defined here:
https://unicode.org/reports/tr35/tr35-dates.html#dfst-hour
Currently the ECMA spec incorrectly assumes a coupling of `h`-`k` and `H`-`K`. That is, the following is baked in as an implicit assumption:
- `0-11`/`00-11` (`K`) will present twenty-four-hour time as `00-23` (`H`)
- `1-12`/`01-12` (`h`) will present twenty-four-hour time as `01-24` (`k`)

The ECMA standard definitely needs to change, as the current behaviour is a bug. The universal (as far as I've been able to determine) rejection of `k`, and the only occasional adoption of `K` as an option, renders the above assumption absolutely incorrect, and realistically this should have been identified prior to publication. https://github.com/tc39/ecma402/pull/758 has identified a solution that expands how 12-hour and 24-hour time is presented at a regional level. Work is ongoing to get this to a point of acceptance. This is slated for the 2023-09 TC39 meeting.
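To make the coupling concrete, here is a minimal Rust sketch of the behaviour as it is described in this thread: the locale's default cycle decides whether `hour12` expands to the zero-based pair (`h11`/`h23`) or the one-based pair (`h12`/`h24`). `HourCycle` and `resolve_hour12_coupled` are hypothetical names; this is an illustration of the assumption, not normative spec text or ICU4X API.

```rust
/// The four ECMA-402 hour-cycle values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HourCycle {
    H11,
    H12,
    H23,
    H24,
}

/// Hypothetical helper modelling the coupled resolution described above:
/// a zero-based locale default (h11 or h23) yields h11/h23, while a
/// one-based locale default (h12 or h24) yields h12/h24.
pub fn resolve_hour12_coupled(locale_default: HourCycle, hour12: bool) -> HourCycle {
    let zero_based = matches!(locale_default, HourCycle::H11 | HourCycle::H23);
    match (hour12, zero_based) {
        (true, true) => HourCycle::H11,
        (true, false) => HourCycle::H12,
        (false, true) => HourCycle::H23,
        (false, false) => HourCycle::H24,
    }
}
```

Under this model a locale whose default is `h12`, such as `en-US`, resolves `hour12=false` to `h24` rather than `h23`, which is the surprising result raised earlier in the thread.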
ECMA-402 has `hour12` and `hourCycle` overrides for the locale's hour cycle. AFAICT, ICU4X `[Date]TimeFormatter` constructors don't have API surface for this override. For ECMA-402 compat, ICU4X should have API surface for this.