Open nnmrts opened 5 months ago
@zbraniecki Thoughts on this?
TG2 discussion: https://github.com/tc39/ecma402/blob/main/meetings/notes-2024-11-25.md#why-is-there-no-intllocaleprototypevariants-900
There were questions about motivation (most use cases for variants are better served by a corresponding Unicode extension keyword), as well as the shape of this getter (does it return a list? is the list sorted? or does it return a string with multiple subtags?)
I see, thanks for having the discussion!
Since when is the variants part in CLDR "deprecated" though?
I sadly can't remember my exact use-case and should have included it in my original post, but I think it was about two things:
- `1901`, `1996`
- `pehoeji`

The latter got added to the IANA language subtag registry in March this year. I know that isn't CLDR, but I was under the impression that this file is the "source of truth" for registered language subtags used in CLDR and everything else.
I also don't see any kind of "deprecation" of variants here: https://www.unicode.org/reports/tr35
Regarding the type of an eventual `variants`, I don't see any issue with using an array or even a set here.
I don't know what "The Japanese one from one to two, it's complicated" is referencing, and I don't see the difficulty of parsing variants. They can only ever be alphanumeric strings of 5-8 characters (or 4 characters starting with a digit, like `1901`), and they can only be followed by extensions and private use tags, so what's wrong with `.split("-")`? 😆
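To make the `.split("-")` idea concrete, here is a minimal sketch. `getVariants` is a hypothetical helper, not part of `Intl`, and it assumes the input is already a well-formed language tag (no grandfathered or private-use-only tags). It keeps the variants in the order they were supplied.

```javascript
// Hypothetical helper (not an Intl API): collects variant subtags in supplied order.
// Assumes `tag` is a well-formed BCP 47 language tag.
function getVariants(tag) {
  const subtags = tag.toLowerCase().split("-");
  const variants = [];
  // Index 0 is the language subtag, so start at index 1.
  for (const subtag of subtags.slice(1)) {
    // A single-character subtag starts an extension ("u", "t", ...) or
    // private use ("x"); nothing after it can be a variant.
    if (subtag.length === 1) break;
    // Variant syntax: 5-8 alphanumerics, or 4 characters starting with a digit.
    if (/^(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})$/.test(subtag)) {
      variants.push(subtag);
    }
  }
  return variants;
}

console.log(getVariants("sl-IT-rozaj-biske-1994")); // ["rozaj", "biske", "1994"]
console.log(getVariants("de-1996-u-co-phonebk")); // ["1996"]
```

The early `break` matters: without it, an extension value such as `phonebk` would be mistaken for a variant.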
While we're at it, I don't see a reason why `extensions` and `private-uses` also aren't getters, but I guess that's a different story.
A little more context: I think people on the call were referring to variants as "legacy" or "deprecated" because of the following issues:
- `valencia` is better as `-u-sd-esvc` (see https://www.unicode.org/cldr/charts/44/supplemental/territory_subdivisions.html#esvc)
- `pinyin` is better as `-t-i0-pinyin` (see https://github.com/unicode-org/cldr/blob/33a95a266905f494cc7a912749024f2dbb989de8/common/bcp47/transform_ime.xml#L16C16-L16C22)
- `"sl-IT-rozaj-biske-1994"` would be canonicalized to something like `"sl-IT-1994-biske-rozaj"` even though the IANA subtag registry says it should be `"rozaj-biske"`
In other words, the comments from the discussion were based on the point of view that variants are basically a grab bag of things that would be better expressed as more specific locale extensions.
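The ordering point can be checked in whatever engine is at hand. This is only a sketch: the variable names are made up for illustration, the `slice(2)` shortcut works only because this particular tag has exactly a language and a region before its variants, and the result depends on which ECMA-402 and CLDR versions the engine implements, so no output is shown.

```javascript
const supplied = "sl-IT-rozaj-biske-1994";

// Intl.Locale canonicalizes the tag; toString() returns the canonical form.
const canonical = new Intl.Locale(supplied).toString();

// For this specific tag, everything after language and region is a variant.
const tail = (tag) => tag.toLowerCase().split("-").slice(2);

const suppliedVariants = tail(supplied);
const canonicalVariants = tail(canonical);

// True if the engine kept the supplied order, false if it reordered the variants.
const orderPreserved = suppliedVariants.join("-") === canonicalVariants.join("-");

console.log({ canonical, orderPreserved });
```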
Personally, I still think variants are motivated because they remain the standard way of expressing orthographies. Something like `el-polyton` is a good, modern language identifier that I don't believe has another representation with extension keywords.
> Regarding the type of an eventual `variants`, I don't see any issue with using an array or even a set here.
Returning an ECMAScript `Set` is an interesting proposition since it avoids the point of contention on whose ordering to use (IANA's or Unicode's).
> Returning an ECMAScript `Set` is an interesting proposition since it avoids the point of contention on whose ordering to use (IANA's or Unicode's).
I don't think it would, because ECMAScript Set instances are deterministically ordered.
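A small illustration of that point, using example variant values from earlier in the thread: a `Set` iterates in insertion order, so two sets with the same members still expose whichever order they were built in.

```javascript
// Same members, different insertion order.
const ianaOrder = new Set(["rozaj", "biske", "1994"]);
const sortedOrder = new Set(["1994", "biske", "rozaj"]);

// Iteration follows insertion order, so the ordering question does not go away.
console.log([...ianaOrder]); // ["rozaj", "biske", "1994"]
console.log([...sortedOrder]); // ["1994", "biske", "rozaj"]
```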
Yeah, I mainly suggested the `Set` because it follows the other rule of variants (or any "multi-tag"): uniqueness. But honestly, for most users it would probably be unexpected to get a `Set` here in comparison to the rest of JS.
Regarding the ordering of variants: I don't really think the array or set needs to be ordered in any specific way other than "the same as supplied". Both CLDR and IANA, if I understand correctly, just define a recommended or canonical way to order them in the context of language subtags, not in the context of JavaScript arrays. And AFAIK implementers need to be able to understand any ordering.
One could even argue that it's more expected if the ordering is the same as the user specified it, even if it's "wrong".
So in general, the ordering, of all things, shouldn't be the blocker here.
Why is there no `Intl.Locale.prototype.variants`? There are getters for `language`, `region` and `script`, but I saw no information about the reason `variants` is missing or shouldn't be there as well.
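For reference, a short sketch of the existing getters on an example tag. Whether a `variants` getter exists depends on the engine and its ECMA-402 version, so the snippet only feature-detects it and does not assume a return shape.

```javascript
const locale = new Intl.Locale("sl-Latn-IT-rozaj");

console.log(locale.language); // "sl"
console.log(locale.script); // "Latn"
console.log(locale.region); // "IT"

// Feature detection only; the getter may or may not exist in a given engine.
const hasVariantsGetter = "variants" in Intl.Locale.prototype;
console.log(hasVariantsGetter);
```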