This relies on the infrastructure from ECMA-402 to give sensible answers about language support even in the presence of many subtags.
@aphillips, this is what I came up with after consulting with @sffc.
It has a couple of implementation-defined parts, namely the use of LookupMatchingLocaleByBestFit, and a similar operation when deciding how to allocate "base" languages between more-specific variants. (See the example given for Chinese.) Maybe the latter could be rephrased to use LookupMatchingLocaleByBestFit, to reduce this? Thoughts welcome.
My understanding is that this implementation-definedness is largely a function of everyone relying on ICU, which is not specified but which we've all kind of agreed to be fine with.
This doesn't fully solve the "language arcs" problem discussed in https://github.com/WICG/translation-api/issues/11 in the context of translation. (And, I wouldn't want to close that issue until we have a full spec for translation anyway.) It's only for the summarizer API so far, which has the simpler question "is this single language supported?" The path to language arcs shouldn't be so hard from here, though.
The end result seems to be pretty reasonable. In particular, it should match ECMA-402 APIs. Since ECMA-402 allows me to do things like `new Intl.Collator(["en-US-Braille-x-pirate"])` and get a resolved locale of `en-US`, or `"ja-Bopo-BR"` and get a resolved locale of `ja`, the proposal is that our AI APIs will do the same.
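To make the ECMA-402 comparison concrete, here is a minimal sketch for checking what a given engine resolves these tags to. `resolvedLocaleFor` is a hypothetical helper name used only for illustration, not part of ECMA-402 or the summarizer API, and since the matching is implementation-defined, the exact results can differ between engines and ICU builds.

```javascript
// Hypothetical helper (an assumption for illustration): ask Intl.Collator
// which locale it resolves a requested language tag to.
function resolvedLocaleFor(tag) {
  // "best fit" is the default matcher; its behavior is implementation-defined.
  const collator = new Intl.Collator([tag], { localeMatcher: "best fit" });
  return collator.resolvedOptions().locale;
}

// Print the resolved locale for the two tags discussed above.
for (const tag of ["en-US-Braille-x-pirate", "ja-Bopo-BR"]) {
  console.log(`${tag} -> ${resolvedLocaleFor(tag)}`);
}
```

The idea is that the AI APIs would answer "is this language supported?" using the same kind of resolution, so a tag with extra subtags falls back to whatever less-specific locale is actually available.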
I'm going to merge this for now as I am doing some other spec restructuring and I want to put it on top of this. Regardless, any review or help is appreciated, even after merging.