Open carlosjeurissen opened 2 years ago
I support Chrome's behavior, which seems more reasonable. Usually, developers want a language-region locale to fall back to the language locale first, then to the default locale.
If browsers want to support multiple behaviors at the same time, I recommend adding a new property to the third parameter (options) of this API.
@hanguokai generally speaking this can be useful to reduce the overall package size.
However, there are cases where the fallback might not be welcome. Say your zh/messages.json is in Simplified Chinese script, and zh_TW is in Traditional Chinese. Would it be great if a different script were used as the fallback? The same can happen with other languages that have multiple scripts, like Serbian (Latin and Cyrillic).
I know the difference. It is still better than the default (a completely different language, such as English). For example:
en: "Software"
zh: "软件"
zh_TW: "軟體"
The difference between "软件" and "軟體" is smaller than the difference between either of them and the English.
For the best user experience, developers need to supply a full (1:1) message map when the translations differ. Messages can be omitted only when they are identical or the fallback is acceptable.
@hanguokai relying on good developer behaviour can be tricky. I can imagine there are Chinese people who know only English and either Traditional or Simplified Chinese. I could be wrong?
Do you know how many people only understand Chinese (zh-CN and/or zh-TW) but not English? Of course, there are real examples of every situation (combination).
I said in my previous post:
If browsers want to support multiple behaviors at the same time, I recommend adding a new property to the third parameter (options) of this API.
There are multiple possible strategies. Another possible fallback strategy is to follow the navigator.languages order. For example, if navigator.languages is ['zh-TW', 'en'], then the search order is zh-TW -> en -> extension default locale.
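As a rough illustration of the navigator.languages strategy described above, the search order could be built like this. This is only a sketch: `searchOrder` is a hypothetical helper, not part of any browser API, and it assumes that `_locales` folder names use underscores where language tags use hyphens.

```javascript
// Hypothetical helper (not a browser API): builds the lookup order
// from the user's preferred languages, ending with the default locale.
function searchOrder(preferredLanguages, defaultLocale) {
  const order = [];
  for (const tag of [...preferredLanguages, defaultLocale]) {
    // Assumption: _locales folder names use "_" instead of "-".
    const folder = tag.replace(/-/g, "_");
    // Skip duplicates, e.g. when the default locale is already listed.
    if (!order.includes(folder)) order.push(folder);
  }
  return order;
}

// In an extension, the first argument would be navigator.languages.
console.log(searchOrder(["zh-TW", "en"], "fr")); // [ 'zh_TW', 'en', 'fr' ]
```

Note that this strategy performs no subtag removal on its own: zh would only be searched if the user listed it.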
I believe Safari matches Chrome here after looking at the code.
Reached out to the ltli w3c group here: https://github.com/w3c/ltli/issues/35.
Safari currently matches the behaviour of Chrome. If the discussion above concludes that this is the preferred process, Firefox will follow.
Quick update: @aphillips mentioned two potential fallback algorithms. One is a simple progressive removal of subtags. The other is the more advanced algorithm from Unicode's CLDR, as used in ICU. See: https://github.com/w3c/ltli/issues/35#issuecomment-1295168890
@xeenon @oliverdunk Do you know which algorithm is used in Safari and Chrome? Based on this we can figure out which algorithm should be used in Firefox, considering that Firefox currently has no fallback algorithm (except falling back to the default_locale).
@carlosjeurissen Safari removes subtags, which we coded to match Chrome.
I had a brief look through the code and Chrome appears to remove subtags as @xeenon suspected 👍
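For readers unfamiliar with the term, progressive removal of subtags can be sketched as follows. This is a simplified illustration, not the actual Chrome or Safari code: `truncationChain` is a hypothetical name, and the sketch ignores details such as the special handling of single-character subtags in formal lookup algorithms.

```javascript
// Hypothetical helper: lists a language tag followed by each
// progressively shorter tag obtained by dropping the last subtag.
function truncationChain(tag) {
  const subtags = tag.split(/[-_]/); // accept "zh-Hant-TW" or "zh_Hant_TW"
  const chain = [];
  while (subtags.length > 0) {
    chain.push(subtags.join("-"));
    subtags.pop(); // remove the most specific remaining subtag
  }
  return chain;
}

console.log(truncationChain("zh-Hant-TW")); // [ 'zh-Hant-TW', 'zh-Hant', 'zh' ]
console.log(truncationChain("pt_BR"));      // [ 'pt-BR', 'pt' ]
```

The extension's default_locale would then be appended as the final step of the chain.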
Some of us (@dotproto, @Rob--W, @oliverdunk, @carlosjeurissen) met with the I18n group (@aphillips, @eemeli and others) and discussed the topic of whether to fall back (partial minutes). Chrome and Safari already have the same behavior of falling back from specific language tags to less specific ones, ultimately to default_locale. Firefox is supportive of implementing the same, and there was already a feature request at https://bugzilla.mozilla.org/show_bug.cgi?id=1381580.
Arguments in favor of the multiple fallback include the ability to have smaller messages.json files, e.g. a generic English file plus small en-US and en-GB specific files.
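To make the file-size argument concrete, here is a small sketch of how a region-specific file only needs to carry the messages that differ. The message names and texts are made up for illustration, and `effectiveCatalog` is a hypothetical helper that models the effect of the fallback, not how any browser implements it.

```javascript
// Hypothetical contents of _locales/en/messages.json (generic English).
const en = {
  greeting: { message: "Hello" },
  favoriteColor: { message: "Favorite color" },
};

// Hypothetical contents of _locales/en_GB/messages.json:
// only the message that differs from generic English.
const enGB = {
  favoriteColor: { message: "Favourite colour" },
};

// Models the result of the fallback: later (more specific) catalogs
// override earlier (more generic) ones, message by message.
function effectiveCatalog(...catalogs) {
  return Object.assign({}, ...catalogs);
}

const effectiveEnGB = effectiveCatalog(en, enGB);
console.log(effectiveEnGB.greeting.message);      // Hello
console.log(effectiveEnGB.favoriteColor.message); // Favourite colour
```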
@Rob--W Since the fallback process is being updated, can the following https://github.com/w3c/webextensions/issues/258#issuecomment-2280230511 be relevant as it suggests an additional step in the fallback chain?
I don't see the relevance of that other issue. The issue here is about unifying the fallback behavior across browsers (basically for Firefox to match Chrome and Safari). What you are proposing is an additional step, but the referenced comment mentions a feature request that has not been adopted by any browser.
@Rob--W I believe @erosman is trying to say that once this language fallback logic has been improved in Firefox, it is more valuable to extension authors to have a way to use the fallback logic via getMessage with a specific locale tag, or via some form of setLanguage(), rather than loading messages.json files directly.
@birtles I'll take a look. We do look for the sub-tags first, but there might be a bug somewhere.
@birtles I'm not seeing any issues with Safari's locale fallback in Safari 18. We use zh_CN and zh_TW for Simplified Chinese and Traditional Chinese on Apple platforms. Your change to rename zh_hans to zh_CN is correct for Safari (and seems fine for Chrome and Firefox).
Thank you so much for looking into this. I'll follow up in the issue you kindly commented on, since I'm not quite able to get this working in Safari 18 yet.
I filed Chromium issue 375528194 for the fact that Chrome doesn't seem to recognize zh_hans, only zh_CN.
zh-CN and zh-TW are language code + region code.
zh-Hans and zh-Hant are language code + script code.
zh-Hans-CN, zh-Hans-SG, zh-Hant-HK and zh-Hant-TW are language code + script code + region code.
However, for historical reasons, some operating systems, browsers and other software still use or support only zh-CN rather than zh-Hans. In #641, we also discussed it (see link-1, link-2).
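The three tag shapes listed above can be told apart with the standard Intl.Locale object, which exposes the language, script and region subtags. The sketch below is only illustrative: `describeTag` is a hypothetical helper, and converting underscores to hyphens is an assumption about how `_locales` folder names map to language tags.

```javascript
// Hypothetical helper: splits a locale identifier into its subtags.
function describeTag(tag) {
  // Intl.Locale rejects underscores, so normalize folder-style names first.
  const locale = new Intl.Locale(tag.replace(/_/g, "-"));
  return {
    language: locale.language,
    script: locale.script, // undefined when the tag has no script subtag
    region: locale.region, // undefined when the tag has no region subtag
  };
}

console.log(describeTag("zh-TW"));      // language + region
console.log(describeTag("zh-Hant"));    // language + script
console.log(describeTag("zh_Hans_CN")); // language + script + region
```

Intl.Locale also offers maximize(), which adds likely subtags based on the engine's locale data; that is one way a region-only tag such as zh-TW could be related to a script.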
After my change in https://commits.webkit.org/285633@main, Safari will support script codes in _locales, including three-part locale identifiers.
We have always reported the script (if used) in i18n locale APIs as well.
Firefox patch can be found here: https://phabricator.services.mozilla.com/D224084
Not all browsers handle language fallbacks the same way. Consider the following situation:
An extension is using the native i18n APIs with "default_locale": "en" in manifest.json, and three messages.json files in the languages en, pt and pt-BR.
Both en and pt include the message ids message1 and message2, while pt-BR includes only message1.
In the above situation, browsers handle fetching i18n.getMessage('message2') differently.
Chromium first checks pt_BR/messages.json. If the message is not present, it checks pt/messages.json, and finally, if the message is still not found, it checks the default_locale, in this case en/messages.json. In the above situation, this means it gets the message2 value from pt.
In Firefox, however, the browser first checks pt_BR/messages.json. If the message is not in this file, it falls back directly to the default_locale, so it checks en/messages.json. As a result, the message2 value is the one from en.
Interestingly enough, in Firefox, if pt_BR/messages.json is not present at all, it will check pt/messages.json first, before checking en/messages.json.
What is the behaviour we want in these cases?
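The two behaviours described above can be modelled with a small simulation. This is a sketch based only on the description in this issue, not on either browser's source code. The function names and message texts are hypothetical, and the catalogs are simplified to plain id-to-text maps rather than the full messages.json format.

```javascript
// Simplified stand-ins for the three messages.json files in the example.
const locales = {
  en: { message1: "en 1", message2: "en 2" },
  pt: { message1: "pt 1", message2: "pt 2" },
  pt_BR: { message1: "pt-BR 1" },
};

// Returns the first match for `id` in the given folder chain, else "".
function firstMatch(catalogs, chain, id) {
  for (const folder of chain) {
    const text = catalogs[folder]?.[id];
    if (text !== undefined) return text;
  }
  return "";
}

// Chromium-style as described: per message, try region, language, default.
function lookupChromiumStyle(catalogs, uiLocale, defaultLocale, id) {
  const language = uiLocale.split("_")[0];
  return firstMatch(catalogs, [uiLocale, language, defaultLocale], id);
}

// Firefox-style as described: pick the best existing folder once,
// then fall back only to the default locale.
function lookupFirefoxStyle(catalogs, uiLocale, defaultLocale, id) {
  const language = uiLocale.split("_")[0];
  const selected = [uiLocale, language].find((folder) => folder in catalogs);
  const chain = selected === undefined ? [defaultLocale] : [selected, defaultLocale];
  return firstMatch(catalogs, chain, id);
}

console.log(lookupChromiumStyle(locales, "pt_BR", "en", "message2")); // pt 2
console.log(lookupFirefoxStyle(locales, "pt_BR", "en", "message2"));  // en 2
```

Removing the pt_BR entry from `locales` makes the Firefox-style lookup select pt instead, which reproduces the last case described above.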