strongloop / strong-globalize

strong-globalize is built on Unicode CLDR and jquery/globalize. It automatically extracts strings from JS source code and HTML templates, lints the string resources, and machine-translates them in seconds. At runtime, it loads locale and string resources into memory and provides a hook for persistent logging.

g.http() and g.setLanguage() should support mapping zh-cn to zh-Hans and zh-tw to zh-Hant #150

Closed codechennerator closed 4 years ago

codechennerator commented 4 years ago

I couldn't find whether this is already implemented, and I couldn't find it in the docs, so I think this may be a legitimate issue...

If a request header has an accept-language of zh-cn or zh-tw, strong-globalize should treat it as equivalent to zh-Hans or zh-Hant, respectively. This would be in accordance with CLDR's guidance on language tag equivalences: http://cldr.unicode.org/index/cldr-spec/language-tag-equivalences.

There is a line that says "Identifiers like "en", "en-US", and "en-Latn-US" are all valid, and refer to the same entity. The shorter form is generally preferred, but many implementations use longer forms. For best interoperability, implementations should be prepared to accept any of them as equivalent." and another that says "This means that zh ~ zh-CN ~ zh-Hans ~ zh-Hans-CN, and that zh-Hant ~ zh-TW ~ zh-Hant-TW."
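To make the equivalence concrete, here is a minimal sketch that uses the JavaScript runtime's built-in `Intl.Locale` (which applies CLDR likely-subtags data) to compare tags by language and script. This is only an illustration of the CLDR rule, not something strong-globalize does; the helper names `languageAndScript` and `sameWrittenLanguage` are made up for this example.

```javascript
// Sketch only: illustrates CLDR likely-subtag equivalence with the built-in
// Intl.Locale API. The helper names below are hypothetical and are not part
// of strong-globalize.

// Expand a BCP 47 tag with likely subtags and keep only language + script.
// Note: Intl.Locale throws a RangeError for malformed tags (for example,
// tags that use "_" instead of "-"), so real code would need to guard that.
function languageAndScript(tag) {
  const maximized = new Intl.Locale(tag).maximize();
  return `${maximized.language}-${maximized.script}`;
}

// Two tags are treated as the same written language when they agree on
// language and script after maximization.
function sameWrittenLanguage(a, b) {
  return languageAndScript(a) === languageAndScript(b);
}

console.log(languageAndScript('zh-cn'));              // zh-Hans
console.log(languageAndScript('zh-tw'));              // zh-Hant
console.log(sameWrittenLanguage('zh-cn', 'zh-Hans')); // true
console.log(sameWrittenLanguage('zh-tw', 'zh-Hans')); // false
```

Comparing on language plus script, rather than on the raw tag, is what lets zh-cn and zh-Hans land on the same resource folder while keeping zh-tw separate.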

Current behavior does not reflect this. For example:

g.http({headers: {'accept-language': 'zh-cn'}}).f('Hello.'); // 'Hello'
g.http({headers: {'accept-language': 'zh-Hans'}}).f('Hello.'); // '你好'

Interestingly, because of the way getLanguageFromRequest works, the mapping already works for Portuguese:

g.http({headers: {'accept-language': 'pt-BR'}}).f('Hello.'); // 'Olá'
g.http({headers: {'accept-language': 'pt'}}).f('Hello.'); // 'Olá'

I'm looking to make a fix, and I think these lines in runtime/src/helper.ts are the reason we lack this functionality:

  acceptLanguage.languages(appLanguages);
  const bestLanguage = acceptLanguage.get(reqLanguage);

I think the npm accept-language module on its own is not sufficient for selecting the bestLanguage.

Are there any requirements or things I should know before introducing any new changes or packages? Thanks in advance :)

dhmlau commented 4 years ago

@bajtos @raymondfeng, what are your thoughts on this?

codechennerator commented 4 years ago

I made an example app of the issue. strong-globalize-example.zip

raymondfeng commented 4 years ago

See https://github.com/strongloop/strong-globalize/blob/master/packages/runtime/src/strong-globalize.ts#L354. We might have to hard-code the mapping there.

jannyHou commented 4 years ago

Thank you @raymondfeng, and cc @codechennerator: I created a PR to support the alias: https://github.com/strongloop/strong-globalize/pull/151. Feedback is welcome.

codechennerator commented 4 years ago

I cloned your branch and ran some of my own tests. I've tested the mapping for pt-BR and zh-tw, and the alias works very well! Thanks for the PR, looking forward to the release!

jannyHou commented 4 years ago

@codechennerator (sorry, I forgot to update the release here) Glad to know it solves your problem! Published as 5.0.3 :)

jannyHou commented 4 years ago

I am closing the story now that the PR has been released. Feel free to reopen it if you have other questions.

codechennerator commented 4 years ago

@jannyHou Sorry, I think our tests missed the scenario I'm running into now: the http function doesn't map the alias properly when the alias language is not in the appLanguages.

For example, say my appLanguages (which are determined by my folder structure) are cs, en, zh-Hans, zh-Hant.

g.http will not map zh-cn to zh-Hans, because zh-cn is not in the appLanguages. Instead, the acceptLanguage function returns the first language in the appLanguages list (in this case, cs). When getLangAlias is then called, it simply matches cs with cs, but by then the language is already wrong.

This was not caught in the test because the test included zh-cn as part of the appLanguages. However, alias conversion should also work if appLanguages = en, zh-Hans. Otherwise, the intl folder structure would have to be:

intl
├── en
├── zh-cn
├── zh-Hans

which is incorrect if zh-Hans is meant to be equivalent to zh-cn.

Here is an example of what I mean. strong-globalize-example_2.zip
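A minimal sketch of the ordering I have in mind: resolve the alias first, then match against the appLanguages, so that zh-cn can find zh-Hans even when zh-cn itself has no folder. This is an illustration only and not the code in runtime/src/helper.ts; the alias table, `parseAcceptLanguage`, and `pickLanguage` are hypothetical names, and it does not use the accept-language module.

```javascript
// Sketch only: hypothetical helpers, not the strong-globalize implementation.

// Assumed alias table, keyed by lower-cased request tag.
const ALIASES = { 'zh-cn': 'zh-Hans', 'zh-tw': 'zh-Hant' };

// Parse an Accept-Language header into tags ordered by descending q-value.
function parseAcceptLanguage(header) {
  return String(header || '')
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const quality = q ? parseFloat(q.slice(2)) : 1;
      return { tag: tag.trim(), quality: Number.isNaN(quality) ? 0 : quality };
    })
    .filter((entry) => entry.tag && entry.tag !== '*' && entry.quality > 0)
    .sort((a, b) => b.quality - a.quality);
}

// For each requested tag, try its alias, then the tag itself, then the bare
// language subtag. Fall back to defaultLang rather than to the first entry
// of appLanguages.
function pickLanguage(header, appLanguages, defaultLang = 'en') {
  const supported = new Map(appLanguages.map((l) => [l.toLowerCase(), l]));
  for (const { tag } of parseAcceptLanguage(header)) {
    const candidates = [ALIASES[tag.toLowerCase()], tag, tag.split('-')[0]];
    for (const candidate of candidates) {
      const key = candidate && candidate.toLowerCase();
      if (key && supported.has(key)) return supported.get(key);
    }
  }
  return defaultLang;
}

const appLanguages = ['cs', 'en', 'zh-Hans', 'zh-Hant'];
console.log(pickLanguage('zh-cn', appLanguages)); // zh-Hans
console.log(pickLanguage('zh-TW,zh;q=0.9,en;q=0.8', appLanguages)); // zh-Hant
console.log(pickLanguage('fr', appLanguages)); // en
```

With this ordering, the intl folder only needs zh-Hans and zh-Hant, and an unsupported request falls back to the default language instead of cs.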

codechennerator commented 4 years ago

Maybe something like this? https://github.com/strongloop/strong-globalize/pull/153 It's not passing CI, but all the unit tests are passing.