umijs / umi

A framework in react community ✨
https://umijs.org
MIT License

[Feature Request] Enhance locale file detection to support Unicode LDML identifiers #12659

Open baohouse opened 3 months ago

baohouse commented 3 months ago

Background

Message bundles go in the locales folder, one file per locale, e.g. en-US.ts, zh-CN.json, or fr.js. The file name is generally a 2-letter language code (ISO 639-1), optionally followed by the base separator (- or _) and a 2-letter region code (ISO 3166-1 alpha-2). However, Filipino's language code is fil, which is 3 letters (ISO 639-2), so a file named fil.js does not get picked up.

The reason is that the code currently uses a regex that assumes a 2-letter language code.

https://github.com/umijs/umi/blob/35be997a5b6c51b6f7e4fa248bbd031351b01e0d/packages/plugins/src/utils/localeUtils.ts#L90-L92

Proposal

Enhance the code to use a regex that supports Unicode LDML locale identifiers. Besides 2- or 3-letter language codes (e.g. zh, en, fil), this allows an optional script subtag, so sr-Latn and sr-Cyrl (Serbian written in Latin and Cyrillic script, respectively) would be supported. The region code can be either 2 letters or 3 digits, e.g. es-419, which is Spanish as spoken in Latin America.
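To make the proposal concrete, here is a minimal sketch of what such a pattern could look like. It is not the regex currently in localeUtils.ts and not the one from any library; the names LOCALE_FILE_RE, ParsedLocale, and parseLocaleFileName are hypothetical, and it covers only the language, script, and region subtags mentioned above (no variants or extensions).

```typescript
// Hypothetical pattern for locale file names; an assumption for illustration,
// not the regex umi uses today.
//   language: 2 or 3 lowercase letters        (zh, en, fil)
//   script:   optional, 4 letters, Title case (Latn, Cyrl)
//   region:   optional, 2 uppercase letters or 3 digits (CN, 419)
// Either - or _ is accepted as the separator.
const LOCALE_FILE_RE =
  /^([a-z]{2,3})(?:[-_]([A-Z][a-z]{3}))?(?:[-_]([A-Z]{2}|\d{3}))?\.(?:ts|js|json)$/;

interface ParsedLocale {
  lang: string;
  script: string | undefined;
  region: string | undefined;
}

// Returns the parsed subtags, or null when the file name is not a locale file.
function parseLocaleFileName(fileName: string): ParsedLocale | null {
  const m = LOCALE_FILE_RE.exec(fileName);
  if (!m) return null;
  return { lang: m[1] ?? '', script: m[2], region: m[3] };
}
```

With this sketch, fil.js, sr-Latn.ts, and es-419.json all match, as do the existing en-US.ts and zh-CN.json forms, while a name such as english.ts is rejected.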

Additional context

The ietf-language-tag-regex library provides a regex that matches IETF BCP 47 language tags, which are the basis for Unicode LDML identifiers.

Jinbao1001 commented 2 months ago

What about using tl_PH?

baohouse commented 1 month ago

What about using tl_PH?

That is what I am currently doing, but I have to transform the value to the correct code, fil, when I send the locale in our API headers. I also synchronize the message bundles to SimpleLocalize (which uses Google Translate and expects the code to be fil), so I have to rename the folder to fil, perform the sync, then rename it back to tl_PH. I could automate that with a script, but I'd rather spend the time implementing a proper fix.
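For illustration, the header-side part of that workaround amounts to something like the sketch below. The names LOCALE_ALIASES and toApiLocale are hypothetical and only show the kind of glue code the tl_PH naming forces on the application.

```typescript
// Hypothetical alias table: workaround file name -> code the API expects.
const LOCALE_ALIASES = new Map<string, string>([
  ['tl_PH', 'fil'],
  ['tl-PH', 'fil'],
]);

// Translate the locale name used in the locales folder into the code sent to the API.
// Locales without an alias pass through unchanged.
function toApiLocale(umiLocale: string): string {
  return LOCALE_ALIASES.get(umiLocale) ?? umiLocale;
}
```

If fil.js were detected directly, this mapping would not be needed.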