Open pasabanov opened 1 month ago
Implement full support for POSIX modifier
Do you have a rough idea what this would entail? This is an area I'm not familiar with. The level 2 canonicalization described here looks close to what you are mentioning, but I'm not 100% sure!
Generally I'd be fine making this handling more robust at the cost of complexity as long as we don't need to drag in any large ICU crates to handle bundling various chunks of locale/region data used for mapping correctly.
Do you have a rough idea what this would entail? This is an area I'm not familiar with. The level 2 canonicalization described here looks close to what you are mentioning, but I'm not 100% sure!
I'm not an expert in this field either. The link that you attached is about ICU locales. As I understand it, this is a third locale type, alongside POSIX and BCP 47. Your library is working with BCP 47 locales, as far as I know, so the conversion algorithm should be different.
For now, I'm unsure what exactly the algorithm should be.
Some useful links for further investigation:
Your library is working with BCP 47 locales, as far as I know, so the conversion algorithm should be different.
The reason I mentioned it is that the ICU format appears incredibly similar to the BCP 47 format, and places like MSDN say "This format is used by Windows and many other environments, including ... ICU, ...". The ICU formalization seems to have, at minimum, cribbed the region code variants and handling.
If the above holds up (I could try doing a more detailed comparison), then this statement is valid for `sys-locale`'s considerations because we are parsing POSIX locales:
Level 2 canonicalization is designed to translate POSIX and .NET IDs, as well as nonstandard ICU locale IDs.
This should be reopened, since the conversation about converting POSIX modifiers to BCP 47 is still unresolved.
Whoops, you're right. The GitHub autoclose syntax grabbed it by mistake in your PR.
That's because I wrote "partially resolves ..." there. GitHub didn't recognize the word "partially".
According to this specification, the POSIX locale is defined as:
For example, the locale `De_DE@dict` is valid.
However, the current implementation of the library does not check for the `@` character, leading to an invalid locale detection when the codeset is not present but the modifier is. Example:
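To make the failure mode concrete, here is a minimal sketch, assuming the usual POSIX-style form `language[_territory][.codeset][@modifier]`. The function `naive_posix_to_bcp47` is a hypothetical stand-in that only strips the codeset; it is not the library's actual code, just an illustration of what happens when `@` is never checked.

```rust
/// Hypothetical stand-in for a conversion that only strips the codeset.
/// This is NOT the actual sys-locale implementation.
fn naive_posix_to_bcp47(posix: &str) -> String {
    // Keep everything before the first '.', ignoring '@' entirely.
    let without_codeset = posix.split('.').next().unwrap_or(posix);
    // POSIX separates language and territory with '_', BCP 47 with '-'.
    without_codeset.replace('_', "-")
}

fn main() {
    // With a codeset, the modifier happens to be cut off along with it.
    println!("{}", naive_posix_to_bcp47("de_DE.UTF-8@euro")); // de-DE
    // Without a codeset, the modifier leaks into the result.
    println!("{}", naive_posix_to_bcp47("De_DE@dict")); // De-DE@dict
}
```

The second result contains `@`, so it is not a well-formed BCP 47 tag.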
The simplest possible solution would be to cut the locale off at the `.` and `@` characters. Resolved in #33.
However, since some POSIX modifiers might be convertible to BCP 47, a more complex solution would be:
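As a rough sketch of what such a conversion might look like: split off the modifier, drop the codeset, and then translate the few modifiers that have an obvious BCP 47 counterpart. The function names and the mapping table below are assumptions made for illustration (`@latin` and `@cyrillic` as script subtags, `@valencia` as a variant); they are not an agreed-upon algorithm, and anything unrecognized is simply dropped.

```rust
/// Splits `language[_territory][.codeset][@modifier]` into the
/// language/territory part and the optional modifier.
fn split_posix(posix: &str) -> (&str, Option<&str>) {
    // The modifier always comes last, so split it off first.
    let (head, modifier) = match posix.split_once('@') {
        Some((head, modifier)) => (head, Some(modifier)),
        None => (posix, None),
    };
    // Drop the codeset; BCP 47 has no place for it.
    let lang_territory = head.split('.').next().unwrap_or(head);
    (lang_territory, modifier)
}

/// Hypothetical conversion; the modifier table is an illustrative assumption.
/// Case is left untouched, so `De_DE` stays `De-DE`.
fn posix_to_bcp47(posix: &str) -> String {
    let (lang_territory, modifier) = split_posix(posix);
    let mut parts = lang_territory.split('_');
    let language = parts.next().unwrap_or("");
    let territory = parts.next().filter(|s| !s.is_empty());

    let (script, variant) = match modifier {
        Some("latin") => (Some("Latn"), None),
        Some("cyrillic") => (Some("Cyrl"), None),
        Some("valencia") => (None, Some("valencia")),
        // Unknown or untranslatable modifiers (e.g. "euro", "dict") are dropped.
        _ => (None, None),
    };

    // BCP 47 subtag order: language, script, region, variant.
    let mut tag = String::from(language);
    for sub in [script, territory, variant].into_iter().flatten() {
        tag.push('-');
        tag.push_str(sub);
    }
    tag
}

fn main() {
    for posix in ["sr_RS.UTF-8@latin", "ca_ES@valencia", "De_DE@dict"] {
        println!("{} -> {}", posix, posix_to_bcp47(posix));
    }
}
```

Whether a given modifier should be translated or dropped is exactly the open question here, so the `match` is the part that would need a proper specification behind it.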