Open 747 opened 3 years ago
Hello. Thank you for your report.
Can you please provide a more realistic example? We both know that `RuFooBarBaz` is outside the scope of R18n (real-world projects).
I'd be glad to try to understand it, cover it with tests, and fix it.
To be honest, `region` was introduced just a day before, as I see: https://github.com/r18n/r18n-core/commit/009cadf039343da3b8653350594084ca3aabe2e9
And there were no reports for 2.5 years, so I guess it's not a big deal. 😅
Also we have tests for "different locales (regions) under the same parent locale" here: https://github.com/r18n/r18n-core/blob/28c1d46/spec/r18n_spec.rb#L195-L206
So… I can understand that there is a code error, but I want to know what is best to test and how it affects projects.
Indeed, now I see that most locale classes with a secondary element are named in a format like `EnUS`, so the behavior is "correct" for them.
The ones it harms are those such as `SrLatn` in this repository's built-in locales.
require 'r18n-core'
R18n.set "en-us"
puts R18n.t.yes # => "Yes"
R18n.set "zh-tw"
puts R18n.t.yes # => "是"
R18n.set "sr-latn"
puts R18n.t.yes # => "Yes" <- falls back to English even though the .yml exists!
open('sr.yml', 'w:utf-8') do |sr|
sr.puts "'yes': да"
end
open('sr-latn.yml', 'w:utf-8') do |srl|
srl.puts "'yes': da"
end
R18n.default_places = '.'
R18n.set "sr-latn"
puts R18n.t.yes # => "да"
So maybe no one from Serbia has used this gem 🙄.
And while we're at it, what would you say to supporting script subtags? Besides `sr-Latn` and `sr-Cyrl`, there's `kk-Latn` upcoming, and real-world examples such as `zh-Hant-HK` exist (because both Simplified and Traditional variants may be used in Hong Kong).
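To make the script-subtag idea concrete, here is a minimal sketch of how lookup candidates could be derived from a tag such as `zh-Hant-HK` by dropping the last subtag each time. `fallback_candidates` is a hypothetical helper for illustration only; it is not part of r18n-core and does not claim to match how the gem resolves parent locales.

```ruby
# Hypothetical helper (not r18n-core API): list lookup candidates for a
# language tag, from most specific to least specific, by progressively
# dropping the last subtag.
def fallback_candidates(tag)
  subtags = tag.split('-')
  subtags.length.downto(1).map { |count| subtags.first(count).join('-') }
end

p fallback_candidates('zh-Hant-HK') # => ["zh-Hant-HK", "zh-Hant", "zh"]
p fallback_candidates('sr-Latn')    # => ["sr-Latn", "sr"]
```

With a chain like this, a missing `zh-Hant-HK` translation could fall back to `zh-Hant` before reaching `zh`.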
It seems a lot more complicated than I thought.
For example: https://en.wikipedia.org/wiki/IETF_language_tag#Extension_U_(Unicode_Locale)
So, "locale" can have a lot of "tags". And the second one can be either region or script or anything else.
Meh.
Two ideas:
`sr-SR-Latn` is possible, I guess?).
I think the latter would be a well-balanced option. You should also support 3-letter language codes, as in the standard.
(Note that script comes before region, so it must be `sr-Latn-SR` and not ~`sr-SR-Latn`~ in that case. And `SR` is, confusingly, the country code of Suriname and not Serbia, so "Serbian spoken in Serbia written in the Roman alphabet" will be `sr-Latn-RS`.)
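The ordering rule above (language, then script, then region) can be sketched as a small parser. Everything here (`LocaleTag`, `TAG_PATTERN`, `parse_tag`) is a hypothetical name chosen for illustration, not r18n-core API. The sketch assumes a 2- or 3-letter language, an optional 4-letter script, and an optional region that is either 2 letters or 3 digits, and it rejects any other order.

```ruby
# Hypothetical sketch (not r18n-core API): parse language[-script][-region],
# in that order only, and normalize the conventional casing.
LocaleTag = Struct.new(:language, :script, :region) do
  def to_s
    [language, script, region].compact.join('-')
  end
end

# Assumed shapes: 2-3 letter language, 4-letter script,
# 2-letter or 3-digit region.
TAG_PATTERN = /\A([a-z]{2,3})(?:-([a-z]{4}))?(?:-([a-z]{2}|\d{3}))?\z/i

def parse_tag(tag)
  match = TAG_PATTERN.match(tag)
  return nil unless match

  LocaleTag.new(match[1].downcase, match[2]&.capitalize, match[3]&.upcase)
end

puts parse_tag('sr-latn-rs').to_s # prints "sr-Latn-RS"
p parse_tag('sr-RS-Latn')         # prints nil (region before script)
```

Returning `nil` for a misordered tag is just one possible design; a real implementation might prefer to raise or to reorder the subtags.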
> It seems a lot more complicated than I thought.
> For example: https://en.wikipedia.org/wiki/IETF_language_tag#Extension_U_(Unicode_Locale)
The whole IETF language tag system is indeed complex, but half of it (including what you cited) is for domain-specific or backward-compatibility purposes and is not immediately needed for user-facing locales.
Almost all cases can be covered with three elements: `language`-`script`-`region`. If you want something a step smarter with relatively small effort, consider also accepting one `variant` in place of `region` (giving `language`-`script`-`variant`). This is good for sub-country official languages such as Scottish English `en-scotland` or Valencian `ca-valencia` (because IETF tags are not designed to handle ISO subdivision codes very well).
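As a rough illustration of accepting one variant in the region slot, the sketch below splits a tag into `language`, an optional `script`, and then either a `region` or a `variant`. All names (`split_tag` and the constants) are hypothetical and not r18n-core API. The variant shape (5 to 8 alphanumerics, or a digit followed by 3 alphanumerics) is a simplification assumed for this example.

```ruby
# Hypothetical sketch (not r18n-core API): language[-script][-region|-variant].
LANGUAGE = /\A[a-z]{2,3}\z/i
SCRIPT   = /\A[a-z]{4}\z/i
REGION   = /\A(?:[a-z]{2}|\d{3})\z/i
# Assumed simplification of the variant subtag shape.
VARIANT  = /\A(?:[a-z0-9]{5,8}|\d[a-z0-9]{3})\z/i

def split_tag(tag)
  language, *rest = tag.split('-')
  return nil unless language&.match?(LANGUAGE)

  result = { language: language.downcase }
  # A 4-letter second subtag is treated as the script.
  result[:script] = rest.shift.capitalize if rest.first&.match?(SCRIPT)

  # The final slot holds either a region or a single variant.
  last = rest.shift
  if last&.match?(REGION)
    result[:region] = last.upcase
  elsif last&.match?(VARIANT)
    result[:variant] = last.downcase
  elsif last
    return nil
  end

  # Anything left over is outside this simplified grammar.
  rest.empty? ? result : nil
end

p split_tag('ca-valencia') # => {:language=>"ca", :variant=>"valencia"}
p split_tag('zh-hant-hk')  # => {:language=>"zh", :script=>"Hant", :region=>"HK"}
```

The exact `p` output formatting of hashes differs between Ruby versions, but the keys and values are as shown.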
In `r18n/locale.rb`: https://github.com/r18n/r18n-core/blob/f7bc3003763d51cb92e768b51036e5943ef91e54/lib/r18n-core/locale.rb#L121
This line seems to have a simple logic error, because:
Perhaps the problem has been elusive because it was introduced with the parent locale function (https://github.com/r18n/r18n-core/commit/2c88300c8ae4a5b4b7841ef1ff3c035174c1106c), and no one tried to use different locales under the same parent locale at once.