twitter / twitter-cldr-rb

Ruby implementation of the ICU (International Components for Unicode) that uses the Common Locale Data Repository to format dates, plurals, and more.
Apache License 2.0
669 stars 93 forks source link

convert_locale falls back to language #236

Closed mkaplan9 closed 4 years ago

mkaplan9 commented 4 years ago

Summary

Right now, if you use a valid BCP-47 code that has multiple parts (i.e. de-DE) that is unsupported, TwitterCLDR falls back to the default locale. This change has it try the language (i.e. de) first before falling back to default.

Future improvements

This change is not robust to certain edge cases. Some examples:

CLAassistant commented 4 years ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

:white_check_mark: camertron
:x: mkaplan-stripe
You have signed the CLA already but the status is still pending? Let us recheck it.

coveralls commented 4 years ago

Coverage Status

Coverage increased (+0.005%) to 96.009% when pulling 50b1398464ea8f717416c48510609c2e3f23cbc6 on mkaplan9:mkaplan-convert-locale-falls-back-to-language into 4f3053aef5d98d248dc95a46f03d79d4fbbac4e9 on twitter:master.

camertron commented 4 years ago

Hey @mkaplan9, thanks for the PR! As it happens, we have almost all the information necessary to select a better fallback locale via the TwitterCldr::Shared::Locale class. Fallbacks could be selected by doing something like this:

fallback = TwitterCldr::Shared::Locale.parse(candidate).max_supported.dasherized

I said "almost" above because unfortunately this snippet won't turn es-CR into es-419... yet. #237 should fix that though.

I would suggest incorporating max_supported into your fallback PR. When that's done we can land both PRs and cut a new release.

mkaplan9 commented 4 years ago

Oh that's great, didn't see that before! It seems like that handles the cases for both TWITTER_LOCALE_MAP and lowercase_locales_map so convert_locale can become

def convert_locale(locale)
    TwitterCldr::Shared::Locale.parse(locale).max_supported.dasherized
end

Does that sound right?

mkaplan9 commented 4 years ago

@camertron I changed to use max_supported as suggested. I kept TWITTER_LOCALE_MAP and lowercase_locales_map since there's a complicated interplay between how they work and supported_locale? (which is used by supported). I had to change several tests, a lot of them because this changes the behavior for invalid locales. convert_locale used to hand them right back unchanged, and now it falls back to en-US (which is the max_supported of DEFAULT_LOCALE). I think this is fine but might be unexpected for some use cases.

TBH not sure if this was exactly what you had in mind so LMK what you think :)

PS - Thanks for being maintainer on this repo! It's nice to see it become more active again. I'd be happy to do some more PRs/review/etc. if it would be helpful.

mkaplan9 commented 4 years ago

@camertron 👍 I made your suggested change, much better backwards compatibility! I removed the to_sym calls since normalize_locale takes care of that (thus it continues to return the passed locale instead of erroring if the passed locale is not a String or Symbol).

I opted to change the test cases with :jp since that's not really a valid language code, but could also special case.

The git history is slightly messed up from my rebase, I left it as is assuming we can squash on merge, but I can also squash everything on my end and push up if better.

camertron commented 4 years ago

Published as v6.1.0. Thanks @mkaplan9 :)