globalizejs / globalize

A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data
https://globalizejs.com
MIT License
4.8k stars 605 forks source link

ICU Message Validation and Negative Plural #874

Open nbouvrette opened 5 years ago

nbouvrette commented 5 years ago

I started playing with Gloablizejs recently and found the following 2 behaviors regarding ICU messages which I was wondering if were features or issues?

1) If I output invalid ICU messages, with for example collision in plural categories, or invalid plural categories, the message will still get through and ignore errors silently, Some examples below:

Globalize.loadMessages({
        en: {
            test1: [
                "You have {count, plural, one {# hot dog} one {# hamburger} one {# sandwich} other {# snacks}} in your lunch bag."
            ],
            test2: [
                "You have {count, plural, one {# hot dog} thisIsNotValid1 {# hamburger} thisIsNotValid2 {# sandwich} other {# snacks}} in your lunch bag."
            ]
        }
    });

    console.log(Globalize( 'en' ).messageFormatter( 'test1' )({count: 1}));
    // Output: You have 1 sandwich in your lunch bag.
    // Expected output: exception thrown because the plural "one" is used multiple times.
    console.log(Globalize( 'en' ).messageFormatter( 'test2' )({count: 1}));
    // Output: You have 1 hot dog in your lunch bag.
    // Expected output: exception thrown because the plural "thisIsNotValid1" and "thisIsNotValid2" are not valid.

2) If I take back the last example for test2 but set the count to -1, the plural rule seems to use other while in fact, it should be using one - I think this is probably a general rule that applies to most languages but I could not find much detail on this rule.

rxaviers commented 5 years ago

Hi, thanks for filing this issue.

1) A better error handling is welcome, though it's important to consider this: It's acceptable to throw exceptions for development issues, but it's not recommended to do so for runtime issues (where fallback is usually preferred). Action items: (a) check what latest messageformat.js do (which is the library we depend on for ICU MessageFormat support) 2) Plural rules are defined by CLDR at https://github.com/unicode-cldr/cldr-core/blob/master/supplemental/plurals.json and http://www.unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules. Action items: It's worth to read the spec and understand what's the expected behavior for negative counts (I believe it assumes positive); in case negative counts are expected, we can check latest messageformat.js; Depending on the outcome, we could bump up messageformat.js, file an issue there, or file an issue in CLDR. :)

nbouvrette commented 5 years ago

I found more evidence that my second point is a bug:

The plural category for negative numbers is calculated according to the absolute value of the source. (This may change in the future, if we find languages that have different behavior.)

Source: https://www.unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules