unicode-org / message-format-wg

Developing a standard for localizable message strings

Review non-integral exact number selection algorithm #675

Open eemeli opened 7 months ago

eemeli commented 7 months ago

After the discussions in https://github.com/unicode-org/message-format-wg/pull/621#discussion_r1480050162 and https://github.com/unicode-org/message-format-wg/pull/621#discussion_r1490144092, as well as in our live calls, we ended up leaving out an explicit definition of numerical matching for non-integer values like 0.0 or 1.5, but the text around this (https://github.com/unicode-org/message-format-wg/blob/main/exploration/number-selection.md#determining-exact-literal-match) carries three separate notes about the current solution.

After the publication of LDML 45, we should revisit this and seek to specify the text further. We should also ensure that our solution is in line with the existing LDML Language Plural Rules, which include a section on Explicit 0 and 1 rules (https://unicode-org.github.io/cldr/ldml/tr35-numbers.html#Explicit_0_1_rules). In particular, there we already have:

The explicit “0” and “1” cases apply to the exact numeric values 0 and 1 respectively. These cases are typically used for plurals of items that do not have fractional value, like books or files.

This should be accounted for in the MF2 text, and an appropriate solution here might be to expand the current Language Plural Rules section, so that in the MF2 text we can refer to it for exact number matching.
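To make the open question concrete, here is a minimal sketch of two possible readings of an "exact match" between an operand and a literal key. Neither function comes from the MF2 or LDML text; the names `matchesByValue` and `matchesByText` are made up for illustration, and the operand is assumed to arrive in its source text form.

```typescript
// Hypothetical reading 1: compare numeric values, ignoring how they were written.
function matchesByValue(operand: string, key: string): boolean {
  return Number(operand) === Number(key);
}

// Hypothetical reading 2: compare the written forms, so fraction digits matter.
function matchesByText(operand: string, key: string): boolean {
  return operand === key;
}

// The two readings agree on 1.5 vs 1.5 but disagree on 1.0 vs 1.
console.log(matchesByValue("1.5", "1.5"), matchesByText("1.5", "1.5"));
console.log(matchesByValue("1.0", "1"), matchesByText("1.0", "1"));
```

Under the first reading the key 1 also catches 1.0; under the second it does not, and that gap is what the notes in the current text leave open.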

macchiati commented 7 months ago

Add:

, except in phrasing like "an average of 3.4 books per child".


aphillips commented 7 months ago

I agree with having this issue. I note (as I did in the call) that exact fraction matching is a rare corner case. Plural rules work just fine with non-integer values to produce keywords (fractions, including ones like 1.0, often produce a different plural rule than integers).
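The point about fractions can be checked against an existing CLDR-backed implementation. This is a minimal sketch, assuming a JavaScript runtime whose `Intl.PluralRules` ships full CLDR data. The helper name `pluralCategory` is made up for illustration, and the fraction-digit options stand in for an operand written with visible fraction digits such as 1.0.

```typescript
// Hypothetical helper (not part of MF2 or LDML): plural category for a value
// formatted with a fixed number of visible fraction digits.
function pluralCategory(locale: string, value: number, fractionDigits: number = 0): string {
  const rules = new Intl.PluralRules(locale, {
    minimumFractionDigits: fractionDigits,
    maximumFractionDigits: fractionDigits,
  });
  return rules.select(value);
}

const integerOne = pluralCategory("en", 1);        // the integer 1
const decimalOne = pluralCategory("en", 1, 1);     // 1 shown as "1.0"
const onePointFive = pluralCategory("en", 1.5, 1); // 1.5

console.log(integerOne, decimalOne, onePointFive); // one other other
```

In English the integer 1 selects `one`, while 1.0 and 1.5 both select `other`, so keyword selection already handles fractions without any exact matching.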

I do not agree that this is plural matching. The explicit 0 and 1 cases in the quoted text are not "plural cases". The value 1 or the value 0 produces a named rule to form grammatically correct strings just like any other value. (In English, 0 produces `other` and 1 produces `one`.) What is special about 0 and sometimes 1 is that the message needs to say something different for specifically that value. That's why I have pushed back on examples like these:

.match {$numChances :integer}
0 {{You are out of chances}}
1 {{This is your last chance}} <- works, but tools need to generate `one` for many languages
* {{You have {$numChances} chances remaining}}

.match {$numChances :integer}
0 {{You are out of chances}}
one {{This is your last chance}} <- wrong. never shows in ja, shows for values like 21 in ru
* {{You have {$numChances} chances remaining}}

in favor of:

.match {$numChances :integer}
0 {{You are out of chances}}
1 {{This is your last chance}}
one {{You have {$numChances} chance remaining}}
* {{You have {$numChances} chances remaining}}
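The difference between these variants can be sketched in code. This is a simplified illustration, not the normative MF2 selection algorithm: the function `selectVariant` is hypothetical, it assumes exact numeric keys are tried before plural keywords, and it relies on `Intl.PluralRules` with full CLDR data. Russian (`ru`) is used for the 21 case because its CLDR `one` rule covers integers ending in 1 other than those ending in 11.

```typescript
// Hypothetical selector: exact numeric keys win over plural keywords,
// and "*" is the catch-all.
function selectVariant(locale: string, value: number, keys: string[]): string {
  const numericLiteral = /^-?\d+(\.\d+)?$/;
  // 1. Exact match: a numeric literal key equal to the value.
  for (const key of keys) {
    if (numericLiteral.test(key) && Number(key) === value) {
      return key;
    }
  }
  // 2. Plural keyword produced by the locale's CLDR rules.
  const category: string = new Intl.PluralRules(locale).select(value);
  if (keys.indexOf(category) !== -1) {
    return category;
  }
  // 3. Catch-all.
  return "*";
}

const withExactOne = ["0", "1", "one", "*"];
const keywordOnly = ["0", "one", "*"];

console.log(selectVariant("en", 1, withExactOne));  // "1": the last-chance text
console.log(selectVariant("ja", 1, keywordOnly));   // "*": `one` is never produced in ja
console.log(selectVariant("ru", 21, keywordOnly));  // "one": last-chance text shown for 21
console.log(selectVariant("ru", 21, withExactOne)); // "one": the singular-form text
```

With only the `one` keyword available, the special-case text is skipped in Japanese and shown for 21 in Russian. Keeping both `1` and `one` separates the value-specific message from the grammatical form.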

Which brings us back to fractional matching. I can imagine cases for it:

.match {$distanceRemaining}
1.5 {{You have exactly {1.5 :number} wildebeest}}

But the more common use case is to have cutoff points, like switching units or switching from compact to full presentation (this is the one thing ChoiceFormat is good at):

.match {$distanceRemaining :choice}
0 {{You have arrived}}
<0.5 {{You have {$distanceRemaining :measure unit=meter} remaining}}
<10.0 {{You have {$distanceRemaining :measure unit=kilometer display=long} remaining}}
* {{You have {$distanceRemaining :measure unit=kilometer display=compact} remaining}}
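The cutoff behaviour in the last example could work roughly as follows. This is a sketch under stated assumptions: `:choice` is not an MF2 function, the `<0.5` key syntax is taken from the example above rather than from any spec, and `selectByCutoff` is a made-up name. Keys are tried in source order and the first one that matches wins.

```typescript
// Hypothetical cutoff selector in the spirit of ChoiceFormat.
// A key is either an exact number ("0"), an upper bound ("<0.5"), or "*".
function selectByCutoff(value: number, keys: string[]): string {
  for (const key of keys) {
    if (key === "*") {
      continue; // the catch-all is only used when nothing else matches
    }
    if (key.charAt(0) === "<") {
      // Upper-bound key: matches any value strictly below the bound.
      if (value < Number(key.slice(1))) {
        return key;
      }
    } else if (Number(key) === value) {
      // Plain numeric key: exact value match.
      return key;
    }
  }
  return "*";
}

const cutoffKeys = ["0", "<0.5", "<10.0", "*"];

console.log(selectByCutoff(0, cutoffKeys));    // "0"
console.log(selectByCutoff(0.25, cutoffKeys)); // "<0.5"
console.log(selectByCutoff(3, cutoffKeys));    // "<10.0"
console.log(selectByCutoff(42, cutoffKeys));   // "*"
```

Because the first matching key wins, the order of the variants carries meaning here, unlike the exact and keyword matching discussed earlier.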