projectfluent / fluent

Fluent — planning, spec and documentation
https://projectfluent.org
Apache License 2.0

Misleading Canadian English Numerals #234

Closed: boomshroom closed this issue 1 year ago

boomshroom commented 5 years ago

On the Variables page of the Fluent guide book, it claims that Fluent would render 12345.678 as "12,345.678" in British and American English, but as "12 345,678" in Canadian English. As an English-speaking Canadian, I have to ask where this information came from, as we usually write numbers the same way as the Americans. I understand that many European countries write numbers with a comma instead of a period, but that's not how it is in (at least my part of) Canada.

That said, Canada is a big country and tends to borrow a lot from other countries, so it wouldn't surprise me if there were people on the east coast who write numbers like this, in which case assigning just one localization wouldn't work anyways. Not even we are consistent with our units and spelling and tend to use American English from most services anyways.

stasm commented 5 years ago

Thank you for taking the time to file this issue. I apologize for the error. I'm not a native speaker myself and I was trying to find an example of varying number formatting rules among different variants of English in order to explain the concept by means of a single English message.

I remember that I consulted the Wikipedia page on decimal separators, which features the following table.

| Style | Countries |
| --- | --- |
| 1,234,567.89 | Canada (English-speaking; unofficial), China, Hong Kong, Ireland, Israel, Japan, Korea, Malaysia, México, New Zealand, Pakistan, Philippines, Singapore, Taiwan, Thailand, United Kingdom, United States. |
| 1 234 567.89 | SI style (English version), Australia, Canada (English-speaking), China, Sri Lanka, Switzerland (officially encouraged for currency numbers only[41]). |
| 1 234 567,89 | SI style (French version), Albania, Belgium (French), Bulgaria, Canada (French-speaking), Czech Republic, Estonia, Finland, France, Hungary, Kosovo, Latin Europe, Latvia, Lithuania, Norway, Peru, Poland, Portugal, Russia, Slovakia, South Africa, Sweden, Switzerland (officially encouraged, except currency numbers[41]), Ukraine, Vietnam (in education). |
| 1,234,567·89 | Ireland, Malaysia, Malta, Philippines, Singapore, Taiwan, United Kingdom (older, typically handwritten)[42] |

I now see that Canada (English-speaking) is listed on the second row, corresponding to the 1 234 567.89 formatting. I must have accidentally looked at the third row back when I was writing this section of the guide. I feel particularly silly because I even remember looking for another source to validate this data, and finding old Solaris docs which cemented the error.

Have you encountered this formatting (1 234 567.89) in your part of Canada? The same Wikipedia article has this to say about the English-speaking parts of Canada:

English Canada: There are two cases: The preferred method for currency values is $4,000.00 —while for numeric values, it is 1 234 567.89; however, commas are also sometimes used, although no longer taught in school or used in official publications.[citation needed]

stasm commented 5 years ago

Looking at the CLDR data, South African English might also be a good example here. en_ZA is listed as using a comma as the decimal separator and a space for grouping.
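One way to check what a given runtime actually does for a locale is to ask `Intl.NumberFormat` for the individual parts of a formatted number. The snippet below is a minimal sketch: `separatorsFor` is a hypothetical helper name, not part of Fluent or CLDR, and the values it reports depend on the CLDR/ICU version shipped with the runtime, so they may differ between browsers and Node releases.

```javascript
// Hypothetical helper (not a Fluent API): report the grouping and decimal
// separators that the runtime's Intl data uses for a locale.
function separatorsFor(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(12345.678);
  const find = (type) => {
    const part = parts.find((p) => p.type === type);
    return part ? part.value : null; // null if the locale emits no such part
  };
  return { locale, group: find('group'), decimal: find('decimal') };
}

// Show a separator as a code point, since some grouping characters
// (for example no-break spaces) are hard to tell apart by eye.
function codePoint(ch) {
  if (ch === null) return 'none';
  return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
}

for (const locale of ['en-US', 'en-CA', 'fr-CA', 'en-ZA']) {
  const { group, decimal } = separatorsFor(locale);
  console.log(locale, 'group:', codePoint(group), 'decimal:', codePoint(decimal));
}
```

Printing code points rather than the raw characters makes it easier to see whether a "space" is a regular space or a no-break variant.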

boomshroom commented 5 years ago

As I mentioned, we tend to be inconsistent. It's been long enough since elementary school that I forget if we were taught to use commas or spaces. Sometimes there aren't any spaces at all if the number is small enough to not need them. I think what we were taught was just to use the separator as a readability aid and not as a required part of writing the number. I can say that the cheque I have next to me has a comma separator between the thousands and hundreds places, but the cheque number has 8 consecutive digits with no spacers and has a leading 0. Similarly, 4-digit years never receive spacing.

All I know is that I very rarely see anything other than a period as a decimal point outside of international content.

aphillips commented 5 years ago

My suggestion would be: take it up with CLDR. If you think the data is incorrect for Canada, file a bug. Fluent should produce what the I18N formatting APIs produce and those generally produce what CLDR says... :-)

missmatsuko commented 5 years ago

Ah, I made a pull request to fix this (#253) before seeing this issue.

It includes a reference link to the government of Canada's official English translation guide concerning decimal formatting: https://www.btb.termiumplus.gc.ca/tcdnstyl-chap?lang=eng&lettr=chapsect5&info0=5.09

I think the reason we officially use spaces for the thousands separator is because the number could be interpreted differently in each official language if it used both periods and commas.

e.g. 1,200.000 and 1.200,000. In English, the first number is over one thousand and the second number is just over one. In French, it would be flipped (pretty sure).

If we use spaces for the thousands separator, this ambiguity is removed because we can assume the single punctuation mark (comma or period) is the decimal marker. e.g. 1 200,000 is 1 200.000 and 1.200 000 is 1,200 000
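The ambiguity described above can be sketched in a few lines. This is only an illustration: `parseWith` and `parseSpaceGrouped` are hypothetical helpers written for this example, not functions from Fluent or `Intl`, and they assume well-formed input with at most one decimal marker.

```javascript
// Hypothetical helper: read a digit string under explicit separator rules.
function parseWith(text, { group, decimal }) {
  // Drop every grouping separator, then turn the decimal marker into ".".
  const normalized = text.split(group).join('').replace(decimal, '.');
  return Number(normalized);
}

const english = { group: ',', decimal: '.' };
const french = { group: '.', decimal: ',' };

// The same strings mean different things depending on the convention.
console.log(parseWith('1,200.000', english)); // 1200
console.log(parseWith('1.200,000', english)); // 1.2
console.log(parseWith('1,200.000', french));  // 1.2
console.log(parseWith('1.200,000', french));  // 1200

// Hypothetical helper: with spaces for grouping, the single comma or
// period can only be the decimal marker, so no convention is needed.
function parseSpaceGrouped(text) {
  return Number(text.replace(/\s/g, '').replace(',', '.'));
}

console.log(parseSpaceGrouped('1 200,000')); // 1200
console.log(parseSpaceGrouped('1 200.000')); // 1200
console.log(parseSpaceGrouped('1.200 000')); // 1.2
console.log(parseSpaceGrouped('1,200 000')); // 1.2
```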

I think that Oracle table is just wrong, maybe it was supposed to be just French. CLDR uses periods for decimals in Canadian English. It has commas for thousands too, but it's pretty normal to see that. Comma for the decimal is the really weird part.

stasm commented 5 years ago

Thanks for the explanation and my apologies for not fixing this earlier. I've just merged #253, and deployed it.

willfarrell commented 1 year ago

I think there is still an issue here? When using fluent.js in Chrome, Firefox, Safari, and NodeJS you get the following:

new Intl.NumberFormat('en-CA').format(12345.678)
// '12,345.678'
new Intl.NumberFormat('fr-CA').format(12345.678)
// '12 345,678'

As an English speaking Canadian, I've only seen the above notation (comma for thousand & period for decimal) throughout my entire education.
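A caveat when comparing these strings in code: the grouping character that prints as a space is usually a no-break space variant rather than U+0020, so a literal string comparison against `'12 345,678'` can fail even when the output looks identical. The sketch below uses two hypothetical helpers, `describe` and `normalizeSpaces`, to make that visible. The exact code points depend on the runtime's ICU data.

```javascript
// Hypothetical helper: format a value and list the non-digit characters
// in the result as code points, so invisible differences show up.
function describe(locale, value) {
  const text = new Intl.NumberFormat(locale).format(value);
  const separators = [...text]
    .filter((ch) => !/[0-9]/.test(ch))
    .map((ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
  return { text, separators };
}

// Hypothetical helper: replace any whitespace character (including
// no-break spaces) with a plain space before comparing strings.
function normalizeSpaces(text) {
  return text.replace(/\s/g, ' ');
}

console.log(describe('en-CA', 12345.678));
console.log(describe('fr-CA', 12345.678));
console.log(normalizeSpaces(new Intl.NumberFormat('fr-CA').format(12345.678)));
```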

adjenks commented 1 year ago

@willfarrell The above looks correct.

Canadian English uses commas to separate thousands and dots as the decimal separator. Canadian French often uses spaces to separate thousands and commas as the decimal separator.

Here are some guides that confirm the French one:

Here is a guide from the government on writing numbers in French: https://www.btb.termiumplus.gc.ca/redac-chap?lang=eng&lettr=chapsect2&info0=2#zz2 (See section 2.4.4)

Here is a short section about numeric expression in French for English speakers in an English language guide: https://www.btb.termiumplus.gc.ca/tcdnstyl-chap?lang=eng&lettr=chapsect17&info0=17 (See section 17.05)

Here is an official guide on how to use the dollar sign in French and English: https://www.noslangues-ourlanguages.gc.ca/en/writing-tips-plus/canadian-dollar-symbol

It doesn't look like there's a problem to me based on your two examples above:

new Intl.NumberFormat('en-CA').format(12345.678)
// '12,345.678'
new Intl.NumberFormat('fr-CA').format(12345.678)
// '12 345,678'

willfarrell commented 1 year ago

In the documentation it states:

In Canadian English, however, the result would be: Time elapsed: 12 345.678s.

From: https://projectfluent.org/fluent/guide/variables.html

The docs match the official documentation from the government shared above (space & period), but don't match any JavaScript implementation output (comma & period). Just wanted to flag that the docs don't match the output of fluent.js, and might cause confusion.

eemeli commented 1 year ago

@willfarrell You're right, the docs page is still a little bit misleading. It might be clearer to follow @stasm's suggestion from above, and use South African English as the example locale instead:

new Intl.NumberFormat('en-ZA').format(12345.678)
// '12 345,678'

Would you be willing to file a PR for fixing this page?

adjenks commented 1 year ago

The Canadian Metric Practice Guide (CAN/CSA-Z234.1-89) of the Canadian Standards Association specifies that groups of three numerals (triads) shall be separated by a space, except in the case of monetary values. It advises against the use of commas as separators. Although both commas and spaces are still widely used in Canada, The Canadian Style recommends that, except in financial documents, a space be used instead of a comma.

That quote is from this guide on writing Numerical Expressions in English in Canada: https://www.btb.termiumplus.gc.ca/tcdnstyl-chap?lang=eng&lettr=chapsect5&info0=5#zz5 (Section 5.09, Note 2)

So officially, when we write large numbers in English in Canada, we're supposed to use a space, unless it's a monetary value, in which case you're supposed to use a comma.

So, because the guide is talking about "time elapsed" and it uses spaces, it's correct.

Does the API distinguish between regular number formatting and financial/currency number formatting?

Personally, as an English speaker in Canada, I don't care if you separate thousands with spaces or commas, just never use a comma to separate decimals or I'll be confused.

adjenks commented 1 year ago

To answer my own question, the Intl.NumberFormat API does distinguish different types of numbers: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/NumberFormat/NumberFormat#style

But does it do it right?

This is what my browser does for Canadian English style variants:

console.log(
  new Intl.NumberFormat('en-CA',{style:'decimal'}).format(12345.678),
  new Intl.NumberFormat('en-CA',{style: 'currency', currency: 'CAD' }).format(12345.67),
  new Intl.NumberFormat('en-CA',{style:'percent'}).format(12345.678),
  new Intl.NumberFormat('en-CA',{style: 'unit', unit:'kilometer'}).format(12345.67)
)

12,345.678 $12,345.67 1,234,568% 12,345.67 km

It doesn't seem quite right, because it's apparently supposed to differ between currency and other formats.
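If an application wanted the convention from The Canadian Style (spaces between triads, except for monetary values), one option would be to post-process the parts that `Intl.NumberFormat` returns. The sketch below is an assumption-laden illustration, not something Fluent or `Intl` provides: `formatCanadianStyle` is a hypothetical helper, and the choice of U+00A0 as the grouping space is mine.

```javascript
// Hypothetical helper (not part of Fluent or Intl): format with en-CA
// rules, but swap the grouping separator for a no-break space unless the
// value is a currency amount.
function formatCanadianStyle(value, options = {}) {
  const formatter = new Intl.NumberFormat('en-CA', options);
  const useSpace = options.style !== 'currency';
  return formatter
    .formatToParts(value)
    // Only 'group' parts are rewritten; digits, decimal marker, unit and
    // currency symbols are kept exactly as the runtime produced them.
    .map((part) => (part.type === 'group' && useSpace ? '\u00A0' : part.value))
    .join('');
}

console.log(formatCanadianStyle(12345.678));
console.log(formatCanadianStyle(12345.67, { style: 'currency', currency: 'CAD' }));
console.log(formatCanadianStyle(12345.67, { style: 'unit', unit: 'kilometer' }));
```

Working on parts instead of running a string replace on the formatted output avoids touching commas that are not grouping separators.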

In Fluent, the corresponding feature is documented here: https://projectfluent.org/fluent/guide/functions.html#number

I am confused, though, because the "style" option is under a separate section called "Developer Parameters". I guess that's because they're sort of new features? I don't see style listed in the compatibility section of MDN or on caniuse.com. I opened a ticket because of this: https://github.com/mdn/browser-compat-data/issues/17853

willfarrell commented 1 year ago

@eemeli PR #352 opened

@adjenks Thanks for digging deeper, I was about to do more testing myself before I saw your comment update.

adjenks commented 1 year ago

I don't think your pull request aligns with the standards I've found, since spaces are recommended to separate triads/thousands in non-financial documents. Personally, though, I don't care. Separate triads with whatever you want in Canadian English, just don't use a comma as a decimal separator and I'm happy. 😂🤣