Support algorithmic (RBNF) numbering systems: native, traditio, finance

littledan commented 8 years ago

I noticed this bug languishing in the old tracker, and I was wondering if it still needs to be addressed https://bugs.ecmascript.org/show_bug.cgi?id=692 . Any thoughts? @jshin @caridy

Support numbering systems "native", "traditio", and "finance".

CLDR 21 added numbering systems "native", "traditio", and "finance". These are not mentioned in section 3 of UTS 35, but show up in the bcp47 data files. They seem inappropriate as identifiers, and so should not be processed as such. Edition 1.0 of the ECMAScript Internationalization API specifies to ignore them. In a future edition, we could accept them in requests but canonicalize them to actual numbering system identifiers for format and resolvedOptions.

EDIT (2020-07-12): Since this bug involves linking Rule-Based Number Format (RBNF), the following styles should also be supported:

Spellout: "one hundred twenty"
Ordinal: "1st, 2nd, 3rd"

EDIT (2020-08-10): Spellout and ordinal are now being tracked in #494.

rxaviers commented 8 years ago

ray007 commented 7 years ago

On a related note, there currently seems to be no support for any numbering system classed as "algorithmic". Any chance of getting that?

littledan commented 7 years ago

@ray007 Can you say more about your use case?

ray007 commented 7 years ago

My program is only a processor for customer projects. We now offer localized number formatting, but have to limit the selection to the browser-supported subset.

littledan commented 7 years ago

@ray007 Do you know more about what your customers will need from number formatting?

ray007 commented 7 years ago

Experience says: they'll use everything we offer and will ask for more. I know we do have customers in China and Japan, so non-Latin numbers for those markets will definitely be missed.

littledan commented 7 years ago

Is there any way you can ask your customers which features are important to them that are missing? So far, we've shied away from just exporting everything that's in CLDR.

ray007 commented 7 years ago

IMHO exporting everything that's in CLDR sounds like a good idea to me.

I can ask. The company I work for also sells hardware that displays the same customer-produced projects with an application written in C/C++. That application has full ICU option support, so I do get asked why I don't support the same features. It's a problem for me with NumberFormat and DateTimeFormat.

zbraniecki commented 7 years ago

IMHO exporting everything that's in CLDR sounds like a good idea to me.

I don't think we can do this in the Web context, because CLDR changes over time and we need APIs that don't break over time.

That application has full ICU option support, so I do get asked why I don't support the same features.

That's great. Sounds like you can just point out the missing features that are needed. If I understand, that's exactly what @littledan asked you. But we need a more specific answer than "It's a problem for me with NumberFormat and DateTimeFormat."

ray007 commented 7 years ago

I don't think we can do this in the Web context, because CLDR changes over time and we need APIs that don't break over time.

The data in the CLDR may change over time, the API to retrieve it should not.

That's great. Sounds like you can just point out the missing features that are needed. If I understand, that's exactly what @littledan asked you.

I'll do so for DateTimeFormat in a separate issue. For NumberFormat, these are the things that immediately come to mind:

- no option to force displaying a sign also for positive numbers
- number styles offered in our C-application are:
  - decimal
  - scientific
  - compact short
  - compact long
  - percent
  - ordinal
  - spellout
  - currency
  - currency iso
- no algorithmic numbering systems

I could think of even more useful styles, but that's what we have now. I did an implementation for scientific on my own, but even that only works for western locales.
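Some of the items in this list were later covered by the `signDisplay` and `notation` options of Intl.NumberFormat (added in ES2020). A minimal sketch, assuming an engine that implements those options; the variable names are illustrative only:

```javascript
// Force a sign on positive numbers as well as negative ones.
const signed = new Intl.NumberFormat("en-US", { signDisplay: "always" });
const signedFive = signed.format(5);

// Scientific and compact notation are selected through the "notation" option.
const scientific = new Intl.NumberFormat("en-US", { notation: "scientific" });
const compactShort = new Intl.NumberFormat("en-US", {
  notation: "compact",
  compactDisplay: "short",
});
const compactLong = new Intl.NumberFormat("en-US", {
  notation: "compact",
  compactDisplay: "long",
});

console.log(signedFive);
console.log(scientific.format(123456));
console.log(compactShort.format(1234));
console.log(compactLong.format(1234));
```

Ordinal, spellout, and algorithmic numbering systems are not covered by these options.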

ray007 commented 7 years ago

I'm using DateTimeFormat to generate a formatter like the globalize.js dateFormatter. Trying to do so, I was not able to support all formatters:

- formatter "Y" partially wrong
- formatter "U" not supported
- quarter ("q", "Q") not supported
- formatter "W" and "w" not supported (week of month/year)
- no ICU "short" format (i.e. 6-letter weekday pattern)
- only am/pm for day period
- hours: the hourCycle proposal looks good
- lots of options missing for timezone
- no distinction standalone/format when getting value for single date/time part

rxaviers commented 7 years ago

no option to force displaying a sign also for positive numbers

Seems a valid proposal to me… Needs a champion.

number styles offered in our C-application are:

decimal

scientific

percent

currency

currency iso

Those are supported. Are you missing anything in particular?

compact short

compact long

Please refer to #37

ordinal

See #34

spellout

no algorithmic numbering systems

I understand what you say here, but in order to sell this spec addition to the committee, we are required to provide (among other things) appealing use cases. Since you miss this in Ecma, can you help provide those use cases in rich detail?

For example, nothing occurs to me for spellout; but for algorithmic nus, you can create a list of the locales that would be better supported by enabling their native nu (http://www.unicode.org/repos/cldr/trunk/common/bcp47/number.xml).

Then, such proposals would also require a champion.

ray007 commented 7 years ago

Currency works, though unlike the ICU API, it doesn't take the currency to use from the locale if not given.

Agreed about compact and ordinal.

While I'm sure some customer will ask, spellout seems the least useful style to me.

The algorithmic numbering systems we're sure to be missing are hans, hant and jpan for our customers in China and Japan.

Style scientific is not supported. And my simulation only works well with western locales.

rxaviers commented 7 years ago

Currency works, though unlike the ICU API, it doesn't take the currency to use from the locale if not given.

Ok. I think you're asking for u-cu extension support... Something like this right? new Intl.NumberFormat('en-u-cu-USD', {style: 'currency'}).format(1) // '$1.00'.

rxaviers commented 7 years ago

@ray007 can you create issues individually for each point you said above?

rxaviers commented 7 years ago

@littledan @zbraniecki, supporting algorithmic numbering systems seems a valid requirement for this issue, because some locales have an algorithmic numbering system as their "native", "traditio", or "finance" nu.

The Japan example @ray007 provided has this:

        "defaultNumberingSystem": "latn",
        "otherNumberingSystems": {
          "native": "latn",
          "traditional": "jpan",
          "finance": "jpanfin"
        },

Note that jpan and jpanfin are algorithmic numbering systems: http://www.unicode.org/repos/cldr/trunk/common/bcp47/number.xml.
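To make the idea of canonicalizing these keywords concrete, here is a minimal sketch that maps a requested "native", "traditio", or "finance" keyword to a concrete numbering system. The table is hand-copied from the Japanese data above, and `resolveNumberingSystem`, `NUMBERING_DATA`, and `KEYWORD_TO_CLDR_KEY` are hypothetical names, not part of any spec or library:

```javascript
// Hypothetical excerpt of CLDR numbering data, hand-copied for illustration.
const NUMBERING_DATA = {
  ja: {
    defaultNumberingSystem: "latn",
    otherNumberingSystems: {
      native: "latn",
      traditional: "jpan",
      finance: "jpanfin",
    },
  },
};

// The BCP 47 keyword is "traditio"; the JSON data above uses the key "traditional".
const KEYWORD_TO_CLDR_KEY = {
  native: "native",
  traditio: "traditional",
  finance: "finance",
};

// Returns a concrete numbering system identifier, or undefined when the
// keyword cannot be resolved because no data exists for the language.
function resolveNumberingSystem(language, requested) {
  const cldrKey = KEYWORD_TO_CLDR_KEY[requested];
  if (cldrKey === undefined) {
    return requested; // Already a concrete identifier such as "latn" or "jpan".
  }
  const data = NUMBERING_DATA[language];
  if (data === undefined) {
    return undefined;
  }
  return data.otherNumberingSystems[cldrKey] || data.defaultNumberingSystem;
}

console.log(resolveNumberingSystem("ja", "traditio"));
console.log(resolveNumberingSystem("ja", "finance"));
```

Whether an engine can then actually format with the resolved system (for example jpan) is the separate question of algorithmic numbering system support.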

ray007 commented 7 years ago

Currency works, though unlike the ICU API, it doesn't take the currency to use from the locale if not given.

Ok. I think you're asking for u-cu extension support... Something like this right? new Intl.NumberFormat('en-u-cu-USD', {style: 'currency'}).format(1) // '$1.00'.

Yes, but with one difference: in the ICU API, if you use locale "en_US", it automatically selects "USD" for currency.
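The difference can be sketched as follows. In ECMA-402, `style: "currency"` without a `currency` option throws a TypeError, so the ICU-like behavior has to be emulated in user code. The `REGION_CURRENCY` table and the `currencyFormatter` helper are hypothetical; a real table would come from CLDR supplemental data:

```javascript
// Hypothetical region-to-currency table, for illustration only.
const REGION_CURRENCY = { US: "USD", JP: "JPY", DE: "EUR" };

// Build a currency formatter whose currency is derived from the locale's region.
function currencyFormatter(localeTag) {
  const region = new Intl.Locale(localeTag).region;
  const currency = REGION_CURRENCY[region];
  if (currency === undefined) {
    throw new RangeError("No currency known for locale " + localeTag);
  }
  return new Intl.NumberFormat(localeTag, { style: "currency", currency });
}

// Without an explicit currency, the constructor throws.
let threwTypeError = false;
try {
  new Intl.NumberFormat("en-US", { style: "currency" });
} catch (error) {
  threwTypeError = error instanceof TypeError;
}

console.log(threwTypeError);
console.log(currencyFormatter("en-US").format(1));
```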

rxaviers commented 7 years ago

Let's discuss different topics in their to-be-created issues...

zbraniecki commented 7 years ago

The data in the CLDR may change over time, the API to retrieve it should not.

Which means that for each and every addition we need to carefully spec a forward-compatible API, create a spec proposal, find a champion, move it through stages and implement in engines.

That's very different from saying "IMHO exporting everything that's in CLDR sounds like a good idea to me." So please be very specific when you request features, separate each feature you request into its own issue to make it easier to discuss each one in isolation, give examples of what such a feature would enable, and so on.

Stating that "all CLDR should be exposed" is not helpful.

littledan commented 7 years ago

"Exposing all of CLDR" is interesting to understand as a goal. If each of these features, one by one, is motivated mostly by its presence in ICU, then that's useful in evaluating it. I'd prefer if we had more information about where programmers (for example, your customers) want to use these things, which I don't see in some of the newly filed bugs. It shouldn't be enough to write something up; it needs to be justified as well.

ray007 commented 7 years ago

I can't give any better wishes right now since I don't know what the customers will do with the new localization options we released just last week. I already do know that our Chinese customers will miss the "cyclic year names" in the date formatter. I'm also quite sure I'll get some complaints about scientific number formatting in non-western locales. Anything else specific, I'll open a new issue here once I know...

caridy commented 7 years ago

any volunteer to champion this?

sffc commented 4 years ago

Algorithmic numbering systems are useful, but additional use cases are important to prioritize this relative to other ongoing work in the ECMA-402 specification. Adding algorithmic numbering systems would add quite a bit of required complexity to the implementation side (even though ICU supports it, rule-based number formatting is a substantial bit of code that non-ICU implementors would need to start carrying).

binyamin commented 4 years ago

@sffc Here's my use case: the Hebrew numbering system doesn't work for Date().toLocaleDateString() in the browser. If I wanted to display dates to an Israeli audience, it would require a fair amount of extra client-side JavaScript in order to swap out the Latin characters for Hebrew ones. Also, it works in Node.js and it's valid Unicode, which will confuse programmers.

I see that this began in 2016. What can I do to help move this proposal along?

ljharb commented 4 years ago

@binyamin does https://github.com/tc39/proposal-temporal solve your concerns?

binyamin commented 4 years ago

@ljharb maybe. Does it work with algorithmic numbering systems (e.g. he-IL-u-nu-hebr)?

Edit: Temporal relies on Intl, which doesn't support Hebrew numbering systems either.

sffc commented 4 years ago

Thanks for the use case!

Temporal provides Hebrew calendar arithmetic, but it doesn't provide the algorithmic numbering system -u-nu-hebr.
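Whether a given engine honors `-u-nu-hebr` can be checked at runtime. A minimal sketch, assuming that an unsupported numbering system is simply dropped during locale resolution; `supportsNumberingSystem` is a hypothetical helper name:

```javascript
// Returns true when the engine resolves the requested numbering system
// instead of falling back to the locale's default.
function supportsNumberingSystem(nu) {
  const resolved = new Intl.NumberFormat("en-u-nu-" + nu).resolvedOptions();
  return resolved.numberingSystem === nu;
}

console.log("latn:", supportsNumberingSystem("latn"));
console.log("hebr:", supportsNumberingSystem("hebr"));
```

The result for "hebr" depends on the engine and its locale data, which is the inconsistency between environments described above.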

devongovett commented 3 years ago

FWIW, we've been working on a number input that supports localized numbers. We currently support the hanidec numbering system since that's the only Chinese numbering system Intl.NumberFormat supports (despite MDN docs to the contrary). However, this led to some confusion: https://twitter.com/kourge/status/1390352734784688132

twenty-one in chinese is 二十一 (literally “two ten one”) and twelve is 十二 (lit “ten two”)

whereas hanidec outputs 二一 - literally the digit for 2 and the digit for 1.

https://twitter.com/kourge/status/1390357371642277890

Han decimal is for things that represent numeric IDs and not quantities.

After we determined that browsers don't currently support algorithmic numbering systems, I found this issue. Since a number input is typically meant for entering quantities (e.g. incrementable numbers), I think this is a good use case for supporting algorithmic numbering systems.
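The distinction can be illustrated with a small sketch. `toHanDecimal` mimics what a positional system like hanidec does, while `toHanAlgorithmic` is a deliberately simplified stand-in for the rule-based behavior, covering only the integers 1 to 99. Both functions are hypothetical and are not how ICU or CLDR implement these systems:

```javascript
const HAN_DIGITS = ["〇", "一", "二", "三", "四", "五", "六", "七", "八", "九"];

// Positional rendering: each decimal digit is replaced by its Han digit.
function toHanDecimal(n) {
  return String(n)
    .split("")
    .map((digit) => HAN_DIGITS[Number(digit)])
    .join("");
}

// Simplified algorithmic rendering for integers 1 to 99 only.
function toHanAlgorithmic(n) {
  if (!Number.isInteger(n) || n < 1 || n > 99) {
    throw new RangeError("This sketch only covers integers from 1 to 99");
  }
  if (n < 10) {
    return HAN_DIGITS[n];
  }
  const tens = Math.floor(n / 10);
  const ones = n % 10;
  // "十" alone means ten; a leading digit is only written from twenty upward.
  return (tens > 1 ? HAN_DIGITS[tens] : "") + "十" + (ones > 0 ? HAN_DIGITS[ones] : "");
}

console.log(toHanDecimal(21)); // 二一
console.log(toHanAlgorithmic(21)); // 二十一
console.log(toHanAlgorithmic(12)); // 十二
```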

Since I am here I'll also say that parsing support, even if advanced formatting isn't supported, would be useful. We were able to implement a parser using the data available from Intl.NumberFormat by generating a map between digits and ASCII (blog post about our approach), but for algorithmic numbering systems this would be much more difficult. We'd end up reimplementing the rules defined in CLDR in JS, which would have a lot of overhead. This is probably a separate issue though.
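A rough sketch of the digit-map idea, restricted to non-negative integers. It is not the code from the blog post; `buildDigitMap` and `parseLocalizedInteger` are hypothetical names, and the approach assumes a decimal (non-algorithmic) numbering system in which each digit is a single code point:

```javascript
// Map each localized digit to its ASCII counterpart by formatting a number
// that contains all ten digits in descending order.
function buildDigitMap(locale) {
  const formatter = new Intl.NumberFormat(locale, { useGrouping: false });
  const digits = [...formatter.format(9876543210)].reverse();
  return new Map(digits.map((digit, value) => [digit, String(value)]));
}

// Parse a string of localized digits; returns NaN on any unknown character.
function parseLocalizedInteger(text, locale) {
  const digitMap = buildDigitMap(locale);
  let ascii = "";
  for (const character of text) {
    const digit = digitMap.get(character);
    if (digit === undefined) {
      return NaN;
    }
    ascii += digit;
  }
  return ascii === "" ? NaN : Number(ascii);
}

const arabicFormatter = new Intl.NumberFormat("en-u-nu-arab", { useGrouping: false });
console.log(parseLocalizedInteger(arabicFormatter.format(2024), "en-u-nu-arab"));
```

For an algorithmic system there is no such one-to-one digit table, which is why this technique does not carry over.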

kourge commented 3 years ago

I'd like to add that Chinese calendars and ordinals are also use cases for algorithmic numbering. Although it's possible to use a decimal numbering system for both of these, character-based representations are just as common in the wild:

April 21: 4 月 21 日, 四月二十一日
21st: 第 21, 第二十一

Furthermore, the agricultural lunar calendar uses a modified version of the traditional numbering system, where 20 is represented as 廿 instead of 二十.

srl295 commented 1 year ago

Adding algorithmic numbering systems would add quite a bit of required complexity to the implementation side (even though ICU supports it, rule-based number formatting is a substantial bit of code that non-ICU implementors would need to start carrying).

Is it legal per spec for an implementation to support say -u-nu-hebr or others? I say so because the spec does point to the full CLDR bcp47 list including all systems.

sffc commented 1 year ago

ECMA-402 currently assumes/requires that numbering systems be decimal because of the way it handles grouping and decimal separators. A lighter-weight proposal could be to rewrite the PartitionNotationSubPattern AO to be more relaxed about this, so that algorithmic numberingSystem options could be allowed.

https://tc39.es/ecma402/#sec-partitionnotationsubpattern
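The decimal assumption is visible in the output of `formatToParts`, which splits a number into integer, group, decimal, and fraction parts. The snippet below only illustrates that observable output; it is not the spec algorithm itself:

```javascript
// Every part type here presupposes place-value digits and separators.
const parts = new Intl.NumberFormat("en-US").formatToParts(1234.5);
const partTypes = parts.map((part) => part.type);

console.log(partTypes.join(","));
// integer,group,integer,decimal,fraction
```

An algorithmic rendering such as 二十一 has no grouping separator or decimal point to slot into these parts.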

However, the implementation requirement to include RBNF would need to be discussed. I'm not sure we can get buy-in from implementers to require RBNF until everyone supports it (e.g. ICU4X), and likewise I don't know if we can get buy-in to merely allow it due to web compatibility concerns if some browsers ship it but not others.

srl295 commented 1 year ago

@sffc thanks for clarifying, even if disappointing.

I will note that this is not a great reason to require decimal, especially since grouping and decimal separators aren't needed for, say, date and year formatting, which is a high-runner use case here. The rewrite makes some sense.

I'd also note that for broader implementation, it might be good for CLDR to first implement the spec change in https://unicode-org.atlassian.net/browse/CLDR-10884

tc39 / ecma402

Support algorithmic (RBNF) numbering systems: native, traditio, finance #95