tc39 / ecma402

Status, process, and documents for ECMA 402
https://tc39.es/ecma402/

Intl.NumberFormat missing petabit #755

Open markov00 opened 1 year ago

markov00 commented 1 year ago

I noticed that the petabit unit is missing. Petabyte is present, but petabit is not.

console.log(new Intl.NumberFormat('en', { notation: "compact", compactDisplay: "long", style: 'unit', unit: 'petabit' }).format(1));
// Throws RangeError: Invalid unit argument for Intl.NumberFormat() 'petabit'

console.log(new Intl.NumberFormat('en', { notation: "compact" , compactDisplay: "long", style: 'unit', unit: 'petabyte' }).format(1)); 
// 1 PB

Looking at https://github.com/unicode-org/cldr/blob/main/common/validity/unit.xml, the unit is missing there too. There also appears to be a workaround (not implemented in Intl.NumberFormat) that uses just the SI prefixes: https://unicode-org.atlassian.net/browse/CLDR-13167

What can be done to add this unit? Can it be added directly in Intl.NumberFormat, or should the change come from Unicode/CLDR?

ryzokuken commented 1 year ago

@markov00 a bit of both, really. On the Intl side, all "sanctioned" unit identifiers are listed in this table: https://tc39.es/ecma402/#table-sanctioned-single-unit-identifiers. That said, even if we add the unit to this table, implementers would depend on native Intl libraries like ICU to actually implement it. CLDR support is also essential for any new addition to ECMA-402. Therefore, I'd suggest we first address the CLDR issue, and perhaps even wait for the ICU resolution, before proposing this in ECMA-402.
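For readers who want to check at runtime whether a given engine accepts a unit, here is a minimal sketch. It relies only on the fact that the `Intl.NumberFormat` constructor throws a `RangeError` for an unsupported unit identifier, as in the error reported above. The helper name `isUnitSupported` is hypothetical and not part of Intl.

```javascript
// Hypothetical helper (not part of Intl): reports whether this engine
// accepts `unit` when constructing a unit-style Intl.NumberFormat.
function isUnitSupported(unit) {
  try {
    // The constructor validates the unit identifier eagerly.
    new Intl.NumberFormat('en', { style: 'unit', unit });
    return true;
  } catch (e) {
    // Unsupported or malformed identifiers are rejected with a RangeError.
    if (e instanceof RangeError) return false;
    throw e; // Anything else is unexpected, so surface it.
  }
}

console.log(isUnitSupported('petabyte'));
console.log(isUnitSupported('petabit'));
```

Engines that implement `Intl.supportedValuesOf('unit')` also expose the list of supported simple unit identifiers directly.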

sffc commented 1 year ago

It may or may not be the case that petabit is supported in CLDR, but we are more constrained in adding units to Intl.NumberFormat. Please read https://github.com/tc39/proposal-unified-intl-numberformat/issues/39 for why we don't just include all units.

I created a label "new unit" for all the issues filed here for new units, since this is a common request.

srl295 commented 1 year ago

Can you comment on the CLDR ticket as to whether it's needed?

markov00 commented 1 year ago

Thanks, everybody, for the comments. I'm not sure I have fully understood why not all units are present: it looks like a data size problem that ultimately comes down to browser size. Have I understood that correctly? I don't know the intricate dynamics between the various browser vendors and standards organizations, so pardon me if I'm saying something silly, obvious, or nonsensical. I'm trying to understand why an additional payload of 1713 KiB (if I understood it correctly) in the browser leads to such an opinionated selection of units. On a Mac the latest Chrome download is ~209MB and Firefox is 126MB, so a few MB feels negligible.

This is particularly true in the following context: the browser download is a one-time operation, whereas downloading a JS library to handle number formatting in various languages on each website/app that needs it looks far more costly in terms of energy, bandwidth, etc. My feeling about the Intl project was that it would eventually replace the need for general-purpose formatting/utility libraries, reducing the need to jump between unmaintained Date/Time/Number projects and providing a well-defined, standard, and consistent way to handle these utilities. The size limitation looks like a significant obstacle to that goal.

I'm also curious about how choices are made for the "sanctioned" list. From the various comments, it seems that the most common units were chosen, but I doubt that mile-scandinavian is that common. In the spreadsheet it is also marked as informal and common, and only in SE.

I'm also wondering why full units like kilogram, megabit, and milliliter were adopted instead of, as suggested in the CLDR issue, a strategy with SI prefixes like kilo, mega, and milli that are connected to units like gram, bit, and liter. Do you think this approach could help with the data size problem?
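To make the prefix idea concrete, here is a minimal sketch of splitting a full unit name into an SI prefix and a base unit. It is only an illustration: the `SI_PREFIXES` table and the `splitSiPrefix` helper are hypothetical, cover just a handful of prefixes, and do not reflect how CLDR, ICU, or Intl.NumberFormat handle unit identifiers.

```javascript
// Hypothetical prefix table for illustration only (not from CLDR or ECMA-402).
const SI_PREFIXES = {
  kilo: 1e3,
  mega: 1e6,
  giga: 1e9,
  tera: 1e12,
  peta: 1e15,
  milli: 1e-3,
};

// Hypothetical helper: splits e.g. 'petabit' into prefix 'peta' and base 'bit'.
// Units without a known prefix are returned unchanged with a factor of 1.
function splitSiPrefix(unit) {
  for (const [prefix, factor] of Object.entries(SI_PREFIXES)) {
    // Require a non-empty base so a bare prefix is not treated as a unit.
    if (unit.startsWith(prefix) && unit.length > prefix.length) {
      return { prefix, base: unit.slice(prefix.length), factor };
    }
  }
  return { prefix: '', base: unit, factor: 1 };
}

console.log(splitSiPrefix('petabit'));
console.log(splitSiPrefix('milliliter'));
console.log(splitSiPrefix('mile'));
```

With a scheme like this, localized data would be needed once per prefix and once per base unit rather than once per combination, which is the data size argument behind the question above.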

ryzokuken commented 1 year ago

The discussion about the CLDR data sizes and optimization strategies, although useful and one that affects a couple of decisions from time to time, is ultimately an implementation detail and I'm not entirely sure if we can (or should) do anything on the spec side to affect that.

sffc commented 1 year ago

I think it's likely that SI prefixes could be added.

In the standards body, we don't set policies on what payload size is justifiable by browser implementations. You can see that the original units proposal was to include all CLDR units, but we backpedaled due to implementer feedback. So the right forum to have that discussion is with them, not with the spec authors.

We do include a few low-usage units, such as stones for person weight and mile-scandinavian for road distance, because we decided that instead of picking individual units based on individual popularity, we would pick them based on use cases. So, for example, Intl supports formatting of road distance in all locations. "i18n correctness" is always our first priority.
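As a small illustration of the use-case argument, the sketch below formats a person's weight and a road distance with two of the low-usage sanctioned units. The locales and values are arbitrary choices for the example, and the exact output strings depend on the engine's locale data, so none are shown here.

```javascript
// 'stone' and 'mile-scandinavian' are both sanctioned unit identifiers,
// so these constructors are expected to succeed.
const weight = new Intl.NumberFormat('en-GB', {
  style: 'unit',
  unit: 'stone',
  unitDisplay: 'long',
});

const distance = new Intl.NumberFormat('sv', {
  style: 'unit',
  unit: 'mile-scandinavian',
  unitDisplay: 'long',
});

// Output wording depends on the locale data shipped with the engine.
console.log(weight.format(11));
console.log(distance.format(3));
```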