globalizejs / globalize

A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data
https://globalizejs.com
MIT License

Access to listPatterns? #602

Open Strate opened 8 years ago

Strate commented 8 years ago

Is there any way to access listPatterns data from globalize? If yes, how? If not, what would the expected API look like? I'm thinking of something like:

listPatternFormatter(type: "standard" | "unit" | "unit-narrow" | "unit-short") => (items: Array<string>) => string

rxaviers commented 8 years ago

Hi @Strate,

The direct answer to your question is yes: you can access this data through our cldr-data proxy, Globalize.cldr or globalize.cldr (where globalize is a Globalize instance). For example, use .cldr.main('listPatterns/listPattern-type-standard/start'), assuming you have loaded the appropriate JSON file, main/<bundle>/listPatterns.json.
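For readers unfamiliar with the data layout, here is a rough sketch of what such a path lookup resolves to. The object imitates the shape of main/en/listPatterns.json, but the pattern strings are written from memory and should be treated as illustrative. `lookupMain` is a hypothetical stand-in for `.cldr.main(...)`, not Globalize or cldr-js code.

```javascript
// Illustrative excerpt shaped like main/en/listPatterns.json.
// The pattern strings are assumptions, not copied from a CLDR release.
const cldrData = {
  main: {
    en: {
      listPatterns: {
        "listPattern-type-standard": {
          start: "{0}, {1}",
          middle: "{0}, {1}",
          end: "{0}, and {1}",
          "2": "{0} and {1}"
        }
      }
    }
  }
};

// Hypothetical stand-in for `.cldr.main(path)`: walks a slash-separated path
// below main/<locale> and returns undefined if any segment is missing.
function lookupMain(data, locale, path) {
  return path.split("/").reduce(
    (node, key) => (node == null ? undefined : node[key]),
    data.main[locale]
  );
}

const startPattern = lookupMain(
  cldrData,
  "en",
  "listPatterns/listPattern-type-standard/start"
);
console.log(startPattern); // "{0}, {1}"
```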

Having said that, I assume you want to use that data to properly format lists (e.g., "Monday, Tuesday, Friday, and Saturday"), and Globalize doesn't provide such a list formatter at the moment. I believe the proper solution is to implement one. Would you be interested in doing so?

The first step is to define the API and a good baseline is: https://github.com/tc39/ecma402/issues/33

Strate commented 8 years ago

Hi @rxaviers,

Accessing Globalize.cldr is not the answer, because I want a formatter that can at least be compiled with globalize-compiler, since I do not have access to CLDR data at runtime.

I think I'm ready to start working on a PR.

I reviewed tc39/ecma402#33 and I am confused. Take a look at http://www.unicode.org/cldr/charts/28/summary/ru.html#7080: it lists regular, duration, short duration, and narrow duration list patterns. But https://github.com/unicode-cldr/cldr-misc-full/blob/master/main/ru/listPatterns.json#L12 has standard, unit, unit-short, and unit-narrow patterns. Which set should be exposed in the API? I really don't know how to choose here.

The usage algorithm is pretty clear and well described here: http://cldr.unicode.org/development/development-process/design-proposals/list-formatting, so it should not be hard to implement.

rxaviers commented 8 years ago

> Accessing to Globalize.cldr is not the answer

Exactly. That's why I answered your question (please re-read it) and also added my assumption that you would want to use the data in a formatter, which turns out to be correct. Please correct me if I have misunderstood anything.

> I think I'm ready to start work on PR.

Excellent. I'm replying to your other questions below...

> Usage algorithm is pretty clear and well described here: http://cldr.unicode.org/development/development-process/design-proposals/list-formatting, so, it should not be hard to implement.

Yep, great reference. Also consider this up-to-date CLDR reference: http://www.unicode.org/reports/tr35/tr35-general.html#ListPatterns.
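As a rough illustration of the algorithm those references describe, here is a minimal sketch. Two items use the "2" pattern. Longer lists are built from the back: "end" joins the last two items, "middle" prepends each remaining item except the first, and "start" prepends the first. `applyPattern` and `formatList` are hypothetical names, and the English pattern strings are assumptions for illustration.

```javascript
// Illustrative English "standard" patterns (assumed values).
const standard = {
  start: "{0}, {1}",
  middle: "{0}, {1}",
  end: "{0}, and {1}",
  "2": "{0} and {1}"
};

// Substitute {0} and {1} in a single pass, so placeholder-like text
// inside an item is never substituted again.
function applyPattern(pattern, first, second) {
  return pattern.replace(/\{([01])\}/g, (_, i) => (i === "0" ? first : second));
}

function formatList(patterns, items) {
  if (items.length === 0) return "";
  if (items.length === 1) return items[0];
  if (items.length === 2) return applyPattern(patterns["2"], items[0], items[1]);

  const last = items.length - 1;
  // "end" joins the final two items.
  let result = applyPattern(patterns.end, items[last - 1], items[last]);
  // "middle" prepends the remaining items, working towards the front.
  for (let i = last - 2; i >= 1; i--) {
    result = applyPattern(patterns.middle, items[i], result);
  }
  // "start" prepends the very first item.
  return applyPattern(patterns.start, items[0], result);
}

console.log(formatList(standard, ["Monday", "Tuesday", "Friday", "Saturday"]));
// "Monday, Tuesday, Friday, and Saturday"
```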

> I reviewed the tc39/ecma402#33 and I am confused. Take a look to http://www.unicode.org/cldr/charts/28/summary/ru.html#7080, there is regular, duration, short duration, and narrow duration list patterns. But at https://github.com/unicode-cldr/cldr-misc-full/blob/master/main/ru/listPatterns.json#L12 there is standard, unit, unit-short, and unit-narrow patterns. Which one should be exposed to api? I am really don't know how to make a choice here.

I believe you can ignore any duration-specific modifier for now. CLDR 24 added this extra data for formatting durations. Note that the duration formatter re-uses this list formatter.

"In addition, under <listPatterns>, three new types of <listPattern> may be provided: "unit" (for use with long units), "unit-short", and "unit-narrow". These are intended to be used for constructing formats such as “2 hours, 5 minutes”, “2 hrs, 5 mins”, or “2h 5m”. If separate data for these are not provided, the standard will be used. [#5997]." (full changelog)
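The fallback described in that changelog entry could be sketched as below. `resolvePatterns` is a hypothetical helper, the data is an illustrative subset with assumed pattern strings, and this thread does not establish whether the published JSON ever actually omits the unit types.

```javascript
// Illustrative listPatterns subset in which only "standard" and "unit-short"
// are present (pattern strings are assumptions).
const listPatterns = {
  "listPattern-type-standard": {
    start: "{0}, {1}", middle: "{0}, {1}", end: "{0}, and {1}", "2": "{0} and {1}"
  },
  "listPattern-type-unit-short": {
    start: "{0}, {1}", middle: "{0}, {1}", end: "{0}, {1}", "2": "{0}, {1}"
  }
};

// Hypothetical helper: use the requested type if it has data,
// otherwise fall back to the standard patterns.
function resolvePatterns(data, type) {
  return data["listPattern-type-" + type] || data["listPattern-type-standard"];
}

const unitShort = resolvePatterns(listPatterns, "unit-short"); // own data
const unitNarrow = resolvePatterns(listPatterns, "unit-narrow"); // falls back
console.log(unitNarrow === listPatterns["listPattern-type-standard"]); // true
```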

Please let me know if you have any questions.

Strate commented 8 years ago

One last question. The new function will have this signature:

listPatternFormatter(type: "standard" | "unit" | "unit-narrow" | "unit-short") => (items: Array<string>) => string

Is it ok?
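A minimal sketch of a function with that shape follows, assuming the listPatterns data is passed in explicitly. `createListPatternFormatter` is a hypothetical wrapper, the pattern strings are illustrative, and none of this is existing Globalize code.

```javascript
const LIST_TYPES = ["standard", "unit", "unit-narrow", "unit-short"];

// Hypothetical factory: binds listPatterns data and returns a function
// with the proposed signature, type => items => string.
function createListPatternFormatter(listPatterns) {
  return function listPatternFormatter(type = "standard") {
    if (!LIST_TYPES.includes(type)) {
      throw new TypeError("Unknown list pattern type: " + type);
    }
    const p = listPatterns["listPattern-type-" + type];
    if (!p) {
      throw new Error("No list pattern data loaded for type: " + type);
    }
    const join = (pattern, a, b) =>
      pattern.replace(/\{([01])\}/g, (_, i) => (i === "0" ? a : b));

    // The returned formatter closes over four pattern strings only.
    return function format(items) {
      if (items.length === 0) return "";
      if (items.length === 1) return items[0];
      if (items.length === 2) return join(p["2"], items[0], items[1]);
      const last = items.length - 1;
      let out = join(p.end, items[last - 1], items[last]);
      for (let i = last - 2; i > 0; i--) out = join(p.middle, items[i], out);
      return join(p.start, items[0], out);
    };
  };
}

// Illustrative data (assumed pattern strings).
const listPatternFormatter = createListPatternFormatter({
  "listPattern-type-standard": {
    start: "{0}, {1}", middle: "{0}, {1}", end: "{0}, and {1}", "2": "{0} and {1}"
  },
  "listPattern-type-unit-narrow": {
    start: "{0} {1}", middle: "{0} {1}", end: "{0} {1}", "2": "{0} {1}"
  }
});

console.log(listPatternFormatter("unit-narrow")(["2h", "5m"])); // "2h 5m"
```

Because the pattern lookup happens once, when the formatter is created, the returned function no longer needs the CLDR data, which is the property that matters for precompilation.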

rxaviers commented 8 years ago

LGTM :+1:, feel free to submit a PR with work in progress implementation for faster feedback and feel free to ask further questions if you have any.

PS: I considered proposing a change to your proposal that splits type (standard|unit) from form (|narrow|short), as shown below. But after reading UTS#35 more carefully and looking at the data itself, I think your suggestion above is great, because that distinction isn't clear in the current CLDR data. So, yep, having type only seems fine to me.

listPatternFormatter({type: "standard" | "unit", form: undefined | "narrow" | "short" })
   => (items: Array<string>)
   => string
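
To make the comparison concrete, here is a hypothetical sketch of how the split type/form options would map onto the four CLDR keys discussed in this thread. `listPatternKey` is an illustrative name, not a Globalize function. Rejecting a form on the standard type is an assumption based only on the four types listed above.

```javascript
// Hypothetical mapping from { type, form } to a CLDR listPattern key.
function listPatternKey({ type = "standard", form } = {}) {
  if (type !== "standard" && type !== "unit") {
    throw new TypeError("Unknown type: " + type);
  }
  if (form !== undefined && form !== "narrow" && form !== "short") {
    throw new TypeError("Unknown form: " + form);
  }
  // Assumption: of the four types in this thread, only "unit" has forms.
  if (type === "standard" && form !== undefined) {
    throw new RangeError("No '" + form + "' form for the standard type");
  }
  return "listPattern-type-" + type + (form ? "-" + form : "");
}

console.log(listPatternKey({ type: "unit", form: "short" }));
// "listPattern-type-unit-short"
```

The rejected combination shows why the split is awkward: half of the type/form grid has no matching key, whereas a single type string covers exactly the keys that exist.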

SlexAxton commented 6 years ago

I implemented a small listPattern formatter, happy to adjust it to work well as a dependency for globalize as well:

https://www.npmjs.com/package/cldr-listpattern

rxaviers commented 6 years ago

Hi @SlexAxton, this is great. I believe globalize could depend on it, except for the precompiled version. I'm trying to figure out how globalize compiler could support it too.

One thing (at least) that the compiler should do is slice the CLDR data down to the minimum needed for each usage. It currently does that for other functionality (e.g., date, number) by listening to cldr-js get events, since cldr-js is used under the hood in these modules to pick CLDR data. From those events the compiler can reconstruct the minimum data slice. If cldr-listpattern used cldr-js too, we would be good, although I realize that a dependency on cldr-js could be overkill for simply accessing object properties.
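
The general idea of reconstructing a minimum slice from recorded lookups can be sketched as follows. This is not cldr-js or globalize compiler code: `createTrackedLookup` is a hypothetical helper that records every path it is asked for and rebuilds an object containing only those paths, and the data is illustrative.

```javascript
// Resolve a slash-separated path against a plain object.
function resolve(data, path) {
  return path.split("/").reduce(
    (node, key) => (node == null ? undefined : node[key]),
    data
  );
}

// Hypothetical tracker: records every path read through get() and can
// rebuild an object that contains only the data that was read.
function createTrackedLookup(data) {
  const accessed = new Set();
  return {
    get(path) {
      accessed.add(path);
      return resolve(data, path);
    },
    slice() {
      const out = {};
      for (const path of accessed) {
        const value = resolve(data, path);
        if (value === undefined) continue; // nothing to keep for misses
        const keys = path.split("/");
        let node = out;
        for (const key of keys.slice(0, -1)) {
          if (!node[key]) node[key] = {};
          node = node[key];
        }
        node[keys[keys.length - 1]] = value;
      }
      return out;
    }
  };
}

// Illustrative data with two types (assumed pattern strings).
const tracked = createTrackedLookup({
  listPatterns: {
    "listPattern-type-standard": { end: "{0}, and {1}", "2": "{0} and {1}" },
    "listPattern-type-unit": { end: "{0}, {1}", "2": "{0}, {1}" }
  }
});

tracked.get("listPatterns/listPattern-type-standard/2");
console.log(JSON.stringify(tracked.slice()));
// {"listPatterns":{"listPattern-type-standard":{"2":"{0} and {1}"}}}
```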