twitter / twitter-cldr-rb

Ruby implementation of the ICU (International Components for Unicode) that uses the Common Locale Data Repository to format dates, plurals, and more.
Apache License 2.0
672 stars 93 forks source link

Incorrect ListFormatter behaviour #241

Closed valodzka closed 3 years ago

valodzka commented 3 years ago

Example:

TwitterCldr::Formatters::ListFormatter.new(:pt).format([1,2,3], :unit) # => ""
camertron commented 3 years ago

Hey @valodzka, this should work as of v6.6.1.

valodzka commented 3 years ago

@camertron Hmm, a you sure? May be you mean 6.6.2? Just tested again:

2.7.2 :001 > TwitterCldr::Formatters::ListFormatter.new(:pt).format([1,2,3], :unit)
 => ""
2.7.2 :002 > TwitterCldr::VERSION
 => "6.6.1"
camertron commented 3 years ago

Woops, sorry @valodzka! I completely forgot to pass :unit when I was testing this.

The change I made in v6.6.1 was to properly merge data from the locale's ancestor chain, but It looks like the data is potentially incomplete in CLDR. I do see inheritance markers in the data, but I think they're getting ignored because pt inherits only from root, which contains an alias for the unit type. Sadly the newly released v39 also suffers from this problem. I'll try to seek out more information and come up with a fix.

valodzka commented 3 years ago

I don't know how it handles this issue but java version that uses came CLDR handler this case correctly:

System.out.println(ListFormatter.getInstance(Locale.forLanguageTag("pt"), ListFormatter.Type.UNITS, ListFormatter.Width.WIDE).format("1", "2", "3"));
=> 1, 2 e 3
camertron commented 3 years ago

I believe ICU uses raw CLDR data, i.e. not the processed release version you can download from the CLDR website.

I think my fix is going to assume all missing values should be inherited. All the data appears to point to doing so as the author's intention anyway.

camertron commented 3 years ago

Ok, should be fixed in v6.6.2.