The `cldr28` branch has been pushed:
https://github.com/patch/cldr-number-pm5/tree/cldr28
https://github.com/patch/cldr-number-pm5/compare/cldr28
Here is the failing test output:
```
t/00-load.t ............ 1/1 # CLDR::Number v0.12, Moo v2.000002, Perl v5.16.2 (/usr/bin/perl)
t/00-load.t ............ ok
t/currency.t ........... ok
t/format.t ............. ok
t/from-icu4c.t ......... ok
t/from-shutterstock.t .. 1/59
#   Failed test '1000 CHF in en-CH'
#   at t/from-shutterstock.t line 17.
#          got: 'CHF 1.000,00'
#     expected: 'CHF 1,000.00'
#   Failed test '1000 DKK in en-DK'
#   at t/from-shutterstock.t line 17.
#          got: '1.000,00 kr.'
#     expected: 'DKK 1,000.00'
#   Failed test '1000 EUR in de-AT'
#   at t/from-shutterstock.t line 17.
#          got: '€ 1 000,00'
#     expected: '€ 1.000,00'
#   Failed test '1000 EUR in en-AT'
#   at t/from-shutterstock.t line 17.
#          got: '€ 1.000,00'
#     expected: '€1,000.00'
#   Failed test '1000 EUR in en-DE'
#   at t/from-shutterstock.t line 17.
#          got: '1.000,00 €'
#     expected: '€1,000.00'
#   Failed test '1000 EUR in en-NL'
#   at t/from-shutterstock.t line 17.
#          got: '€ 1.000,00'
#     expected: '€1,000.00'
#   Failed test '1000 SEK in en-SE'
#   at t/from-shutterstock.t line 17.
#          got: '1 000,00 kr'
#     expected: 'SEK 1,000.00'
#   Failed test '1000 USD in zh-CN'
#   at t/from-shutterstock.t line 17.
#          got: 'US$1,000.00'
#     expected: 'US$ 1,000.00'
# Looks like you failed 8 tests of 59.
t/from-shutterstock.t .. Dubious, test returned 8 (wstat 2048, 0x800)
Failed 8/59 subtests
t/from-twittercldr.t ... 1/22
#   Failed test 'use the currency symbol for the corresponding currency code'
#   at t/from-twittercldr.t line 44.
#          got: 'THB 12.00'
#     expected: '฿12.00'
# Looks like you failed 1 test of 22.
t/from-twittercldr.t ... Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/22 subtests
t/from-uts35.t ......... ok
t/inf-nan.t ............ ok
t/inheritance.t ........ 1/15
#   Failed test 'currency sign inherited from en-001'
#   at t/inheritance.t line 38.
#          got: 'JPY'
#     expected: 'JP¥'
#   Failed test 'locale inheritance'
#   at t/inheritance.t line 41.
# +----+------------+----+-----------------+
# | Elt|Got         | Elt|Expected         |
# +----+------------+----+-----------------+
# |   0|[           |   0|[                |
# *   1|  'ms-SG',  *   1|  'ms-Latn-SG',  *
# |    |            *   2|  'ms-Latn',     *
# |   2|  'ms',     |   3|  'ms',          |
# |   3|  'root'    |   4|  'root'         |
# |   4|]           |   5|]                |
# +----+------------+----+-----------------+
# Looks like you failed 2 tests of 15.
t/inheritance.t ........ Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/15 subtests
t/locales.t ............ ok
t/minmax-digits.t ...... ok
t/numbering-system.t ... ok
t/objects.t ............ ok
t/pattern-coerce.t ..... ok
t/pattern-trigger.t .... ok
t/quoting.t ............ ok
t/rounding.t ........... ok

Test Summary Report
-------------------
t/from-shutterstock.t (Wstat: 2048 Tests: 59 Failed: 8)
  Failed tests: 7, 11, 13, 16, 18, 22, 41, 59
  Non-zero exit status: 8
t/from-twittercldr.t (Wstat: 256 Tests: 22 Failed: 1)
  Failed test: 17
  Non-zero exit status: 1
t/inheritance.t (Wstat: 512 Tests: 15 Failed: 2)
  Failed tests: 9-10
  Non-zero exit status: 2
Files=17, Tests=498, 3 wallclock secs ( 0.13 usr 0.05 sys + 2.52 cusr 0.30 csys = 3.00 CPU)
Result: FAIL
```
No surprises here. Most of these are because of the addition of new English locales to support Europe. For example, we added en_DK in this release, so you would certainly expect to see "1.000,00 kr." for the local currency in that locale, just as you would in Danish.
Thanks for the feedback, John!
Other than updating the unit tests, I had to make one code change to support the v28 data. Although CLDR::Number was handling single quotes for literal sequences in `decimal`, `percent`, and `currency` patterns, it was not doing so for `atLeast` or `range` patterns. The new data just introduced single quotes in the `range` patterns for the `es_CO` and `es_GT` locales, initially producing formatted ranges like `'de' 1 'a' 5` and `1 'al' 5` before the fix.
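To make the behavior concrete, here is a minimal sketch of the kind of unquoting involved. It is written in Python purely for illustration; CLDR::Number is Perl and its actual fix is not shown in this thread. The function name `format_misc_pattern` is hypothetical, and the sketch assumes the quoting rules from the number format patterns (a quoted run is literal text, two single quotes produce one literal quote) also apply to `range` and `atLeast` patterns.

```python
import re

# One token is: an escaped quote (''), a quoted literal ('...'),
# a placeholder ({0}, {1}, ...), or any other single character.
_TOKEN = re.compile(r"''|'((?:[^']|'')*)'|\{(\d+)\}|(.)", re.DOTALL)


def format_misc_pattern(pattern, *args):
    """Fill {n} placeholders and strip quoting from a miscellaneous pattern.

    Assumption: quoting follows the number format pattern rules.
    An unmatched single quote is passed through unchanged.
    """
    out = []
    for match in _TOKEN.finditer(pattern):
        quoted, index, char = match.groups()
        if match.group(0) == "''":
            out.append("'")                        # escaped literal quote
        elif quoted is not None:
            out.append(quoted.replace("''", "'"))  # quoted literal text
        elif index is not None:
            out.append(str(args[int(index)]))      # placeholder substitution
        else:
            out.append(char)                       # ordinary character
    return "".join(out)


print(format_misc_pattern("'de' {0} 'a' {1}", 1, 5))  # de 1 a 5
print(format_misc_pattern("{0} 'al' {1}", 1, 5))      # 1 al 5
```

Patterns without quotes, such as `{0} eller derover`, pass through this sketch unchanged apart from the placeholder substitution.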
UTS #35 is not clear about whether single quotes are supported in Part 3: 2.5 Miscellaneous Patterns, or whether any of the rules introduced later in Part 3: 3 Number Format Patterns also apply to those patterns.
Note that there are other `range` and `atLeast` patterns that contain literal words without quoting them:
locale | type | pattern |
---|---|---|
da | atLeast | {0} eller derover |
es | atLeast | Más de {0} |
fa | range | {0} تا {1} |
fi | atLeast | vähintään {0} |
fo | atLeast | {0} ella meira |
fr | atLeast | au moins {0} |
ja | atLeast | {0} 以上 |
lv | atLeast | vismaz {0} |
smn | atLeast | ucemustáá {0} |
Here are the new ones in question:
locale | type | pattern |
---|---|---|
es_CO | range | 'de' {0} 'a' {1} |
es_GT | range | {0} 'al' {1} |
Even if the quotes are officially supported in `range` and `atLeast`, it seems like the official CLDR data should be consistent about their use. My guess is that some other CLDR-based libraries may run into this issue as well. I only noticed it from reviewing a diff of all the data changes.
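For anyone who wants to check their own copy of the data for the same inconsistency, a rough scan along these lines might help. This is only a sketch in Python, not part of CLDR::Number: the name `literal_style` is hypothetical, the sample patterns are copied from the tables above, and loading the real CLDR JSON is left out.

```python
import re

QUOTED = re.compile(r"'[^']+'")       # a quoted literal such as 'de'
PLACEHOLDER = re.compile(r"\{\d+\}")  # {0}, {1}, ...


def literal_style(pattern):
    """Classify how a pattern writes its literal words.

    Returns "quoted", "unquoted", "mixed", or "none".
    """
    has_quoted = bool(QUOTED.search(pattern))
    # Remove quoted runs and placeholders, then look for leftover letters.
    remainder = PLACEHOLDER.sub("", QUOTED.sub("", pattern))
    has_unquoted = any(ch.isalpha() for ch in remainder)
    if has_quoted and has_unquoted:
        return "mixed"
    if has_quoted:
        return "quoted"
    if has_unquoted:
        return "unquoted"
    return "none"


# Sample rows taken from the tables in this thread.
samples = {
    ("da", "atLeast"): "{0} eller derover",
    ("es", "atLeast"): "Más de {0}",
    ("es_CO", "range"): "'de' {0} 'a' {1}",
    ("es_GT", "range"): "{0} 'al' {1}",
}

for (locale, kind), pattern in samples.items():
    print(locale, kind, literal_style(pattern))
```

With these samples, the two new Spanish regional patterns are the only ones classified as `quoted`, which is the inconsistency described above.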
I would agree that we need to be consistent about this and make sure it is documented accordingly. I would suggest that you file a CLDR ticket at http://unicode.org/cldr/trac/newticket. At a minimum, we should be able to get UTS #35 updated before we publish in a couple of weeks.
I filed a CLDR ticket for this issue: http://unicode.org/cldr/trac/ticket/8928
Either way, the `cldr28` branch of CLDR::Number now supports those quotes and I'm closing this issue here because I think we're ready for CLDR v28 when it's released. We just need to rerun `bin/generate-cldr-data.pl` and document any major changes (`en_150` inheritance, new numbering systems, new locales, etc.) in the Changes file.
For the record, here are the CLDR JSON files that we currently use for this project:
From @JCEmmons: