python-babel / babel

The official repository for Babel, the Python Internationalization Library
http://babel.pocoo.org/
BSD 3-Clause "New" or "Revised" License

babel.lists.format_list with style="unit" errors on non-English languages #1098

Closed Vexed01 closed 4 months ago

Vexed01 commented 4 months ago

Overview Description

babel.lists.format_list raises a KeyError for non-English locales when style is set to "unit".

Steps to Reproduce

Install the latest Babel (2.15.0), then run:

from babel.lists import format_list
print(format_list([1, 2, 3, 4], style="unit", locale="zh_CN"))

Actual Results

Traceback (most recent call last):
  File "<eval command - snippet #20>", line 4, in func
    print(format_list([1, 2, 3, 4], style="unit", locale="zh_CN"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/redenv/lib/python3.11/site-packages/babel/lists.py", line 89, in format_list
    result = patterns['start'].format(lst[0], lst[1])
             ~~~~~~~~^^^^^^^^^
  File "/home/ubuntu/redenv/lib/python3.11/site-packages/babel/localedata.py", line 234, in __getitem__
    orig = val = self._data[key]
                 ~~~~~~~~~~^^^^^
KeyError: 'start'

Expected Results

Something like 1, 2, 3, 4 but locale dependent of course!

Reproducibility

I can reproduce this in every non-English locale I've tried: es_ES, zh_CN, fr_FR. Various EN locales (en_GB, en_US, en_CN) produce the expected result 1, 2, 3, 4.

Additional Information

Similar to #781, but this error occurs with the supported style "unit" and has a wider language scope.

akx commented 4 months ago

The given locale doesn't have the data required to format these patterns.

zh only has unit-short patterns, not unit:

https://www.unicode.org/cldr/charts/45/summary/zh.html#28f7e602f689c61b

See the spec for a discussion of the various list types.
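To make the failure mode concrete, here is a minimal sketch of how CLDR-style list patterns compose and why an incomplete pattern set surfaces as a KeyError. The first composition step mirrors the line shown in the traceback; the rest of compose_list, and both pattern tables, are hand-written assumptions for illustration, not Babel's actual code or real CLDR data.

```python
# Hypothetical pattern tables, written by hand for illustration; not real CLDR data.
COMPLETE = {
    "2": "{0}, {1}",
    "start": "{0}, {1}",
    "middle": "{0}, {1}",
    "end": "{0}, {1}",
}
# A style for which the locale data defines only some of the keys.
PARTIAL = {"2": "{0} {1}"}


def compose_list(items, patterns):
    """Join items using '2' / 'start' / 'middle' / 'end' patterns."""
    items = [str(item) for item in items]
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return patterns["2"].format(*items)
    # Three or more items: fold left to right, starting with 'start'.
    result = patterns["start"].format(items[0], items[1])
    for item in items[2:-1]:
        result = patterns["middle"].format(result, item)
    return patterns["end"].format(result, items[-1])


print(compose_list([1, 2, 3, 4], COMPLETE))  # 1, 2, 3, 4
try:
    compose_list([1, 2, 3, 4], PARTIAL)
except KeyError as exc:
    print("missing pattern:", exc)  # missing pattern: 'start'
```

With a partial table, two-item lists still work, while anything longer fails at the 'start' lookup, which matches the shape of the reported traceback.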

akx commented 4 months ago

... on second thought, we should then be getting the ValueError for a missing pattern family...

akx commented 4 months ago

Ah yeah, this is a bona fide bug – it's actually #1076 :)
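Until a fixed release is available, one possible stopgap is to try several styles in order and use the first one that formats without raising. The sketch below is an assumption-laden workaround, not something proposed in this thread: format_with_fallback and fake_format_list are hypothetical names, and the stand-in formatter only imitates the reported KeyError so the example runs without Babel installed.

```python
def format_with_fallback(items, styles, formatter, **kwargs):
    """Try each style in order; return the first result that does not raise."""
    last_error = None
    for style in styles:
        try:
            return formatter(items, style=style, **kwargs)
        except (KeyError, ValueError) as exc:
            # KeyError: the bug reported here; ValueError: unsupported style.
            last_error = exc
    raise LookupError(f"no usable list style among {list(styles)!r}") from last_error


def fake_format_list(items, style="standard", locale="en"):
    # Hypothetical stand-in for babel.lists.format_list; fails for "unit"
    # the same way the report describes.
    if style == "unit":
        raise KeyError("start")
    return ", ".join(str(item) for item in items)


print(format_with_fallback([1, 2, 3, 4], ["unit", "standard"],
                           fake_format_list, locale="zh_CN"))  # 1, 2, 3, 4
```

In real use, babel.lists.format_list would be passed as the formatter, for example format_with_fallback([1, 2, 3, 4], ["unit", "unit-short", "standard"], format_list, locale="zh_CN"). Whether a given fallback style works for a given locale is not confirmed here, and a different style may produce different separators, so this only trades an exception for possibly less precise output.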