python-babel / babel

The official repository for Babel, the Python Internationalization Library
http://babel.pocoo.org/
BSD 3-Clause "New" or "Revised" License

Format checking too strict for named parameters in plural translations #661

Open gflohr opened 5 years ago

gflohr commented 5 years ago

Take the following PO file:

msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2019-07-27 09:17+0300\n"
"PO-Revision-Date: 2019-07-27 09:18+0300\n"
"Last-Translator: Yours Truly <me@example.com>\n"
"Language: zh_Hans_CN\n"
"Language-Team: zh_Hans_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.7.0\n"

#: example.py:3
#, python-format
msgid "one file"
msgid_plural "%(num)d files"
msgstr[0] "%(num)d 个档案"

This file compiles without error with msgfmt --check, but pybabel compile complains: "error: locale/zh_CN/LC_MESSAGES/messages.po:23: unknown named placeholder u'num'".

Is this intentional? Omitting the placeholder from the singular msgid is very common when the base language is English, because the string "1 file" looks awkward compared to "one file".

gflohr commented 5 years ago

See also https://stackoverflow.com/questions/56735601/translation-between-languages-with-different-numbers-of-plurals

miluchen commented 4 years ago

Although pybabel compile reports an error, the .mo file it generates is the same as the one msgfmt generates. Can you confirm whether the generated .mo file works okay?

msgfmt does not have a "one-to-one" mapping between msgid and msgstr, while pybabel compile does. It's validating "one file" against "%(num)d 个档案" and skipping the check for "%(num)d files". See https://github.com/python-babel/babel/blob/2abad80c9f2fc70bde71f53ab8270ca950100595/babel/messages/checkers.py#L57-L59
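To illustrate the pairing described above, here is a minimal standalone sketch. It is not Babel's actual code: the names NAMED_PLACEHOLDER, named_placeholders and strict_check are made up for this example, and the regex only recognises simple %(name) placeholders. It assumes the check pairs msgids and msgstrs by position, so that with nplurals=1 only the singular msgid is ever consulted.

```python
import re

# Hypothetical helper: matches named placeholders such as %(num)d
# and captures the name. Simplified; it ignores escapes like %%.
NAMED_PLACEHOLDER = re.compile(r"%\((\w+)\)")


def named_placeholders(text):
    """Return the set of placeholder names used in a python-format string."""
    return set(NAMED_PLACEHOLDER.findall(text))


def strict_check(msgids, msgstrs):
    """Pair msgids with msgstrs by position and report unknown names.

    zip() stops at the shorter sequence, so when there is only one
    msgstr, the plural msgid is never looked at.
    """
    errors = []
    for msgid, msgstr in zip(msgids, msgstrs):
        unknown = named_placeholders(msgstr) - named_placeholders(msgid)
        for name in sorted(unknown):
            errors.append(f"unknown named placeholder {name!r}")
    return errors


# The message from this issue: singular without placeholder, one msgstr.
print(strict_check(("one file", "%(num)d files"), ("%(num)d 个档案",)))
```

With the singular rewritten as "%(num)d file", the same call returns an empty list, which is the workaround this kind of check pushes developers toward.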

gflohr commented 4 years ago

msgfmt does not have a "one-to-one" mapping between msgid and msgstr, while pybabel compile does. It's validating "one file" against "%(num)d 个档案" and skipping the check for "%(num)d files".

And this is exactly the problem. It forces developers to use the awkward and arguably incorrect form "%(num)d file" for the singular form instead of the correct "one file".

Omitting the placeholder and writing out the number "one" in the singular form produces correct English and is common practice. We use it for instance in the sources of GNU gettext itself:

#: src/msgl-check.c:478
#, c-format
msgid "but some messages have only one plural form"
msgid_plural "but some messages have only %lu plural forms"
msgstr[0] "但是某些消息只有 %lu 种复数形式"

#: src/msgl-check.c:494
#, c-format
msgid "but some messages have one plural form"
msgid_plural "but some messages have %lu plural forms"
msgstr[0] "但是某些消息有 %lu 种复数形式"

In English it is merely a cosmetic problem if a program prints "... only 1 plural form", but in German, for example, "... nur 1 Pluralform" is ungrammatical.
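One possible relaxation, sketched here as an assumption rather than as what msgfmt or Babel actually implements, is to accept in every msgstr any named placeholder that appears in either msgid or msgid_plural. The function name lenient_check and the helper names are hypothetical, and the regex only handles simple %(name) placeholders.

```python
import re

# Hypothetical helper: captures the name in placeholders like %(num)d.
NAMED_PLACEHOLDER = re.compile(r"%\((\w+)\)")


def named_placeholders(text):
    """Return the set of placeholder names used in a python-format string."""
    return set(NAMED_PLACEHOLDER.findall(text))


def lenient_check(msgid, msgid_plural, msgstrs):
    """Allow any placeholder that occurs in the singular or plural source."""
    allowed = named_placeholders(msgid) | named_placeholders(msgid_plural)
    errors = []
    for index, msgstr in enumerate(msgstrs):
        # Only names found in neither source string count as errors.
        for name in sorted(named_placeholders(msgstr) - allowed):
            errors.append(
                f"msgstr[{index}]: unknown named placeholder {name!r}"
            )
    return errors


# The message from this issue passes, while a genuinely wrong name
# (here the made-up "count") is still reported.
print(lenient_check("one file", "%(num)d files", ["%(num)d 个档案"]))
print(lenient_check("one file", "%(num)d files", ["%(count)d 个档案"]))
```

Under this rule "one file" can stay as the singular msgid, and a typo in a placeholder name is still caught.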