pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0
3.6k stars 963 forks

Support declension i18n marks in the source #6798

Open webknjaz opened 5 years ago

webknjaz commented 5 years ago

Bad source in Weblate

Some languages use declension depending on numerals and genders in sentences. So far, I've seen only one instance of a properly marked string (one that allows translators to enter multiple variants depending on the number) and a lot of unmarked ones.

AFAIU, PyPI doesn't hold any user gender info, which means that translators would have to work around this by using generic constructions (which will make the translation weirder than it could be, and probably longer, so it may not fit the UI). We cannot do anything about it right now; I just wanted to document it.

But the situation with numbers is better: we know which numbers are being rendered. So the solution is simple: find and mark all the strings with numbers in them, which would enable translators to enter custom variants depending on the digit ranges.
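As a rough illustration of what "marking" a string with a number means, here is a minimal Python sketch using the standard library's gettext.ngettext API. The function name expiry_message is hypothetical, and NullTranslations is only a stand-in for a real catalog (it applies the English singular/plural rule); the message text is the expiry string quoted later in this thread.

```python
import gettext

# Stand-in for a real translation catalog. NullTranslations has no
# translated forms, so it falls back to the English rule (n == 1 or not).
translations = gettext.NullTranslations()


def expiry_message(n_hours: int) -> str:
    """Return the expiry notice, choosing the form that matches n_hours."""
    # Passing both source forms plus the number marks this as a plural
    # entry, so a catalog can supply as many forms as the language needs.
    template = translations.ngettext(
        "This link will expire in %(n_hours)s hour.",
        "This link will expire in %(n_hours)s hours.",
        n_hours,
    )
    return template % {"n_hours": n_hours}


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(expiry_message(n))
```

With a real catalog loaded in place of NullTranslations, the same call would select among however many plural forms the target language defines.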

nlhkabu commented 5 years ago

Thanks @webknjaz. Could you point us to the example where this is done correctly? This will give us a model for updating the other strings.

webknjaz commented 5 years ago

So here's one in the UI: https://hosted.weblate.org/translate/pypa/warehouse/uk/?checksum=ea1ff03f84cf42f2. It's coming from warehouse/templates/email/password-reset/body.html:22 and warehouse/templates/email/verify-email/body.html:22.

It suggests "One" (numbers ending in the digit 1), "Few" (ending in 2, 3, 4) and "Other" (the rest of the number endings).
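To make the three categories concrete, here is a small Python sketch of how such a selection could work. It assumes the plural formula commonly used in gettext catalogs for Ukrainian, which is slightly stricter than the summary above: 11 through 14 fall under "Other". The function names and the CATEGORY_NAMES tuple are hypothetical; the labels are the ones Weblate shows.

```python
# Labels as shown in the Weblate UI, indexed by plural form number.
CATEGORY_NAMES = ("One", "Few", "Other")


def uk_plural_index(n: int) -> int:
    """Return the plural form index (0, 1 or 2) for a non-negative integer."""
    last_digit = n % 10
    last_two = n % 100
    # 1, 21, 31, ... but not 11
    if last_digit == 1 and last_two != 11:
        return 0
    # 2-4, 22-24, ... but not 12-14
    if 2 <= last_digit <= 4 and not 10 <= last_two < 20:
        return 1
    # everything else: 0, 5-20, 25-30, ...
    return 2


def uk_plural_category(n: int) -> str:
    """Map a number to the category label a translator would fill in."""
    return CATEGORY_NAMES[uk_plural_index(n)]


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 12, 21, 22, 25):
        print(n, uk_plural_category(n))
```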

It probably shows up properly because of {% pluralize %} used in the templates. I don't know if there are other options for marking strings with numbers. I imagine that there should be ways of marking each rendered variable as a number, because if there are multiple numbers in a string, it creates more translation combos.

steffenschroeder commented 5 years ago

Hi, the problem with declensions gets even worse if a word's gender depends on a placeholder, as in <a href="%(href)s">%(username)s</a> removed as project %(role_name)s from https://github.com/pypa/warehouse/blob/master/warehouse/templates/manage/history.html#L55

I see that this works wonderfully in English, but it's hard to translate into German because of the following:

English                 German
project maintainer      Projektbetreuer
project collaborator    Projektmitarbeiter

Maybe some background: I'm working as an engineer for SAP, building business software which is translated into 100+ languages. All texts are initially written in English and then translated into the other languages.

Every now and then, there is some pushback from the Language Team (which checks English grammar, the use of correct and consistent terms, and whether texts can be translated at all). If an English original text is not translatable, the usual solution is to change the original English text.

So one solution for cases like https://hosted.weblate.org/translate/pypa/warehouse/uk/?checksum=ea1ff03f84cf42f2 (Singular: This link will expire in %(n_hours)s hour., Plural: This link will expire in %(n_hours)s hours.) could be to have a fixed set of possible values and multiple messages for this, like:

Or we could say:

So long story short: if the English original texts are hard or impossible to translate, we might want to change them instead of enhancing the tooling to work around the issue.
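For the role placeholder case mentioned earlier, one way this could look is a separate, complete source sentence per role instead of interpolating %(role_name)s. The following is only a sketch: the ROLE_REMOVED_MESSAGES table, its keys and the function name are hypothetical, and NullTranslations stands in for a real catalog.

```python
import gettext

# Stand-in for a real catalog; returns the source string unchanged.
translations = gettext.NullTranslations()

# Hypothetical table: one full sentence per role, so each can be translated
# as a whole and the translator controls gender and word order.
ROLE_REMOVED_MESSAGES = {
    "maintainer": "%(username)s removed as project maintainer",
    "collaborator": "%(username)s removed as project collaborator",
}


def role_removed_message(role_name: str, username: str) -> str:
    """Return the history entry for a removed role, one message per role."""
    try:
        source = ROLE_REMOVED_MESSAGES[role_name]
    except KeyError:
        raise ValueError(f"no message defined for role {role_name!r}") from None
    # Look up the whole sentence; only the user name is interpolated.
    return translations.gettext(source) % {"username": username}


if __name__ == "__main__":
    print(role_removed_message("maintainer", "alice"))
```

The trade-off is more source strings to maintain, in exchange for sentences that translators can restructure freely.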

webknjaz commented 5 years ago

@steffenschroeder I think, in such cases, it's fine to restructure the translation itself too. The main point is that you should keep the original meaning...