godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Godot internationalization system has an API design error regarding pluralization #99197

Open molingyu opened 1 week ago

molingyu commented 1 week ago

Tested versions

4.3

System information

any

Issue description

The two APIs for pluralization in the Godot i18n system (Object.tr_n and TranslationServer.translate_plural) only allow two strings to be passed as arguments (one singular and one plural).

This API assumes that the game is developed in a language that has only two plural forms (I'm guessing English). But many languages have a different number of forms.

If a game development team chooses a language with more than two plural forms as its dev language, this API will not work at all!

The correct API design would be to pass an array of plural strings, so that languages with different pluralization rules can be handled.

Reference: CLDR plural rules; GNU gettext manual, plural forms chapter

Steps to reproduce

N / A

Minimal reproduction project (MRP)

N / A

bruvzg commented 1 week ago

Multiple plural forms for translated strings are supported (they are defined in the PO translation file).

The API is designed to take an "untranslated" string (which is always assumed to be either a language-neutral ID or English). This is how every internationalization system works.

The gettext API you have linked works the same:

The ngettext function is similar to the gettext function as it finds the message catalogs in the same way. But it takes two extra arguments. The msgid1 parameter must contain the singular form of the string to be converted. It is also used as the key for the search in the catalog. The msgid2 parameter is the plural form. The parameter n is used to determine the plural form. If no message catalog is found msgid1 is returned if n == 1, otherwise msgid2.

molingyu commented 1 week ago

The API is designed to take an "untranslated" string (which is always assumed to be either a language-neutral ID or English). This is how every internationalization system works.

In fact, this specializes the dev language so that it can have only two plural forms.

For example, suppose I am developing a game in Polish and need an i18n string for a file count:

print(tr_n("%d plik", "%d pliki", 5))

So what will this return when the game language is set to Polish? It selects the "%d pliki" form, giving "5 pliki", which is wrong. The correct result is "5 plików". Polish in fact has four plural categories.

Of course, you will say that the PO file handles the additional plural forms. However, I want to emphasize again that Polish here is the game development language. These are source strings, not translated strings. The PO file does not come into it at all!

The correct design is

print(tr(["%d plik", "%d pliki", "%d plików", "%d pliku"], n))

The entire example, from beginning to end, has nothing to do with PO (translation) files. It is only about the choice of dev language.

In fact, GNU gettext's original choice to pass only two parameters was a product of its time. Back then, most programs were developed in English, and two forms were enough.

But for modern game development, this is clearly not enough.

bruvzg commented 1 week ago

The correct design is

print(tr(["%d plik", "%d pliki", "%d plików", "%d pliku"], n))

This can't work as written: you need rules to select the correct form based on the value of n, not just an array. The API would be excessively complicated or unintuitive to use.
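To make "rules to select the correct form based on n" concrete, here is a minimal sketch of two such rules written as C++ functions. The function names are hypothetical and are not part of Godot. The Polish rule is the commonly published gettext Plural-Forms expression (nplurals=3), which covers integer counts only and therefore uses three forms rather than the four CLDR categories.

```cpp
#include <cassert>
#include <cstddef>

// English: gettext rule "nplurals=2; plural=(n != 1)".
// Hypothetical helper name, for illustration only.
std::size_t english_plural_index(int n) {
    assert(n >= 0); // this sketch assumes non-negative counts
    return n != 1 ? 1 : 0;
}

// Polish: gettext rule
// "nplurals=3; plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)".
// Index 0 = "plik", 1 = "pliki", 2 = "plików".
std::size_t polish_plural_index(int n) {
    assert(n >= 0); // this sketch assumes non-negative counts
    if (n == 1) {
        return 0;
    }
    const int last_digit = n % 10;
    const int last_two = n % 100;
    if (last_digit >= 2 && last_digit <= 4 && (last_two < 10 || last_two >= 20)) {
        return 1;
    }
    return 2;
}
```

With these rules, a count of 5 maps to index 2 in Polish (the "plików" form), while the English rule would map it to index 1.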

molingyu commented 1 week ago

How the form is chosen according to n is an internal matter of the API. But the premise is that you can provide a correct list of choices.

If my development language has multiple plural forms, how can I provide a correct list of choices? Should I generate a PO file for the development language itself?

I don't see why passing a message and a plural message is intuitive while other forms are not.

In my native language, the number does not change the noun. We don't need a plural message at all. If I looked at the API design from the same conservative perspective as you, I would find this API very strange, because I have to pass in a message and an identical plural message.

In fact, if you look at how this API actually works, when the default language is used or when no translation or fallback is found, the English rule is applied directly.

which is always assumed to be either language neutral ID or English

This contradicts the claim that the API is language-neutral.

StringName TranslationServer::translate_plural(const StringName &p_message, const StringName &p_message_plural, int p_n, const StringName &p_context) const {
    if (!enabled) {
        if (p_n == 1) {
            return p_message;
        }
        return p_message_plural;
    }

    return main_domain->translate_plural(p_message, p_message_plural, p_n, p_context);
}

StringName TranslationDomain::translate_plural(const StringName &p_message, const StringName &p_message_plural, int p_n, const StringName &p_context) const {
    const String &locale = TranslationServer::get_singleton()->get_locale();
    StringName res = get_message_from_translations(locale, p_message, p_message_plural, p_n, p_context);

    const String &fallback = TranslationServer::get_singleton()->get_fallback_locale();
    if (!res && fallback.length() >= 2) {
        res = get_message_from_translations(fallback, p_message, p_message_plural, p_n, p_context);
    }

    if (!res) {
        if (p_n == 1) {
            return p_message;
        }
        return p_message_plural;
    }
    return res;
}

"plural_rule: n != 1" (English)

In my opinion, Godot should be an open game engine, not a conservative one that assumes all its users are English speakers.

molingyu commented 1 week ago

Sorry, I may be a little emotional.

Consider why msg and plural_msg exist in the first place. This is clearly the influence of English. If English had three plural forms, the people who designed the GNU gettext API might have passed in msg, plural1_msg and plural2_msg.

So if we want this system to work well in all languages, using an array may be a good choice.

As for how to make this system work internally: I think a devlang option should be added to the editor's project settings to indicate which language the game is developed in. The engine would then implement the plural rules of different languages internally (returning the index of the plural string for the passed n). The implementation in TranslationDomain::translate_plural would change so that, instead of defaulting to the English rule, it applies the rule of the language given by devlang.

A change like this would be enough to make the new API work well.
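A rough sketch of what the fallback described above could look like, written as standalone C++ rather than engine code. Everything here is an assumption for illustration: the "devlang" value, the rule table, and the names rule_for_dev_lang and select_source_plural do not exist in Godot. The rules are the usual gettext Plural-Forms expressions for English, Polish, and a language without plural inflection.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A plural rule maps a count to an index into the list of source forms.
using PluralRule = std::size_t (*)(int);

// English: plural=(n != 1)
std::size_t rule_en(int n) { return n != 1 ? 1 : 0; }

// Polish: plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)
std::size_t rule_pl(int n) {
    if (n == 1) return 0;
    const int d = n % 10;
    const int dd = n % 100;
    return (d >= 2 && d <= 4 && (dd < 10 || dd >= 20)) ? 1 : 2;
}

// Languages where the count does not change the noun: nplurals=1; plural=0
std::size_t rule_single(int) { return 0; }

// Hypothetical lookup keyed by the proposed "devlang" project setting.
// Unknown languages fall back to the English rule, as the engine does today.
PluralRule rule_for_dev_lang(const std::string &dev_lang) {
    if (dev_lang == "pl") return rule_pl;
    if (dev_lang == "zh") return rule_single;
    return rule_en;
}

// Hypothetical replacement for the "p_n == 1" fallback: pick the source
// string by the dev language's rule instead of the hard-coded English one.
std::string select_source_plural(const std::vector<std::string> &forms, int n,
                                 const std::string &dev_lang) {
    assert(!forms.empty());
    assert(n >= 0); // this sketch assumes non-negative counts
    std::size_t index = rule_for_dev_lang(dev_lang)(n);
    if (index >= forms.size()) {
        index = forms.size() - 1; // too few forms supplied: use the last one
    }
    return forms[index];
}
```

For example, calling select_source_plural with the forms "%d plik", "%d pliki", "%d plików", n = 5 and dev language "pl" selects "%d plików". This sketch only covers the source-string fallback; it does not address how a PO file would be keyed when the source language has more than two forms.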

timothyqiu commented 1 week ago

The need to use a Germanic language as the base language for translations was predetermined when we chose to use the gettext-related toolchain. I don't think this is a design issue at the API level.

molingyu commented 1 week ago

The need to use a Germanic language as the base language for translations was predetermined when we chose to use the gettext-related toolchain

Yes, the root of this problem lies in the history of GNU gettext. I don't know at what level this problem should be filed, but it is a real problem: the system is not friendly to developers working in non-Germanic languages.

However, it seems that the plural design of the PO file also allows only two strings (msgid and msgid_plural) as keys. That is a real obstacle.