Expensify / App

Welcome to New Expensify: a complete re-imagination of financial collaboration, centered around chat. Help us build the next generation of Expensify by sharing feedback and contributing to the code.
https://new.expensify.com
MIT License
3.52k stars 2.87k forks source link

[$500] Add system for pluralizing strings to our internationalization/localization library #38614

Closed iwiznia closed 3 weeks ago

iwiznia commented 7 months ago

Problem

I've been seeing more and more copy that has singular and plural versions and what we are doing is adding 2 different translations for it, which is less than ideal for a localized app, as some languages have more than 2 forms (plural and singular) and this means we won't be able to support those languages without a major effort

Solution

Add feature to support pluralization in our localization system. How it will work is:

This way, we have a standard way of pluralizing phrases that works for all languages and avoid having to refactor all this when we decide we want to support localization for real.

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01e4da0f48a9da5a41
  • Upwork Job ID: 1770149691252797440
  • Last Price Increase: 2024-03-19
  • Automatic offers:
    • jayeshmangwani | Reviewer | 0
    • brandonhenry | Contributor | 0
    • shubham1206agra | Contributor | 103171004
    • ZhenjaHorbach | Contributor | 103705648
Issue OwnerCurrent Issue Owner: @twisterdotcom
melvin-bot[bot] commented 7 months ago

Job added to Upwork: https://www.upwork.com/jobs/~01e4da0f48a9da5a41

melvin-bot[bot] commented 7 months ago

Triggered auto assignment to Contributor-plus team member for initial proposal review - @jayeshmangwani (External)

melvin-bot[bot] commented 7 months ago

Triggered auto assignment to @MitchExpensify (NewFeature), see https://stackoverflowteams.com/c/expensify/questions/14418#:~:text=BugZero%20process%20steps%20for%20feature%20requests for more details.

melvin-bot[bot] commented 7 months ago

:warning: It looks like this issue is labelled as a New Feature but not tied to any GitHub Project. Keep in mind that all new features should be tied to GitHub Projects in order to properly track external CAP software time :warning:

ShridharGoel commented 7 months ago

Is this open for external contributors? I would like to work on it.

iwiznia commented 7 months ago

Yes, it is

ShridharGoel commented 7 months ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

Add system for pluralizing strings to our internationalization/localization library.

What is the root cause of that problem?

New feature

What changes do you think we should make in order to solve the problem?

Example code to achieve this: (This would be added in the existing getTranslatedPhrase method in Localize/index.ts, after we get the translatedPhrase value)

const count = getCountFromPhraseParameters()

if (translatedPhrase) {
    if (typeof translatedPhrase === 'function') {
        return translatedPhrase(...phraseParameters);
    }

    if (typeof translatedPhrase === 'object' && count !== undefined) {
        const pluralRules = new Intl.PluralRules(language);
        const pluralForm = pluralRules.select(count);

        if (translatedPhrase.hasOwnProperty(pluralForm)) {
            return translatedPhrase[pluralForm]
        } else {
            console.warn(`Plural form '${pluralForm}' not found for phrase '${phraseKey}'`);
            return translatedPhrase.other
        }
    }

    // We set the translated value in the cache only for the phrases without parameters.
    cacheForLocale?.set(phraseKey, translatedPhrase);
    return translatedPhrase;
}

Note: We need to discuss the way to add count in phraseParameters. We can either convert phraseParameters into an object and have different keys for count and the values, or we can keep it as an array and ensure that the first item should be the count.

ghost commented 7 months ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

Add system for pluralizing strings to our internationalization/localization library

What is the root cause of that problem?

New Feature

Here : https://github.com/Expensify/App/blob/0d700c31dfadce758d4a623789e6bd1857573131/src/components/LocaleContextProvider.tsx#L81-L86 we need to refactor the translate method to add support for Pluralisation

What changes do you think we should make in order to solve the problem?

As discussed above, we need to add do all these things to translate method :

Fyi If needed I can implement and add test branch as well.

iwiznia commented 7 months ago

@ShridharGoel I think we can use count if it is passed inside phraseParameters instead of adding a new param (which would also mean updating all usages of that method, since you are changing the param order).

Also you are not doing this correctly:

If none is found, log a warn (or throw in dev) and use the translation defined in other

ghost commented 7 months ago

hey @iwiznia ! Do you need sample code ? Just asking hope you don't mind asking me this.

ShridharGoel commented 7 months ago

Proposal

Updated CC: @iwiznia

iwiznia commented 7 months ago

hey @iwiznia ! Do you need sample code ? Just asking hope you don't mind asking me this.

It always helps.

Example code to achieve this: (This would be added in place of the existing translate method in LocaleContextProvider)

I think this is not correct? The code for translating is in Localize/index.ts...

            const count = phraseParameters[0]?.count ?? 1; // Extract count from phraseParameters or default to 1

If no count is passed, it should treat it as such, not default to 1

ShridharGoel commented 7 months ago

Proposal

Updated with code optimisation and example usage.

ShridharGoel commented 7 months ago

@iwiznia I actually made an update just now, can you have a look?

iwiznia commented 7 months ago

Did you see this?

I think this is not correct? The code for translating is in Localize/index.ts...

Also: I think these are wrong:

                  return translationEntry[pluralForm];
                  return translationEntry.other; // Fallback to 'other' form

Because if the translation has other parameters, they will not be replaced. Plus, a translation entry can be a function too.

ShridharGoel commented 7 months ago

@iwiznia Yes, it makes more sense to add in Localize/index.ts. I'll update it accordingly.

Can we also change phraseParameters to accept such an object?

{
  "count": 2,
  "values": []
}

Else, we can keep it as an array and make count to be the first element always.

brandonhenry commented 7 months ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

Add system for pluralizing strings to our internationalization/localization library to support languages with more than two pluralization forms.

What is the root cause of that problem?

New Feature The current getTranslatedPhrase function in Localize/index.ts does not handle pluralization rules for languages with more than singular and plural forms. It needs to be enhanced to support the diverse pluralization categories used by different languages.

What changes do you think we should make in order to solve the problem?

I propose the following changes to the getTranslatedPhrase function in Localize/index.ts:

Handling of translated phrases and count parameter

- If the translated phrase is a string, it is cached and returned immediately. This is the same behavior as the original function. - If the translated phrase is a function, it is evaluated with the provided parameters, including the count parameter. The count parameter is passed as a separate argument to the translation function. This allows the translation function to destructure the count parameter and use it in the translation. - If the translated phrase is an object, it is assigned to the phrase variable for further processing. This change was made to handle pluralization, where the translated phrase is an object containing different pluralization forms. ```jsx if (translatedPhrase) { if (typeof translatedPhrase === 'string') { cacheForLocale?.set(phraseKey, translatedPhrase); return translatedPhrase; } let phrase: string | Record; if (typeof translatedPhrase === 'function') { phrase = translatedPhrase(...phraseParameters); } else { phrase = translatedPhrase; } if (typeof phrase === 'string') { return phrase; } // ... } ```

Handling of phrase variable

If `phrase` is a string, it is returned as the final translated phrase. This is a new check added to handle the case where the translated phrase is an object, but the evaluated result (`phrase`) is a string. ```jsx if (typeof phrase === 'string') { return phrase; } ```

Pluralization handling

- The `count` is extracted from `phraseParameters` (if provided) to determine the pluralization form. - The `Intl.PluralRules` instance for the current language is memoized to avoid creating multiple instances. This improves performance by reusing the same instance for subsequent calls with the same language. - The pluralization category is determined based on the count and language rules using `Intl.PluralRules`. - If the `phrase` object has a translation for the pluralization category, it is returned. If the translation is a function, it is evaluated with the provided parameters before returning. This allows for dynamic translations based on the pluralization form. - If no translation is found for the exact category, it fallbacks to the 'other' category if it exists. This provides a fallback mechanism for languages that may not have specific translations for all pluralization forms. Since `count` can be passed in anywhere in `phraseParameters` - we need a deserializer to get count from it. ```jsx function deserializePhraseParamsForCount(phraseParameters: any[]): number | undefined { for (const param of phraseParameters) { if (typeof param === 'number') { return param; } else if (Array.isArray(param)) { const countParam = findCountParam(param); if (countParam !== undefined) { return countParam; } } else if (typeof param === 'object' && param !== null) { const countParam = findCountParam(Object.values(param)); if (countParam !== undefined) { return countParam; } } } return undefined; } ``` ```jsx const count = deserializePhraseParamsForCount(phraseParameters); if (!memoizedPluralRules[language]) { memoizedPluralRules[language] = new Intl.PluralRules(language); } const pluralCategory = memoizedPluralRules[language].select(count); if (phrase.hasOwnProperty(pluralCategory)) { const translation = phrase[pluralCategory]; if (typeof translation === 'function') { return translation(...phraseParameters); } return translation; } if (phrase.hasOwnProperty('other')) { const fallbackTranslation = phrase['other']; if (typeof fallbackTranslation === 'function') { return fallbackTranslation(...phraseParameters); } const warningMessage = `Plural form '${pluralCategory}' not found for phrase '${phraseKey}'`; Log.warn(warningMessage); if (__DEV__) { throw new Error(warningMessage); } // Use the 'other' translation as the fallback return fallbackTranslation; } ```

Error handling

- If no translation is found for the pluralization category or the 'other' fallback, a warning message is logged using `Log.warn`. - In development mode (`__DEV__`), an error is thrown with the warning message to alert developers about missing translations. - In production, the phrase key itself is returned as a fallback to prevent the application from breaking due to missing translations. ```jsx const warningMessage = `Plural form '${pluralCategory}' not found for phrase '${phraseKey}'`; Log.warn(warningMessage); if (__DEV__) { throw new Error(warningMessage); } return phraseKey; ```

Fallback language handling:

- If no translated phrase is found for the current language, the function falls back to the specified fallback language by recursively calling `getTranslatedPhrase` with the fallback language. - If no translation is found in the fallback language, it tries the default language (en) and logs an alert if the translation is not found. - ```jsx if (fallbackLanguage) { const fallbackTranslatedPhrase = getTranslatedPhrase(fallbackLanguage, phraseKey, null, ...phraseParameters); if (fallbackTranslatedPhrase) { return fallbackTranslatedPhrase; } } if (language !== CONST.LOCALES.DEFAULT) { Log.alert(`${phraseKey} was not found in the ${language} locale`); return getTranslatedPhrase(CONST.LOCALES.DEFAULT, phraseKey, null, ...phraseParameters); } ```

Return value

If no translation is found in the default language, the function returns `null` instead of the phrase key. This change was made to differentiate between a missing translation and a fallback to the phrase key itself. ``` return null; ```

Note: we could also consider extracting the pluralization logic into a separate function to keep the getTranslatedPhrase function focused on translation retrieval and caching.

With these changes, the count parameter is passed as a separate argument to the translation function, allowing it to be destructured and used in the translation. This enables the usage of dynamic translations based on the count value, as shown in the example:

someTranslation: {
    zero: `There's no transactions`,
    one: `There's one transaction`,
    many: ({count}) => `There's ${count} transactions`,
}

The getTranslatedPhrase function will automatically pass the count parameter to the translation function when it is called, so you can use it directly in the translation.

Changes

All changes can be seen in this PR here

ShridharGoel commented 7 months ago

Proposal

Updated

ShridharGoel commented 7 months ago

Because if the translation has other parameters, they will not be replaced. Plus, a translation entry can be a function too.

Can you give some examples? If it's a function, should be just call it and return the value?

bernhardoj commented 7 months ago

Proposal

Please re-state the problem that we are trying to solve in this issue.

We want to support plural localization.

What is the root cause of that problem?

New feature request.

What changes do you think we should make in order to solve the problem?

To support plural localization, we need to update the translate logic, specifically getTranslatedPhrase. We will update the logic inside this block. https://github.com/Expensify/App/blob/15aac1ca8ffbafb63346056265f544ef1e1a7a7d/src/libs/Localize/index.ts#L113-L121

translatedPhrase can either be a string, function, or an object.

if (translatedPhrase) {
    // 1. If the translation is a string, return it immediately
    if (typeof translatedPhrase === 'string') {
        // We set the translated value in the cache only for the phrases without parameters.
        cacheForLocale?.set(phraseKey, translatedPhrase);
        return translatedPhrase;
    }

    // 2. otherwise, evaluate the translation if it's a function.
    // type PhraseValue = string | Record<Intl.LDMLPluralRule, string>;
    let phrase: PhraseValue;
    if (typeof translatedPhrase === 'function') { phrase = translatedPhrase(...phraseParameters); } 
    else { phrase = translatedPhrase}

    // 3. the result from step 2 is either string or object (Record<Intl.LDMLPluralRule, string>)
    // LDMLPluralRule is either "zero" | "one" | "two" | "few" | "many" | "other"
    // if it's a string, return it
    if (typeof phrase === 'string') return phrase;

    // 4. otherwise, process the plural rules
    const pluralRules = new Intl.PluralRules(language);
    // *NOTE: this is assuming `count` property is in the first index of `phraseParameters`, but this won't satisfy the existing type. Also, what should we do when count is undefined because pluralRules.select param type is a number
    const pluralCategory = pluralRules.select(phraseParameters[0]?.count);

    // 5. if the translation doesn't contain the plural rule, log a warn on the console
    if (!phrase[pluralCategory]) {
        console.warn(errorMessageHere);
        if (__DEV__) {
            throw new Error(errorMessageHere);
        }
    }

    // 6. return the result with fallback to 'other' plural, it's possible this will return undefined if 'other' doesn't exist too.
    return phrase[pluralCategory] ?? phrase['other'];
}

Last, we need to prevent the plural translation object to be flatten. We need to add a check whether it's a plural translation or not. https://github.com/Expensify/App/blob/f4d3c18e509071ae66be73af80f1c753adf77a21/src/languages/translations.ts#L19-L22

function isPluralTranslation(data: any): boolean {
    if (typeof data !== 'object') {
        return false;
    }
    const pluralCategories = ['zero', 'one', 'two', 'few', 'many', 'other']
    return pluralCategories.some((category) => category in data);
}

if (typeof data === 'function' || Array.isArray(data) || !(typeof data === 'object' && !!data) || isPluralTranslation(data)) {

*I would like to propose changing the type of phraseParameters to be non-args instead of an args, then we can pass a new param for count as the last param.

function getTranslatedPhrase<TKey extends TranslationPaths>(
    language: 'en' | 'es' | 'es-ES',
    phraseKey: TKey,
    fallbackLanguage: 'en' | 'es' | null = null,
    phraseParameters?: PhraseParameters<Phrase<TKey>>, (the ... is removed, one downside is the type is loosen)
    count?: number,
): string | null {

Translation with plural usage example:

deleteTags: {
    one: 'Are you sure you want to delete this tag?',
    other: 'Are you sure you want to delete these tags?',
},

Localize.translate('en', 'characterLimit', undefined, 1) => Are you sure you want to delete this tag?
Localize.translate('en', 'characterLimit', undefined, 10) => Are you sure you want to delete these tags?
jayeshmangwani commented 7 months ago

Starting reviewing proposals

jayeshmangwani commented 7 months ago

@ShridharGoel have you tested your proposal locally, I am getting an issues understanding a part in your attached [code](https://github.com/Expensify/App/issues/38614#:~:text=Example%20code%20to,const%20count%20%3D%20getCountFromPhraseParameters()), translatedPhrase always be string(or null) in the existing code, right ? ,then why we are checking the typeof translatedPhrase === 'object' ?

ShridharGoel commented 7 months ago

@jayeshmangwani translatedPhrase can be an object also. If it's an object, then we need to add then new logic as mentioned in the post.

jayeshmangwani commented 7 months ago

@ShridharGoel translatedPhrase can be object in the getTranslatedPhrase function, and you are suggesting above changes in translate function, right ?

jayeshmangwani commented 7 months ago

BTW give me an example to test the where translatedPhrase is object , so I can test locally

iwiznia commented 7 months ago

Can you give some examples? If it's a function, should be just call it and return the value?

@ShridharGoel You have examples in our translation files already, search for any entry that has a method in it (search => in the translation files).

@ShridharGoel, @brandonhenry is there a reason why you are implementing it in translate and not in getTranslatedPhrase? Sounds like it makes more sense to add it there, no? Plus it handles caching and calling the function if that's needed...

@bernhardoj your proposal looks quite good, but some questions: I don't get isPluralTranslation method especially:

    const pluralCategories = ['zero', 'one', 'two', 'few', 'many', 'other']
    return pluralCategories.some((category) => category in data);

Since the data won't have that, it will just have count: 3.

*I would like to propose changing the type of phraseParameters to be non-args instead of an args, then we can pass a new param for count as the last param.

I don't think we should, I think count should be in the normal params AND should be automatically passed to the translation (if it is a method), so you can do something like:

someTranslation: {
  zero: `There's no transactions',
  one: `There's one transaction',
  many: ({count}) => `There's ${count} transactions`,
}
ShridharGoel commented 7 months ago

Proposal

Updated

iwiznia commented 7 months ago

@ShridharGoel 2 things:

brandonhenry commented 7 months ago

@iwiznia @jayeshmangwani Sorry, looks like my repo wasn't updated with main πŸ€¦πŸΏβ€β™‚οΈ seems there were updates 2 days ago

Updated proposal

bernhardoj commented 7 months ago

I don't get isPluralTranslation method especially:

It's to check the translation object data. For example, if we have:

someTranslation: {
 zero: `There's no transactions',
 one: `There's one transaction',
}

It will check whether someTranslation has a key of one of the plural categories (zero, one, etc.).

I don't think we should, I think count should be in the normal params AND should be automatically passed to the translation (if it is a method), so you can do something like:

The main concern of the count is on the phraseParameters type. Currently, it accepts multiple arguments, https://github.com/Expensify/App/blob/51ef6c1bd1cd3d981db744bff1fcff9e7c670c0c/src/libs/Localize/index.ts#L148

and we can use it like, so it's tricky where to find the count

translate('common.error.characterLimit', {limit: 20}, {age: 2}, 1, true, undefined)

What about we update the parameters type so that the count data is always at the end of the params. https://github.com/Expensify/App/blob/51ef6c1bd1cd3d981db744bff1fcff9e7c670c0c/src/libs/Localize/index.ts#L50

type PluralCount = {count: number};
type PhraseParameters<T> = T extends (...args: infer A) => string ? [...A, PluralCount?] : [PluralCount?];

and we can use it like this:

translate('common.error.characterLimit', {limit: 20}, {age: 2}, true, {count: 1}) <- count is optional
brandonhenry commented 7 months ago

Updated proposal since count is gnarly to find (specifically the Pluralization handling section)

iwiznia commented 7 months ago

@bernhardoj:

and we can use it like, so it's tricky where to find the count translate('common.error.characterLimit', {limit: 20}, {age: 2}, 1, true, undefined)

What?! Are you sure we can do that? And if so why?!? I don't think that's intended, what the fuck would that do? Especially with 1, true, undefined?

If that is true, I think it was not intended at all and probably a mistake when converting it to ts? Yep, it was done here (@BartoszGrajdek can you clarify why please?). Did not check all usages, but a quick look make it seem we don't use it like that anywhere.

@brandonhenry you are close, but your proposal is not satisfying this:

If none is found, log a warn (or throw in dev) and use the translation defined in other

brandonhenry commented 7 months ago

@iwiznia ahh, I updated my proposal so that if we do rely on other and not a fallback translation, then we will log a warning and return phrase[other]

Updates in Pluralization handling section

jayeshmangwani commented 7 months ago

@bernhardoj just wanted to confirm, is this a typo in the below code?

Localize.translate('en', 'characterLimit', undefined, 1) => Are you sure you want to delete this tag?

instead, you mean below code, right ? or am I missing something ?

getTranslatedPhrase('en', 'characterLimit',null, undefined, 1) => Are you sure you want to delete this tag?

bernhardoj commented 7 months ago

instead, you mean below code, right ? or am I missing something ?

emm, nope. If we add count to the param (which we won't do it most likely), then translate will only have 4 params.

https://github.com/Expensify/App/blob/8c6b381e429e9dd2c97284c41aabf5ab62e2d09c/src/libs/Localize/index.ts#L148

What?! Are you sure we can do that? And if so why?!?

Yes, we can do that.

melvin-bot[bot] commented 7 months ago

πŸ“£ @jayeshmangwani πŸŽ‰ An offer has been automatically sent to your Upwork account for the Reviewer role πŸŽ‰ Thanks for contributing to the Expensify app!

Offer link Upwork job

melvin-bot[bot] commented 7 months ago

πŸ“£ @brandonhenry πŸŽ‰ An offer has been automatically sent to your Upwork account for the Contributor role πŸŽ‰ Thanks for contributing to the Expensify app!

Offer link Upwork job Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review πŸ§‘β€πŸ’» Keep in mind: Code of Conduct | Contributing πŸ“–

brandonhenry commented 7 months ago

Thanks! Already started working on this one, so hope to have this around by Wednesday at the latest. Will respond to feedback later today @iwiznia

blazejkustra commented 7 months ago

If that is true, I think it was not intended at all and probably a mistake when converting it to ts? Yep, it was done here (@BartoszGrajdek can you clarify why please?). Did not check all usages, but a quick look make it seem we don't use it like that anywhere.

@iwiznia At the time we migrated it I remember seeing some usages where translate had more than one parameter, that is why it was implemented like this.

If it's really the case that translate takes one parameter max, this makes things only easier and you can go ahead and refactor it as part of this issue, if you need help please let me know! πŸ˜„

brandonhenry commented 7 months ago

@iwiznia responding to those comments today

iwiznia commented 7 months ago

@iwiznia At the time we migrated it I remember seeing some usages where translate had more than one parameter, that is why it was implemented like this.

Which usages? I don't get it because if someone was passing extra params, they were not used in the previous code, so you could've just fixed those usages instead of implementing a feature that is not needed...

blazejkustra commented 7 months ago

I must admit that the migration of Localize was a challenging process, particularly the translate function. My apologies for the complexity introduced by allowing more than one argument. At the time, I believed it would make the translate function more generic and just work for various use cases.

Now I understand we don't need this. Moving forward, I suggest we just refactor the function.

brandonhenry commented 7 months ago

@blazejkustra thanks for the insight and no worries - great to understand your thought process as it will help in the refactor too :) - I'll link you on the PR once I have a suitable solution to see your thoughts as well!

iwiznia commented 7 months ago

Yeah, let's make sure to add @blazejkustra as reviewer of your PR @brandonhenry

jayeshmangwani commented 7 months ago

PR is in Draft; we are waiting for @brandonhenry to make PR ready for review.

brandonhenry commented 7 months ago

Actively working on this. See PR.

Goal is to have this wrapped up by Sunday.

mallenexpensify commented 7 months ago

@brandonhenry per CONTRIBUTING.md,

New contributors are limited to working on one job at a time, however experienced contributors may work on numerous jobs simultaneously.

You currently have 6 issues open with non merged or closed. Please do not submit proposals for new issues until all six PRs have been merged.

Also from CONTRIBUTING.md

Daily updates on weekdays are highly recommended. If you know you won’t be able to provide updates within 48 hours, please comment on the PR or issue stating how long you plan to be out so that we may plan accordingly. We understand everyone needs a little vacation here and there. Any issue that doesn't receive an update for 5 days (including weekend days) may be considered abandoned and the original contract terminated.

You were assigned on March 22nd and are proposing to have a PR wrapped up by April 5th. We expect that PRs are addressed with urgency. If you're unable to have a PR ready for review in the next 48 hours, please comment to have this issue reassigned.

jayeshmangwani commented 7 months ago

@brandonhenry What is ETA for making the PR ready for review? Your last commit was 2 weeks ago. Just a reminder https://github.com/Expensify/App/issues/38614#issuecomment-2030199914

brandonhenry commented 7 months ago

Hey @jayeshmangwani I have been actively working on this. I will push my changes today as this is ready to review

brandonhenry commented 6 months ago

Found a small edge case that caused my changes to crash. Refactored a ton so code is much more simple and also handles the edge case properly (seems to have just slipped by somehow up til now.).

Tested things out and everything is working now. Added example and test cases. Back over to review