magento / pwa-studio

🛠Development tools to build, optimize and deploy Progressive Web Applications for Magento 2.
https://developer.adobe.com/commerce/pwa-studio/
Open Software License 3.0
1.07k stars 683 forks source link

Venia Localization w/ i18n #669

Closed gil-- closed 4 years ago

gil-- commented 5 years ago

This issue is for the following packages:

This issue is a:

Environment

Question Answer
Magento version
Operating System + version
node.js version (node -v)
npm version (npm -v)

Description

As a store owner and implementer, I want to be able to translation text/strings within the storefront to multiple languages (i18n). Currently, all text such as labels, etc. are hard coded in the React components.

This would also include a language switcher but that could be an enhancement after we get translations working.

Expected result:

Venia comes with multiple languages supported out of box (English, Spanish, Japanese, etc.). An implementer will have a way of translation all strings to different languages.

Possible solutions:

There are a variety of different solutions. https://github.com/yahoo/react-intl is one

vitalics commented 5 years ago

As for me, I used react-i18next since i18n have nice documentation, scale and flexible

brendanfalkowski commented 5 years ago

Would like to see some direction on this as well. We've been backlogging the question internally, but the business is pushing for direction as we're wiring up more code without localization.

codeAdrian commented 5 years ago

Agreed with @brendanfalkowski . My question is whether store-view switching can also be implemented at this point? So not only the language preference, but also the rest of Magento store view configs?

jimbo commented 5 years ago

@vitalics @brendanfalkowski Thanks for the library suggestions. This issue has come up a lot recently, so I'm going to start researching the translation landscape again. Ultimately, I'd like to come up with a spike or story for our team (or a community member) to work on.

Basic approach

Depending on how the libraries out there work, we may not need any of them. The basics of translation should follow a familiar pattern:

// one way of doing things

const textMap = new Map()
  .set("jp", <span>日本語</span>)
  .set("en", <span>English</span>)

const Foo = props => {
  const lang = useLanguageContext()
  const text = textMap().get(lang)

  return <button>{text}</button>
}

Open questions

Here a few of the architectural choices we'll need to make, though:

1. Should we support elements or only strings?

Elements would extremely powerful for rendering different things for different languages, but wouldn't be serializable to JSON.

2. If we support only strings, should we store them as JSON or JavaScript?

Retrieving strings as JSON would allow us to pull translated values from the database, but if they're stored in the repository there's not much advantage to JSON. Meanwhile, storing them in the repository, as code, opens up a lot of new possibilities, including elements as values.

3. Since we'll have to load a language's values dynamically, should we store and load those values individually (on a per-component basis) or as a batch (on a per-language basis)?

Loading items individually would minimize and distribute payloads, but it would make many parts of the site render asynchronously. Loading items as a batch would optimize rendering, but it would result in overserving, and centralizing translation could be complex.

fooman commented 5 years ago

Thanks for the update @jimbo. One of the things that I'd like to see included in the open questions is how do we make sure that all the previous translations efforts that have gone into Magento can be benefitted from (for example here)

niklaswolf commented 4 years ago

Just my two cents from our project: We built a really simple translation system ourself using Context and a useTranslation-Hook which is giving access to the translation-API. We started only with plain text translations, but there are some use cases where you want to replace some variables with components.

Just as an example "I accept the %terms% and confirm that I have read and understood the %cancellation% and the %privacy%." where you want to replace the placeholders with Link Components to link to the corresponding site. So I think this should be kept in mind when making the decision.

I'd not allow to use the translation for complete component replacement, I think this could be done by exposing a LocaleContext or StoreConfigContext or something like that, which then can be consumed and used seperately (also for currency etc.) to switch components by locale.

For the library it could make sense to store the translations on component basis. In our storefront, we don't do this, but it also is no library so it doesn't get used by other applications. Like this components would be really easy to pull in other applications, but it also would be very tricky to use existing translations like @fooman noted.

chris-brabender commented 4 years ago

Currently working on a solution for this, should have something to demo in the next week or two. Fetches translated phrases from Magento backend at build, generates translation json files and and uses i18next for the translation engine.

sirugh commented 4 years ago

@chris-brabender interesting - how does it serve the languages for runtime?

niklaswolf commented 4 years ago

Just want to add, that as the developer of the PWA I don't want to have to edit Magento translation files just to add some custom translations for example. If it would be like this I'd have to setup a development instance just to add translation strings, this would make no sense in my opinion!

vasilii-b commented 4 years ago

@niklas-wolf wouldn't you need a development instance to grab other data like products, categories, system configs, etc. etc.? 🤔

vasilii-b commented 4 years ago

btw, may I give it a try? Which solution should we take here? https://github.com/i18next/react-i18next? Thank you!

niklaswolf commented 4 years ago

@niklas-wolf wouldn't you need a development instance to grab other data like products, categories, system configs, etc. etc.? 🤔

Yes but this can be managed by backend devs. As a frontend dev it would be very annoying to have to touch the backend instance just for translations. I see why it’s handy to use Magento translations in some way, as there is much that already exists. But I think there should at least be an option to add translation strings on the frontend side.

vasilii-b commented 4 years ago

I think I see your point, @niklas-wolf. You want to be able to adjust, add new translations from the custom PWA theme, let's say. Whereas the main translations come from the backend (as an option). Did I get it right?

LucasCalazans commented 4 years ago

I agree with @niklaswolf. I think that the phrases that are in the components should be translated by PWA Studio and those who came from Magento should be translated by the modules with the store language, with that Magento will translate it on the APIs using the default "__" method.

Jordaneisenburger commented 4 years ago

@chris-brabender, that sounds promising! Would this be something you could show on fridays community sync ?

vasilii-b commented 4 years ago

@Jordaneisenburger, @LucasCalazans what if the theme (pwa) wants to adjust the translation that comes from the Magento? How this can be covered?

Jordaneisenburger commented 4 years ago

So I haven't seen @chris-brabender 's code but I assume there would be some sort of deep merge

LucasCalazans commented 4 years ago

@chris-brabender any news about your implementation? I'm available to contribute to this issue.

Also, about the libraries, what do you guys think is best to use?

https://github.com/i18next/react-i18next https://github.com/formatjs/react-intl

In my opinion, i18next/react-i18next may be better because it supports translations with tags (https://react.i18next.com/guides/the-drawbacks-of-other-i18n-solutions#can-you-translate-combined-jsx-nodes-in-one-sentence)

LucasCalazans commented 4 years ago

@vasilii-b Maybe it could be done by a response code.

Normally we have a code inside the response body that says what is the message.

chris-brabender commented 4 years ago

Hey all here is a preview of where I am at right now with the implementation.

General concept: Translation files

Screen Shot 2020-03-20 at 4 13 09 PM

GraphQL

Localization hook

Router

Code Link to code (still WIP)

Video Walkthrough YouTube

Notes:

chris-brabender commented 4 years ago

Status update

Process overview

  1. import useLocalization hook from @magento/peregerine to all components with a phrase to be translated
  2. Wrap phrases in _t() function
  3. Execute yarn run fetch:translations to extract phrase and fetch any available translations from the magento backend
  4. Create a core.json file in venia-concept locales for any phrases that may be missing (some examples included for locales folder structure)
  5. Build and enjoy localised content
davemacaulay commented 4 years ago

Huge shoutout to @chris-brabender for providing the PR to help our initial investigation. This gave us a huge step up and allowed my team to truly understand how a solution can be built.

A subsequent shoutout to my team members @anthoula and @eug123 for working on this discovery work and helping define the future of this feature set in PWA Studio.

After a research effort including polls, discussions and online discovery we formulated a high level plan to execute upon to deliver what we believe as the simplest and best solution to provide localization as a first party feature to PWA Studio. 🥳

We had a few basic criteria when we embarked on this journey and these set the basis of our decision making:

We intend to build this feature set in an iterative approach, so you will over time see new aspects of our vision created. For instance the initial iteration of this functionality may not include a feature you'd expect, but should still provide some value.

With the above in mind we embarked on answering what we saw as important questions first:

Which localization React library should we use?

There are a few good contenders in this field, i18next and react-intl provide the most complete offerings, with a few new comers and alternatives also in the mix (such as LinguiJS). As we required a stable well supported library all of the new comers quickly fell off our list of potential libraries.

We performed various bundle, performance and developer experience tests on both libraries and opted to use react-intl, the items which elevated this above i18next were:

How to retrieve translation strings to be consumed by the chosen library?

The next large problem we wanted to address was how to collect all the required language packs to provide to the localization library in order to actually localize the content.

There were 2 clear leaders in this area, the first as @chris-brabender had implemented in his amazing PR, to collect the strings during build and then request the relevant translations via a GraphQL request to Magento. The alternative was to store the language packs locally within the scope of the PWA Studio app (within npm packages, within the scaffolded application etc.).

After a community poll (all be it missing enough context, so maybe not worthwhile mentioning), internal review and a discussion between interested parties inside of Magento we opted to lean on our "simple and quick" requirement and opt to store the language packs locally within the app. Furthermore we choose this option due to:

As part of choosing this solution we did highlight a few points that are integral when planning, grooming & implementing this feature set:

As part of this strategy we opted to use PWA Studio's specialFeatures flag for a package to specify if it contains translations or not, during the build process we will navigate to all packages and locations which contain translations and merge them into a final version for consumption by the library.

Conflicting translation keys will serve as overrides based on the BuildBus order of those packages, any translation in the scaffolded src directory will have the highest priority enabling overrides throughout the whole app.

How much of multi store will be required to make the dream work?

When dealing with localization in Magento you inevitably start touching aspects of multi store. We have a future epic of work to complete a full multi store experience but wanted to understand how much of this feature was required purely for localization.

From our investigation we discovered that we would need to build some of the basic concepts, notably the store view switcher and currency switcher from Luma. For the localization work we will aim for parity against the same feature sets as Luma. We will also include the functionality to include store code prefixes in URLs if the store has this configured. However, we will not include the footer store switcher as part of the localization work, as we deem this purely a feature of multi store.

User locale detection

We intend to include a method of directing a shopper to a specific store and locale based on their Accept-Language header and potentially other mechanisms, we have yet to fully investigate and expand on this feature but shall add more details in time.

If you didn't find the Harry Potter reference half way through this post go find it for 5 house points!

larsroettig commented 4 years ago

@davemacaulay but there is no PR you already something that we can review?

davemacaulay commented 4 years ago

@larsroettig we do research upfront to determine how to execute upon a feature of work, these are the findings from our initial research spikes which will help direct us in our implementation.

There will be a number of small iterative PRs opening over the coming weeks to get us to this end goal.

fooman commented 4 years ago

First such PR https://github.com/magento/pwa-studio/pull/2643

davemacaulay commented 4 years ago

We have started our implementation of react-intl, part of it's API requires an ID within the code to identify which string you wish to present on the page in the users designated language. We've identified the need to define the format of these ID's and we've formed 2 potential solutions.

Use en_US version of string

The first option aligns with the translation implementation of Magento Core, and looks to utilize the English version of the string as the ID.

Component:

<FormattedMessage id={'Translated string'} defaultMessage={'Translated string'} />

en_US.json

{
    "Translated string": "Translated string"
}

Pros:

Cons

Generate ID's based on the component purpose / context

This option uses a string ID to identify which string we want to present in the UI. This would be composed by the component name and purpose of the string.

Component:

<FormattedMessage id={'component.translatedString'} defaultMessage={'Translated string'} />

en_US.json

{
    "component.translatedString": "Translated string"
}

Pros:

Cons:

fooman commented 4 years ago

I'd like to add some considerations:

Pro using IDs:

in the past developers where not aware that the initial en_US translation doubled up as the lookup for translations. This lead to the following scenario:

initial string: "This is an en_US string", "This is an en_US string"

translations happen "This is an en_US string", "Dies ist ein de_DE Satz " [...]

Now someone comes along and doesn't like "This is an en_US string". The correct approach to changing it would be

"This is an en_US string", "This is an improved en_US string"

what did happen was

"This is an improved en_US string", "This is an improved en_US string"

breaking all existing translations.

Con using IDs:

Existing community tooling and existing translations would be harder to re-use. Ideally PWA Studio would re-use as much of the already existing translations provided by the Magento community as to not require another huge undertaking for the community. As a mitigation for this Con I am hoping some extra tooling can be created that translates "This is an en_US string", "This is an en_US string" to "component.translatedString", "This is an en_US string" - it won't be context aware but may be good enough for a first iteration.

brendanfalkowski commented 4 years ago

My two cents. Lessons from managing ~5000 translations across 3 languages for my most complicated project.

  1. We never used the "language packs" provided by Magento because we had so much custom language.

  2. Due to (1), I personally find it far simpler to manage the "default presented language" directly in code by editing code (easy to "find all" in project). We never had a goal to avoid touching strings in code so that OOB translations could be used to do that too. Too much abstraction and not enough benefit (for us).

  3. We did use "en_root to en_us/en_ca" translations heavily to avoid template edits. Ex: "Select region, state, province" to "Select state"/"Select province".

  4. We did constantly have trouble where "English to English" translations were utilized, because translators would translate the origin phrase (not the localized phrase). Ex "en_root ➔ fr_ca" not "en_ca ➔ fr_ca". It was possible to manipulate our translation management app to show this, but our translators didn't do it consistently.

  5. We never used the "context" feature: Identical strings in different contexts would be translated to the same string. All translations were simple phrases/sentences and we found they could always be global.

  6. We have strings coming from third-party APIs, and we do need to translate it. We could "slugify" this into an ID format but it needs to be clientside.

  7. Flipside of (6), we have strings coming from first-party APIs, and we can translate them pre-delivery (we do this in our PIM too). So there are cases the "root string" wouldn't need to be translated at all.

  8. Sometimes a developer just changes a string, and doesn't change the string in our translation management tool. We never found a good way to automatically keep these in sync / spot issues.

My gut says an ID-based system would be more resilient for us because of (4) and (8). I'm not sure if making it component-based (not global) would be an improvement or not.

sirugh commented 4 years ago

Identical strings in different contexts would be translated to the same string.

Not sure this is a con - we may want to start with global strings and only specify context when necessary.

"global.greeting": "Welcome to Foo Store!"
"signIn.greeting": "Welcome to Foo Store, {username}!"

Ideally PWA Studio would re-use as much of the already existing translations provided by the Magento community as to not require another huge undertaking for the community.

@fooman I understand the concern here but as far as I know we haven't been using existing translations/strings for the UI so any matches would be coincidence. If we use the english string key pattern or the context.id pattern there are still likely many new strings that won't have matches and will need to be translated.

That said, it may be trivial to implement a tool that generates some initial translation files for PWA Studio based purely on matches of string values. The following assumes we use the context.id pattern but would work either way.

  1. New message in PWA Studio's i18n/en_US.json:
"myComponent.message": "This is an en_US string"
  1. Find key in legacy (default locale?) translation by matching on value:
// legacy en_US translation
"legacy key string": "This is an en_US string",
  1. Looks up all translated matches for key:
// legacy de_DE translation
"legacy key string": "Dies ist ein de_DE Satz"
  1. Creates PWA Studio's i18n/de_DE.json:
"myComponent.message": "Dies ist ein de_DE Satz"
supernova-at commented 4 years ago

For developers whose primary language is English the english string key pattern seems like a way to simplify things but I think it only adds confusion to an already complicated system.

It is interesting to hear that translations rarely need to be context-aware. I definitely considered this a con. In that case, I like the

"global.greeting": "Welcome to Foo Store!"
"signIn.greeting": "Welcome to Foo Store, {username}!"

idea, but unfortunately I'd expect the subtle context differences needed in translation to be only uncovered at the time of translation. In other words, after a developer would have needed to decide whether to create a separate key or not.

As a developer, the context.id pattern fits nicely into my mental model of a "lookup".

davemacaulay commented 4 years ago

From internal discussions / comments along with @fooman and @brendanfalkowski comments we're going to move ahead with using ID's based on the component name then the description of the string.

If anyone has a big reasoning as to why we shouldn't do this we can alter path with good enough reasoning.

fooman commented 4 years ago

@fooman I understand the concern here but as far as I know we haven't been using existing translations/strings for the UI so any matches would be coincidence. If we use the english string key pattern or the context.id pattern there are still likely many new strings that won't have matches and will need to be translated.

I guess this goes at the heart of my request. Is there any way to make this less of a coincidence? So after the process you outlined to find if we can translate via a coincidental legacy string can we include another pass to see if we really must introduce a new string.

Practical example here https://github.com/magento/pwa-studio/pull/2721/files#diff-8d2cdbfbfb3f2a21b78dd839ac9f59d1R82 if we used "You have placed no orders." instead of "You don't have any orders yet." we get translations for "free".

If there really is a must for Venia to use "You have placed no orders." I would suggest adding "You don't have any orders yet." in venia-ui and supply the "You don't have any orders yet." translation in venia-concept. With the venia-ui strings being used to perform the legacy lookups/matching.

sirugh commented 4 years ago

Much of the pwa copy is just UI direction or our own best guess. I don't see why we can't attempt to reuse existing copy. However the system were opting for is context aware whereas existing copy may not be, at least not to PWA Studio. This could present a challenge to reuse. Perhaps we should open an issue to suss out some tooling and explore our options. Just need some existing translation files and time :)

fooman commented 4 years ago

Existing translations are here https://crowdin.com/project/magento-2 or better yet the final csv output files can be found here https://github.com/magento-l10n?utf8=%E2%9C%93&q=%5Elanguage-

https://github.com/magento-l10n/language-nl_NL/blob/master/nl_NL.csv for example is showing as 99% translated.

brendanfalkowski commented 4 years ago

@davemacaulay Just a suggestion, because I'm going to type this 10,000 times by next year. The component/prop naming is a handful.

<FormattedMessage id={'Translated string'} defaultMessage={'Translated string'} />

would be better as:

<T id="global.greeting" t="Hello world" />

Would also accept <Tran> or <Translate>. Just trying to save a few thousand keystrokes.

davemacaulay commented 4 years ago

@brendanfalkowski we opted to not abstract or alias the functions so the react-intl documentation would hold true: https://formatjs.io/docs/getting-started/installation

You can certainly create a wrapping component or import with an alias to reduce your keystrokes, but we believe holding true to the original frameworks API will assist developers in their day to day and make it much easier to discover solutions to problems using our favorite developer resource, Google!

brendanfalkowski commented 4 years ago

@davemacaulay I understood that reference: gripes go up the chain of command.

https://www.youtube.com/watch?v=dKbdE5LOGNQ

I should be complaining to react-intl :) Making that wrapper now...

chris-brabender commented 4 years ago

I am not entirely sold on the id's implementation to be honest. If, for example, I want to update all Apply buttons to french, I need to do it 100 times in my translation file for each component? Eg:

<FormattedMessage id={'addressBook.applyButtonText'} defaultMessage={'Apply'} />

One one side it means I can target things specifically but creates a lot of overhead doesn't it?

Also means we can't re-use a lot of the translation files we already have without a lot of work to update them to the id based identifier.

sirugh commented 4 years ago

If, for example, I want to update all Apply buttons to french, I need to do it 100 times in my translation file for each component?

Also means we can't re-use a lot of the translation files we already have without a lot of work to update them to the id based identifier.

@chris-brabender I spent a little time on this script that can scan over whatever files you need to generate new translation json files for you assuming there are hits in the legacy files. It also can show you what keys/strings did not have matches.

I was hoping to take a lot of the manual work out of the process, and as @fooman mentioned we could go further with it and do some sort of fuzzy search for the missed strings to see if there is existing legacy copy that we can use in place of the string in pwa studio.

davemacaulay commented 4 years ago

@chris-brabender I think for known common strings we just wouldn't namespace them. So things like "Apply", "Default", "Save" could have generic global ID's like apply, default & save.

brendanfalkowski commented 4 years ago

@chris-brabender @davemacaulay I'd suggest taking this one step further:

I think for known common strings we just wouldn't namespace them. So things like "Apply", "Default", "Save" could have generic global ID's like apply, default & save.

Make everything global by default, and only introduce namespaces if you actually find conflicts in usage. That'll help with maximum re-use and won't step on extensions who choose to namespace. I'd also personally hate seeing cross-component imports like XXX incorporates YYY.someSpecificString simply because that component already has it.

tjwiebell commented 4 years ago

Most design is completed and remaining features will be delivered very soon. Closing in favor of those PRs and internal issues.

phtmgt commented 3 years ago

Has someone taken on replacing strings with in all jsx files as well as producing a matching en_EN.json? I can take on that, if it’s not already done.

sirugh commented 3 years ago

Has someone taken on replacing strings with in all jsx files as well as producing a matching en_EN.json?

That work should have been done, but it is possible we missed some. Would certainly welcome any pull requests for strings we missed!