Feature/i18next scanner

JohnRDOrazio commented 11 months ago

Attempt to implement i18next-scanner.

As of the latest commits in the PR, there is just one minor detail to try to work out (which can be done even after the PR is merged):

While the configuration option defaultValue works fine for t() functions, however for Trans components the string value in English is being set in all language files, whether English or not. Instead it would be desirable for an empty value to be set in non English language files, while the source string is correctly set in the English language file. This perhaps needs to be handled within the customTransform function?

(I haven't tested i18next-parser, but it seems to suffer the same limitation, see https://github.com/i18next/i18next-parser/issues/206)

netlify[bot] commented 11 months ago

Deploy Preview for confessit-web ready!

Name	Link
Latest commit	3b4aa5d6e9c4efc7404bec2910bb362450faabc4
Latest deploy log	https://app.netlify.com/sites/confessit-web/deploys/64713238a51f79000897eeb6
Deploy Preview	https://deploy-preview-60--confessit-web.netlify.app/
Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

kas-catholic commented 11 months ago

Thanks for implementing some of this! I'll take a closer look at this soon - hopefully by this weekend. I'd like to better understand what it does and see if we can integrate it with GitHub Actions.

kas-catholic commented 11 months ago

I'm a little skeptical of i18next-scanner because it seems like a lot of complexity, I don't want to have to remember how all this works and deal with it a year from now next time something breaks or I upgrade something.

IMO the best solution would be the simplest thing that can just fail a check on a PR if a new translation string is introduced that isn't in en/translation.json. Seems like perhaps i18next-parser can do what I want with failOnUpdate: true, but I haven't looked in detail yet.

JohnRDOrazio commented 11 months ago

I'm a little skeptical of i18next-scanner because it seems like a lot of complexity, I don't want to have to remember how all this works and deal with it a year from now next time something breaks or I upgrade something.

IMO the best solution would be the simplest thing that can just fail a check on a PR if a new translation string is introduced that isn't in en/translation.json. Seems like perhaps i18next-parser can do what I want with failOnUpdate: true, but I haven't looked in detail yet.

Rather than complexity, I would say that i18next-scanner offers a great comodity: it will ensure that the source translation file is: 1) COMPLETE (I can tell you that when running it for the first time, it picked up at least one translation key that was missing!) 2) COHERENT (it will reflect what is actually in the source code, no more no less)

I have a hunch you don't yet have a clear idea of the localization workflow, but that comes with actually using weblate and getting a feel for the workflow. i18next-scanner will actually make for a better and more efficient workflow.

Here is the outline of the general workflow:

FIRST EXTRACTION

1) Source files are prepared for i18n by implementing translation functions / components 2) Translation keys are extracted from the source files into the source translation file (usually English): an automated process such as i18next-scanner will ALWAYS be better than a manual process which is prone to human error 3) Translation keys are automatically populated into other translation files (this doesn't necessarily have to be handled by the extraction tool, say i18next-scanner; this can be handled quite nicely by the translation tool, let's take for granted Weblate)

SOURCE FILES CHANGED

1) Source files are added / edited / removed, there are new or defunct translation keys or source translation strings have changed 2) These changes will need to be reflected in the source translation file (English): an automated process such as i18next-scanner will perfectly reflect these changes 3) If the extraction tool (i18next-scanner) hasn't already added new keys or removed old keys from other translation files, the translation tool (Weblate) will update other translation files with the new keys, or remove defunct keys, and will mark as dirty those translations where the source string has changed 4) Translators will be invited to translate new strings and fix dirty translations

Taking into account this workflow, when a source translation string changes, you don't have to worry about other translation files not corresponding with it just yet. This does not need to be handled in the same PR. This needs to be handled in Weblate. Once translators will have updated their translations, such that there are no more dirty translations, then the translation files will again reflect the source translation. And that will happen in a future PR from Weblate.

I have a hunch you're not very trusting in a community spirit here. I get that you want to have everything under control, since you created the project initially. But I think we need to learn to rely on each other: maybe not everyone will remember how everything works, but if everyone takes responsibility for the part that they are contributing, then one year down the road when you say "how did that i18next-scanner tool work again?", you can always say "oh was it John that contributed that, maybe we can ask him". That's where a project becomes a community project.

I've been using i18next-scanner in other projects, and it works very much like i18n-tasks in Ruby on Rails projects. Once you start using them, especially in a project that's evolving, you start to realize how efficient they are and how they help streamline the process of localization and translation. And you find pretty much the same streamlined process whether in Node or in Ruby on Rails or perhaps even in other platforms.

I have had no trouble at all in using i18next-scanner in other React app projects, however I was not using the Trans component in those other projects. So in this case, I just had to understand how the i18next-scanner tool works with the Trans component. I pretty much have it tamed at this point: the html nodes are kept correctly instead of producing indexed nodes. The only thing that may need a little more looking into, is how to prevent the source string in English from being populated to other language translation files, but this is not a defining aspect. I'm sure I'll eventually find a way to prevent this from happening, but if it does happen, it's not the end of the world: translators will simply find an English translation that needs to be substituted with an actual translation. Or a PR with new translation strings will just need to manually remove the English string from other translation files, no big deal.

JohnRDOrazio commented 11 months ago

Specifically, the keys welcome.body and help.step_1 were missing from all translation files. I discovered this by usage of the i18next-scanner tool (I used it from the command line, all you need to do is issue i18next-scanner once the package is installed and the config file is in place, see commit 1e6ed0f). This is where an automated tool such as this comes in really handy.

JohnRDOrazio commented 11 months ago

https://github.com/kas-catholic/confessit-web/pull/60#discussion_r1208679546

I think I see what you mean about the indexed tag being produced by the i18next-scanner tool: that's because the Trans component wasn't setup correctly. This is fixed in PR #64 , which is why I would suggest to first pull in PR #64 . Then this feature/i18next-scanner branch would need to be rebased with the latest main branch, and then it can be correctly implemented and pulled.

kas-catholic / confessit-web