daniel-sc / ng-extract-i18n-merge

Extract and merge i18n xliff translation files for angular projects.
MIT License

Changes to Source strings are overwritten and ng-extract-i18n-merge reverts translations #75

Closed FrostKiwi closed 11 months ago

FrostKiwi commented 1 year ago

Describe the bug: We use Weblate to translate our Angular web app. Translators find typos in the English sourceFile and correct them, and the corrections show up in the /en localization. In the English sourceFile (.xlf), it looks like this. Before correction:

...
      <trans-unit id="UploadExperimentRecords" datatype="html">
        <source>Upload Experiment Records</source>
        <context-group purpose="location">
          <context context-type="sourcefile">src/app/app-routing.module.ts</context>
          <context context-type="linenumber">26</context>
        </context-group>
        <note priority="1" from="description">header</note>
      </trans-unit>
...

After correction:

...
      <trans-unit id="UploadExperimentRecords" datatype="html" xml:space="preserve" approved="no">
        <source>Upload Deposition Records</source>
        <target state="translated">Upload Deposition Records</target>
        <context-group purpose="location">
          <context context-type="sourcefile">src/app/app-routing.module.ts</context>
          <context context-type="linenumber">26</context>
        </context-group>
        <note priority="1" from="description">header</note>
      </trans-unit>
...

Here is how it looks in Weblate: image

Here comes the problem. As soon as ng-extract-i18n-merge does its thing, it misinterprets those source string changes and deletes the translator's change. image

The id is a unique identifier, so ng-extract-i18n-merge should theoretically be able to understand what's happening here, but unfortunately it does not. Committing this merge kicks off tons of warnings in Weblate and requires a look-up in the translation history and a revert to the previous state.
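To make the id argument concrete, here is a minimal sketch of how units could be paired by id to spot a translator's source correction. It only illustrates the idea and is not how ng-extract-i18n-merge works internally; the `TransUnit` and `SourceChange` types and the `findEditedSources` function are hypothetical names, and the units are plain objects rather than parsed XLIFF.

```typescript
// Hypothetical, simplified view of a <trans-unit>; not the library's real data model.
interface TransUnit {
  id: string;
  source: string;
  target?: string;
}

// One detected difference between the code and the translator-edited file.
interface SourceChange {
  id: string;
  extractedSource: string; // text currently extracted from the html/ts source code
  editedSource: string;    // <source> text found in the translator-edited file
}

// Pair units by id and report those whose <source> text differs.
function findEditedSources(extracted: TransUnit[], edited: TransUnit[]): SourceChange[] {
  const editedById = new Map<string, TransUnit>();
  for (const unit of edited) {
    editedById.set(unit.id, unit);
  }

  const found: SourceChange[] = [];
  for (const unit of extracted) {
    const match = editedById.get(unit.id);
    if (match !== undefined && match.source !== unit.source) {
      found.push({ id: unit.id, extractedSource: unit.source, editedSource: match.source });
    }
  }
  return found;
}

// The unit from this report, before and after the translator's correction.
const changes = findEditedSources(
  [{ id: 'UploadExperimentRecords', source: 'Upload Experiment Records' }],
  [{
    id: 'UploadExperimentRecords',
    source: 'Upload Deposition Records',
    target: 'Upload Deposition Records',
  }],
);
```

With the two units above, `changes` holds a single entry for `UploadExperimentRecords`, which is the information a merge step would need in order to keep the correction instead of discarding it.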

It gets worse for the translated files. Here in the Japanese version, the source string change is reverted and ng-extract-i18n-merge thinks this is a new, untranslated string, changing its state from translated to new. image

daniel-sc commented 1 year ago

Hi @FrostKiwi

as this library relies on the angular extract-i18n builder, it is not easily possible to write back updated source texts to the html/ts source code. So, currently the developer (or whoever is merging the translated files back into the repository) should notice these changes and update the source code accordingly.

I agree, it would be helpful to have a way for translators to change the source language texts.

Would the following approach solve your issue?

  1. The translator changes or inserts a target text in the sourceLanguageTargetFile (but keeps the source text unchanged, i.e. handles the "en" file like a translation from en to en.)
  2. ng-extract-i18n-merge does not overwrite this changed target, so that this changed text is used when building with this language file. (Note that normally ng serve does not use language files, hence there the original/unchanged text is still used!)
  3. An info/warning is logged during the i18n extraction, hinting the developer to update the source code from the original text to the changed target.
  4. After the developer changed the source code to match the updated target text, the other target language files will be updated by the next run of ng-extract-i18n-merge and keep their existing translation in their original state (so the state does not change to new for this specific case).
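
The proposed steps for the source-language file can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation (which does not exist yet at this point in the thread): `SourceLanguageUnit`, `MergeResult` and `mergeSourceLanguageUnit` are hypothetical names, a target that differs from its source is assumed to mean "edited by the translator", and the handling of the other target language files from step 4 is left out.

```typescript
// Hypothetical, simplified unit of the source-language target file (the "en" file).
interface SourceLanguageUnit {
  id: string;
  source: string;
  target: string;
}

interface MergeResult {
  unit: SourceLanguageUnit;
  warning?: string; // set when the developer should update the source code
}

// `existing` is the unit currently in the file; `extractedSource` is the text
// freshly extracted from the html/ts source code.
function mergeSourceLanguageUnit(existing: SourceLanguageUnit, extractedSource: string): MergeResult {
  // Assumption: a target that differs from its source was edited by a translator.
  const translatorEdited = existing.target !== existing.source;

  if (!translatorEdited || existing.target === extractedSource) {
    // Either nobody touched the target, or the developer has adopted the
    // translator's text in the code (step 4): keep source and target in sync.
    return { unit: { id: existing.id, source: extractedSource, target: extractedSource } };
  }

  // Steps 2 and 3: keep the translator's target and hint at the code change.
  return {
    unit: { id: existing.id, source: extractedSource, target: existing.target },
    warning: `Unit "${existing.id}": update source code from "${extractedSource}" to "${existing.target}"`,
  };
}

// Step 1: the translator changed only the target of the "en" unit.
const pending = mergeSourceLanguageUnit(
  {
    id: 'UploadExperimentRecords',
    source: 'Upload Experiment Records',
    target: 'Upload Deposition Records',
  },
  'Upload Experiment Records',
);

// Step 4: the developer updated the source code to the corrected text.
const adopted = mergeSourceLanguageUnit(pending.unit, 'Upload Deposition Records');
```

In this sketch `pending` keeps the translator's target and carries a warning, while `adopted` has matching source and target and no warning.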

Note: This is not implemented yet, as I'd rather have a confirmation beforehand, that it actually solves the problem at hand.

FrostKiwi commented 1 year ago

This sounds great! Definitely better than the manual patch work required now.

> 1. The translator changes or inserts a target text in the sourceLanguageTargetFile (but keeps the source text unchanged, i.e. handles the "en" file like a translation from en to en.)

So, leave <source> untouched, and only change <target> in the sourceLanguage messages.xlf? As far as I understand, that would require a change to how Weblate works, since changes to a source string change <source>. I don't see anything in the documentation to change this. Couldn't all this be handled by looking at just the id alone? If this point needs to happen, then I'll ask the Weblate guys. Beyond that, the translator works in a translation interface like Weblate, so such technical details will never concern them. They are not supposed to know that an .xlf file even exists.

> 2. ng-extract-i18n-merge does not overwrite this changed target, so that this changed text is used when building with this language file. (Note that normally ng serve does not use language files, hence there the original/unchanged text is still used!)

True, which is why we changed our angular.json to treat the default en as a localization like any other. So for us, it does the replacement automatically when building with ng build --localize or when testing with ng serve --configuration=en

...
            "i18n": {
                "locales": {
                    "en": "src/locale/messages.xlf",
                    "de": "src/locale/messages.de.xlf",
                    "ja": "src/locale/messages.ja.xlf",
                    "ru": "src/locale/messages.ru.xlf"
                }
...
                        "en": {
                            "localize": [
                                "en"
                            ]
                        },
...

As for the rest of the workflow, I think this is a really good workflow and would be happy if it was possible.

daniel-sc commented 1 year ago

@FrostKiwi thanks for the fast answer!

There is one little thing with your setup, which probably resolves the issue with Weblate: Instead of using messages.xlf for locale "en", you could introduce a messages.en.xlf for this and use the setting sourceLanguageTargetFile: "messages.en.xlf". This would make messages.en.xlf a translation file like the ones for "de" etc. with the exception that targets are automatically created/synced with changed source code (if not omitted via config, and the target was not changed by the translator). The messages.xlf then would only contain the original texts from the source code and could be ignored by Weblate.

Would that make sense/work?

daniel-sc commented 12 months ago

@FrostKiwi Ping :)

FrostKiwi commented 11 months ago

Sounds good, I'll try setting that up <3

FrostKiwi commented 11 months ago

@daniel-sc

> Instead of using messages.xlf for locale "en", you could introduce a messages.en.xlf for this and use the setting sourceLanguageTargetFile: "messages.en.xlf". This would make messages.en.xlf a translation file like the ones for "de" etc. with the exception that targets are automatically created/synced with changed source code

I implemented this:

                "extract-i18n": {
                    "builder": "ng-extract-i18n-merge:ng-extract-i18n-merge",
                    "options": {
                        "browserTarget": "<redacted>:build",
                        "outputPath": "src/locale",
                        "targetFiles": [
                            "messages.en.xlf",
                            "messages.ja.xlf",
                            "messages.de.xlf",
                            "messages.ru.xlf"
                        ],
                        "sourceFile": "messages.xlf",
                        "sourceLanguageTargetFile": "messages.en.xlf",
                        "includeContext": true,
                        "newTranslationTargetsBlank": "omit",
                        "collapseWhitespace": true,
                        "trim": true
                    }
                },

Seems to work, though we didn't have English -> English corrections yet. Porting the English -> English translations back to the source code was required, so i18n-extract wasn't confused.

Weblate throws this warning:

Duplicated translation.

The component contains translation file for the source language. Please consider the following:

The following occurrences were found:

Language | Language codes | File names
English | en, en | frontend/src/locale/messages.en.xlf, frontend/src/locale/messages.xlf

Seemingly there are two solutions.

What is the correct step here to go with our proposed workflow? From my understanding it should be changing messages.xlf to English (Developer). Is my understanding correct?

daniel-sc commented 11 months ago

@FrostKiwi I don't know about Weblate, but essentially messages.xlf should be ignored for translation purposes.

daniel-sc commented 11 months ago

@FrostKiwi finally implemented it - would you mind trying the alpha version 2.8.0-0 and giving feedback if it works as expected?

FrostKiwi commented 11 months ago

Back at work in 10 days, looking forward to testing this 👍

FrostKiwi commented 11 months ago

Seems to have worked fine! In one instance it reverted the translated state to new because the source string changed. But maybe that was just because the switch to separate messages.xlf and messages.en.xlf files was made after this translation was performed. image

That was the only issue for now. I'll see how it handles translations of source strings once new translations arrive.