daniel-sc / ng-extract-i18n-merge

Extract and merge i18n xliff translation files for angular projects.
MIT License
170 stars 18 forks source link

Interpolations are not normalized #98

Closed mbey-mw closed 7 months ago

mbey-mw commented 7 months ago

Describe the bug Source of the problem can be seen after saving a translation with lokalize (23.08.1) where the following happens:

       <trans-unit id="ExampleId" datatype="html">
-        <source>ID: <x equiv-text="{{nextId | async}}" id="INTERPOLATION"/></source>
-        <target state="signed-off">ID: <x equiv-text="{{nextId | async}}" id="INTERPOLATION"/></target>
+        <source>ID: <x id="INTERPOLATION" equiv-text="{{nextId | async}}"/></source>
+        <target state="signed-off">ID: <x id="INTERPOLATION" equiv-text="{{nextId | async}}"/></target>
       </trans-unit>

Interpolations in source/target are not normalized when converted to strings which will be compared. This can result in reset state of the translated targets or other constant changes when modifying translations.

Expected behavior The order of attributes in the x tags makes no change to the comparison result. The normalization process could simply sort the attributes.

Version (please complete the following information):

daniel-sc commented 7 months ago

@mbey-mw thanks for this issue and the PR! There are two points I am not sure about - maybe you already thought about it:

  1. Does the attribut order actually change? If so, when?
  2. The proposed solution will update all existing attributes, leading to huge changes for all users - would it be possible to change only the actual comparison - e.g. in Merger.normalize
mbey-mw commented 7 months ago
  1. Order of the Attributes

    The order of the attributes changes in our translated files when they are saved. lokalize does it as described in the bug. Maybe other tools do this as well - in the end it shouldn't make a difference which tool you use.

    In our workflow, we simply run ng-extract-i18n-merge before submitting the translations, and the linter checks this. Only normalized files should be stored in the repository.

  2. Breaking Change?

    When comparing two XML nodes, the order of the attributes can change without the "value" of the node changing - in contrast to the order of the sub-nodes. This is the technical challenge when comparing sources and targets. If the XML file is viewed as a text file, a comparison cannot adequately resolve this. It is therefore necessary to always store the same normalized version of the files in Git - which is why Merger.normalize does not seem to be the right place.

    It is also about the content of the source and target tags that is generated when the files are read in. This is regarded as a string and not as a collection of child elements - this would entail a major modification of the software.

    Since the problem of missing normalization has not yet been reported as such, the attribute order of the source file (messages.xlf) seems to be mostly appropriate. In order to minimize changes to this assumed normal case, you could specify an order for certain attributes and only sort the others alphabetically.

    In our XLIFF 1.2 files, there are only x tags where attributes are generated in the following order: ctype, equiv-text, id. In the tests there are ph tags with attributes id, equiv, disp. This could be used to specify a sequence. Should this be a configurable setting? What default value should it have?

    [
     "ctype",
     "equiv-text",
     "id",
     "equiv",
     "disp"
    ]

    An empty list would work for our case, where the attributes are already generated in alphabetical order.

mbey-mw commented 7 months ago

Alternatively, an option to switch on attribute sorting could also be introduced. How would that be? Default value would be false / off. -> sortNestedTagAttributes