WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0

xliff : bad weblate key from `xliff/file//trans-unit/@resname` (same resname in two file tag) #4937

Closed apourche closed 3 years ago

apourche commented 3 years ago

Describe the bug

Importing an XLIFF file makes Weblate determine a key for each <trans-unit> tag. This key seems to be based on trans-unit/@resname when it exists, without taking the enclosing <file> tag into account. The XLIFF standard specifies neither this rule nor global uniqueness of @resname. It does, however, clearly require @id to be unique within a <file> tag (http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#trans-unit).

So, for @resname, it would make sense to enforce uniqueness only within the context of a <file> tag.

The determination of Keys by Weblate would become:

key = {file/@original} / {trans-unit/@resname}
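
The proposed rule can be sketched as follows. This is only an illustration built on Python's standard `xml.etree.ElementTree`, not Weblate's actual parsing code (which is not shown in this report). The function name `scoped_keys`, the plain `/` separator, and the fallback to @id when @resname is missing are assumptions.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, as declared on the <xliff> root element.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def scoped_keys(xliff_text):
    """Return one key per <trans-unit>, scoped by the enclosing <file>.

    Hypothetical helper illustrating
    key = {file/@original} / {trans-unit/@resname}.
    Falls back to @id when @resname is absent (an assumption).
    """
    root = ET.fromstring(xliff_text)
    keys = []
    for file_el in root.findall("x:file", XLIFF_NS):
        original = file_el.get("original", "")
        for unit in file_el.iterfind(".//x:trans-unit", XLIFF_NS):
            name = unit.get("resname") or unit.get("id")
            keys.append(f"{original}/{name}")
    return keys
```

Because the key includes file/@original, two units with the same @resname in different <file> tags no longer map to the same key.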

To Reproduce the bug

Steps to reproduce the behavior:

  1. Import Initial XLIFF file (see below) as 'translate document' component

  2. Problem: only one source string is created for the two <trans-unit> elements that share the same @resname in two different <file> tags. Confirmation of the problem (failing check): "The component contains several duplicated translation strings."

  3. Consequence of this problem: translating and exporting to another language produces an incomplete file, because the strings (wrongly) considered duplicates are not processed.

Expected behavior

The determination of Keys by Weblate would become:

key = {file/@original} / {trans-unit/@resname}

...or should be configurable

Screenshots

Initial XLIFF file :

<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2" xmlns:scxlf="extXliff">
    <file original="/rep1/myFile1.xml"  source-language="fr" datatype="plaintext">
        <body>
            <trans-unit id="1" resname="str1">
                <source xml:space="preserve">str1</source>
            </trans-unit>
            <trans-unit id="2" resname="strA">
                <source xml:space="preserve">strA</source>
            </trans-unit>
        </body>
    </file>
    <file original="/rep1/myFile2.xml"  source-language="fr" datatype="plaintext">
        <body>
            <trans-unit id="1" resname="str1">
                <source xml:space="preserve">str1</source>
            </trans-unit>
            <trans-unit id="2" resname="strB">
                <source xml:space="preserve">strB</source>
            </trans-unit>
        </body>
    </file>
</xliff>
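
To check a file for this situation before importing it, a small script along these lines could list every @resname that occurs more than once. It is a sketch using only the Python standard library; `colliding_resnames` is a hypothetical helper, and the sample is the file above with whitespace condensed and the unused `scxlf` namespace declaration dropped.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="/rep1/myFile1.xml" source-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" resname="str1"><source>str1</source></trans-unit>
      <trans-unit id="2" resname="strA"><source>strA</source></trans-unit>
    </body>
  </file>
  <file original="/rep1/myFile2.xml" source-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" resname="str1"><source>str1</source></trans-unit>
      <trans-unit id="2" resname="strB"><source>strB</source></trans-unit>
    </body>
  </file>
</xliff>"""


def colliding_resnames(xliff_text):
    """Map each @resname that occurs more than once to the
    file/@original values of the <file> elements where it occurs."""
    root = ET.fromstring(xliff_text)
    seen = {}
    for file_el in root.findall("x:file", NS):
        for unit in file_el.iterfind(".//x:trans-unit", NS):
            resname = unit.get("resname")
            if resname is not None:
                seen.setdefault(resname, []).append(file_el.get("original"))
    return {name: files for name, files in seen.items() if len(files) > 1}


print(colliding_resnames(SAMPLE))
# {'str1': ['/rep1/myFile1.xml', '/rep1/myFile2.xml']}
```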

Server configuration and status

Thank you for your answer

nijel commented 3 years ago

It was introduced by 6190db4ce64568b7ae03486bb46bdd82095c20b1, see #2218 and #2380. The XLIFF specification does not define the uniqueness of resname at all, and Weblate currently assumes it is unique within the whole XLIFF file. And nobody has complained so far :-).

If we strictly followed the specification, there is no requirement for resname to be unique, so it should not be used at all...

apourche commented 3 years ago

Thank you for your reply @nijel. The limits of standardization, and of its interpretation... :)

http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#resname :

Resource name - Resource name or identifier of a item. For example: the key in the key/value pair in a Java properties file, the ID of a string in a Windows string table, the index value of an entry in a database table, etc.

Without considering the uniqueness of @resname within the <file>, all of these examples are wrong: it becomes impossible to have a single XLIFF file dealing with 10 different Java properties files (with potentially redundant strings).

Hopefully xliff2.0 will be more explicit on this topic :)

We will therefore modify our XLIFF files to make them compatible with this Weblate rule...
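
One possible way to make such files compatible is to prefix each @resname with the file/@original of its enclosing <file>, so that it becomes unique across the whole document. The thread does not say how the files were actually changed; `make_resnames_unique` is a hypothetical helper and the `/` separator is an assumption.

```python
import xml.etree.ElementTree as ET

XLIFF_URI = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_URI}


def make_resnames_unique(xliff_text):
    """Prefix every @resname with its <file>'s @original.

    Returns the rewritten document as a string. Units without
    @resname are left untouched.
    """
    # Keep XLIFF as the default namespace when serializing.
    ET.register_namespace("", XLIFF_URI)
    root = ET.fromstring(xliff_text)
    for file_el in root.findall("x:file", NS):
        original = file_el.get("original", "")
        for unit in file_el.iterfind(".//x:trans-unit", NS):
            resname = unit.get("resname")
            if resname is not None:
                unit.set("resname", f"{original}/{resname}")
    return ET.tostring(root, encoding="unicode")
```

Whatever tool consumes the translated XLIFF would then need to strip the prefix again, or use the prefixed names as its own identifiers.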

Cordially.

PS: I discovered Weblate very recently, and it did not take long to choose it over Pootle. Great job! We will use Weblate to translate the different components of our tool (libre software): https://scenari.software/

nijel commented 3 years ago

Without considering the uniqueness of @resname within the <file>, all of these examples are wrong: it becomes impossible to have a single XLIFF file dealing with 10 different Java properties files (with potentially redundant strings).

That's a valid point. On the other hand, the existing implementation has been there for two years, and changing the behaviour would be a breaking change for existing users.

Hopefully xliff2.0 will be more explicit on this topic :)

It doesn't seem to be the case - the attribute name has changed, but the description is pretty much the same: http://docs.oasis-open.org/xliff/xliff-core/v2.1/os/xliff-core-v2.1-os.html#name

nijel commented 3 years ago

I don't see a big enough reason to change the behaviour now, but this should at least be documented.

github-actions[bot] commented 3 years ago

Thank you for your report, the issue you have reported has just been fixed.