php-gettext / Gettext

PHP library to collect and manipulate gettext (.po, .mo, .php, .json, etc)
MIT License
687 stars 134 forks source link

Add support for XLIFF unit IDs #221

Closed asmecher closed 4 years ago

asmecher commented 5 years ago

This is the "least-invasive" way to add unit IDs to the existing toolset, without potentially introducing problems with uniqueness (XLIFF doesn't require uniqueness of IDs in any way).

oscarotero commented 5 years ago

Hi, thanks for your contribution.

XLIFF doesn't require uniqueness of IDs in any way

Does it means a translator can returns multiple translations for the same id? What is the point?

asmecher commented 5 years ago

What is the point?

:) I agree -- I like my IDs unique. But check in 4.3.1.21 at http://docs.oasis-open.org/xliff/xliff-core/v2.0/os/xliff-core-v2.0-os.html#d0e4784:

When used in elements:

The value MUST be unique among all id attribute values within the enclosing element.

So by my read, other non-<unit> elements can have the same ID as a <unit> element without problems (no issue for the Gettext library, I don't think), but also multiple <unit> elements can share the same ID as long as they appear in different <file> containers (potentially in the same XLIFF file).

oscarotero commented 5 years ago

In that case, I guess a <file> is the equivalent of a domain in gettext 🤔 I have to think more about this but I'm on vacation right now 🍹

Please, give some time and I let you know. Thanks!

asmecher commented 5 years ago

In that case, I guess a is the equivalent of a domain in gettext thinking

I think <group> is a closer analog for gettext's domains. From http://docs.oasis-open.org/xliff/v1.2/xliff-profile-po/xliff-profile-po-1.2-cd02.html#s.detailed_mapping.domain:

If multiple domains are present in a PO file, it is recommended to group each domain in a element with the restype attribute set to x-gettext-domain and the resname attribute set to the name of the domain.

asmecher commented 5 years ago

...and of course, obviously, enjoy your vacation! This can wait until you're back.

asmecher commented 5 years ago

Hi @oscarotero! Got any feedback on whether/how we could add XLIFF group IDs non-disruptively to Gettext?

oscarotero commented 5 years ago

Hi, @asmecher

I don't like to add a new different Translation for this use case. One of the purpose of this library and all its extractors/generators is to be the more interoperative possible. And this new unitId is preserved only in Xliff contexts. For example, if I have a .xliff file and convert it to .po, the unitId info is lossed, so if I want reverse the operation and convert the .po to .xliff, the result will be different to the initial .xliff file.

All formats should be compatible with gettext and specifically, the .po format in some way, in order to preserve all data on import/export to other formats. (The name of this library is Gettext, so we should give priority to it :)

Due UnitId is not equivalent to msgid, because it's not required to be unique, a solution could be save the value in a comment. For example:

$translation->addComment("XLIFF_UNIT_ID: {$unitId}");

So, in a .po file, it'd be saved in this way:

# XLIFF_UNIT_ID: foo
msgid "Foo"
msgstr "Bar"

And the Xliff generator could search the UnitId in the comments or generate one automatically if it hasn't found it.

What do you think?

asmecher commented 4 years ago

Thanks for the feedback, @oscarotero. I've changed the implementation of the feature to suit what you've suggested; if that looks good, I'll add test coverage for it too.

One remaining question: do you foresee a function on the Translation class to permit easy access to the unit ID, or would calling code need to pile through the comments looking for a matching one? (If the latter, it would at least be nice to expose a regexp for matching them.)

oscarotero commented 4 years ago

@asmecher Thanks! I wouldn't extend Translation with a function that only makes sense in Xliff. What you can do is a static public function to get the unitID, so it can be tested and used in other places. For example:

//Get the unitID if exists
$unitID = Gettext\Generators\Xlif::getUnitID($translation);
asmecher commented 4 years ago

Thanks, @oscarotero -- I've added some changes per your comments above, plus a test case for XLIFF showing correct reading of the custom unit IDs and their preservation into PO.

asmecher commented 4 years ago

Excellent, thanks, @oscarotero! Do you have a date in mind for a release that includes this?