php-gettext / Gettext

PHP library to collect and manipulate gettext (.po, .mo, .php, .json, etc)
MIT License

Expose XLIFF unit IDs in Translation objects #220

Closed: asmecher closed this issue 4 years ago

asmecher commented 5 years ago

XLIFF supports unit IDs:

<unit id="my-unit-identifier">
 <segment>
  <source>English text</source>
  <target>Translated text</target>
 </segment>
</unit>

However, the XLIFF parser doesn't capture or store these IDs in the Translation object.

Is it possible to add support for representing unit IDs in the Translation class?
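
To make the request concrete, here is a minimal sketch of reading unit IDs from an XLIFF 2.0 document with PHP's built-in DOM extension. It does not use this library's extractor; the sample document, the `$units` array shape, and the variable names are assumptions for illustration only.

```php
<?php
// Sample XLIFF 2.0 document; the unit ID and texts are illustrative.
$xliff = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en" trgLang="es">
 <file id="f1">
  <unit id="my-unit-identifier">
   <segment>
    <source>English text</source>
    <target>Translated text</target>
   </segment>
  </unit>
 </file>
</xliff>
XML;

// XLIFF 2.0 elements live in this namespace.
$ns = 'urn:oasis:names:tc:xliff:document:2.0';

$doc = new DOMDocument();
$doc->loadXML($xliff);

// Collect id, source and target for every <unit>.
$units = [];
foreach ($doc->getElementsByTagNameNS($ns, 'unit') as $unit) {
    $source = $unit->getElementsByTagNameNS($ns, 'source')->item(0);
    $target = $unit->getElementsByTagNameNS($ns, 'target')->item(0);
    $units[] = [
        'id'     => $unit->getAttribute('id'),
        'source' => $source ? $source->textContent : '',
        'target' => $target ? $target->textContent : '',
    ];
}

print_r($units);
```

The point of the sketch is only that the `id` attribute is readily available at parse time; where it should be stored in the library's model is the open question.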

asmecher commented 5 years ago

I'd be happy to suggest a PR for this, but I'm not sure what the Translation object and the rest of this library's model expect for ID uniqueness!

oscarotero commented 5 years ago

Hi, @asmecher

This library is heavily inspired by the gettext model, so the uniqueness of each translation is defined by the context + original text (here: https://github.com/oscarotero/Gettext/blob/master/src/Translation.php#L79). Currently, in XLIFF, the id attribute of each unit is generated automatically as an MD5 hash of the context and source text (https://github.com/oscarotero/Gettext/blob/master/src/Generators/Xliff.php#L38), but you're right, the extractor does not store this value.
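
As a rough illustration of that identity model, the sketch below derives a key from context + original text and hashes it with MD5. The function names `translationKey` and `xliffUnitId` are hypothetical, and the exact string the library feeds to `md5()` is an assumption; the `"\004"` separator is the one gettext itself uses between context and message ID in compiled catalogs.

```php
<?php
// Hypothetical helper: identity key built from context and original text.
// "\004" (EOT) is the separator gettext uses between msgctxt and msgid.
function translationKey(string $context, string $original): string
{
    return $context . "\004" . $original;
}

// Hypothetical helper: MD5-based unit ID. The exact input the library
// hashes is an assumption here; only the general idea is shown.
function xliffUnitId(string $context, string $original): string
{
    return md5(translationKey($context, $original));
}

// The same original text in two contexts yields two distinct IDs.
$noContext = xliffUnitId('', 'Open');
$menu      = xliffUnitId('menu', 'Open');

var_dump($noContext === $menu);
var_dump(strlen($noContext));
```

Under this scheme the ID is fully determined by context and source text, so an ID read from an existing XLIFF file carries no extra information unless it is stored separately.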

What is this id for? Is it used to get the translation from the translator (for example, __('my-unit-identifier'))?

asmecher commented 5 years ago

> What is this id for?

In our software we embed a synthetic key in the source code (e.g. registration.form.error.usernameNotUnique) rather than the English text ("The specified username is already in use. Please choose another.").

So rather than changing everything so that English is embedded in the source and we map from there to, e.g., Spanish, I'm hoping we can use the unit ID attribute in XLIFF to represent the synthetic key.
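
A minimal sketch of that lookup pattern, assuming the unit IDs have already been read into a plain array: the `$catalog` array, the `translateByKey` function, and the Spanish string are all hypothetical and are not part of this library.

```php
<?php
// Hypothetical catalog built from XLIFF units: unit ID => translated text.
// The Spanish text is an illustrative translation, not taken from a real file.
$catalog = [
    'registration.form.error.usernameNotUnique' =>
        'El nombre de usuario especificado ya está en uso. Por favor, elija otro.',
];

// Hypothetical lookup: return the translation for a synthetic key,
// falling back to the key itself when no translation exists.
function translateByKey(array $catalog, string $key): string
{
    return $catalog[$key] ?? $key;
}

echo translateByKey($catalog, 'registration.form.error.usernameNotUnique'), "\n";
echo translateByKey($catalog, 'some.missing.key'), "\n";
```

Falling back to the key keeps a missing translation visible in the UI instead of failing silently; other fallbacks (such as the English source text) would work equally well.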

asmecher commented 5 years ago

This is a fairly minimal way to add unit ID support for both reading and writing, without otherwise changing the library. It's probably not the best way; I only submit it for discussion. I'm hesitant to work with the getId/setId/generateId functions here because XLIFF doesn't require uniqueness of IDs and I suspect the rest of the library does. See https://github.com/oscarotero/Gettext/pull/221

asmecher commented 4 years ago

https://github.com/oscarotero/Gettext/pull/221 was merged so I suppose this issue can be closed, thanks!

Unfortunately there's a knock-on issue but I've just filed it separately (see link above).