mozilla / pontoon

Mozilla's Localization Platform
https://pontoon.mozilla.org
BSD 3-Clause "New" or "Revised" License

Export terminology TBX file that includes translations from multiple locales #2574

Closed bcolsson closed 2 years ago

bcolsson commented 2 years ago

Currently, Pontoon can only export a terminology TBX file for a single locale at a time (example: https://pontoon.mozilla.org/ja/terminology/).

In order to make importing/updating terminology on other platforms for multiple locales scalable, we need the ability to export terminology translations for multiple locales in a single file. (Example)

mathjazz commented 2 years ago

Do we want Terminology of all locales to be included in the exported TBX file?

If not, one way forward would be to write a script outside the Pontoon codebase that would download TBX files for all required locales and create a multi-locale TBX file out of them.

@bcolsson Thoughts?

bcolsson commented 2 years ago

@mathjazz I think that's probably the best way forward. Also means the script wouldn't be dependent on you. I'll start working on that. From what I can tell the script would:

  1. Send request to /terminology/[locale].tbx for each locale
  2. Parse the response (using an XML parser? If there's a better parser, let me know)
  3. Merge locales
  4. Save as tbx
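
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the script that was eventually written: the URL template simply follows the path given in step 1, the element layout (`martif/text/body/termEntry/langSet`) is an assumption about Pontoon's TBX export, and the function names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Assumption: path taken from step 1 above; adjust to the real endpoint.
URL_TEMPLATE = "https://pontoon.mozilla.org/terminology/{locale}.tbx"


def fetch_tbx(locale, url_template=URL_TEMPLATE):
    """Step 1: download the TBX export for one locale as bytes."""
    with urllib.request.urlopen(url_template.format(locale=locale)) as response:
        return response.read()


def merge_tbx(documents):
    """Steps 2-3: parse each document and merge them by termEntry id.

    The first document is used as the base. For entries that already
    exist, only langSet elements for languages not yet present are added.
    """
    roots = [ET.fromstring(document) for document in documents]
    merged = roots[0]
    body = merged.find("./text/body")  # assumed TBX layout
    entries = {entry.get("id"): entry for entry in body.findall("termEntry")}

    for root in roots[1:]:
        for entry in root.findall("./text/body/termEntry"):
            target = entries.get(entry.get("id"))
            if target is None:
                # Entry not seen before: copy it over whole.
                body.append(entry)
                entries[entry.get("id")] = entry
                continue
            seen = {ls.get(XML_LANG) for ls in target.findall("langSet")}
            for lang_set in entry.findall("langSet"):
                if lang_set.get(XML_LANG) not in seen:
                    target.append(lang_set)
    return merged


def build_multilocale_tbx(locales, path):
    """Step 4: fetch, merge, and save. ElementTree drops the DOCTYPE line."""
    merged = merge_tbx([fetch_tbx(locale) for locale in locales])
    ET.ElementTree(merged).write(path, encoding="utf-8", xml_declaration=True)
```

Under these assumptions, calling `build_multilocale_tbx(["de", "fr"], "merged.tbx")` would download both exports and write a single file in which each term entry carries the source language plus both translations.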

flodolo commented 2 years ago

This repository can probably give you some idea (it was done for TMX, but I don't know if anyone ever used it, so we might just stop the action and archive it…): https://github.com/mozilla-l10n/mt-training-data/tree/main/.github

bcolsson commented 2 years ago

Thanks! That'll help with the downloading part. I've got merging of multiple Pontoon files working. https://github.com/bcolsson/scripts/tree/tbx_merge/Pontoon/tbx_merge

The awkward thing is that Smartling won't support IDs that aren't generated at import, so I can't use Pontoon's IDs. I can extract everything, remove the IDs, and start a new glossary easily, but updating an existing Smartling glossary will require assigning the UIDs from Smartling to terms that have already been uploaded.
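
As a rough sketch of both cases, the snippet below removes the `id` attribute from every `termEntry` and, for updates, writes IDs back from a lookup table. Everything here is an assumption made for illustration: the function names are hypothetical, the `{source term: UID}` mapping is presumed to be prepared separately (for example from a glossary export), and matching on the `en-US` source term is just one possible key.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def strip_entry_ids(root):
    """Remove the id attribute from every termEntry; return how many were removed."""
    removed = 0
    for entry in root.iter("termEntry"):
        if entry.attrib.pop("id", None) is not None:
            removed += 1
    return removed


def assign_entry_ids(root, uid_by_term, source_lang="en-US"):
    """Set termEntry ids from a {source term: UID} mapping.

    Entries without a match are left without an id so they can be
    imported as new terms. Returns the list of unmatched source terms.
    """
    unmatched = []
    for entry in root.iter("termEntry"):
        term = None
        for lang_set in entry.findall("langSet"):
            if lang_set.get(XML_LANG) == source_lang:
                term = lang_set.findtext(".//term")
                break
        if term in uid_by_term:
            entry.set("id", uid_by_term[term])
        else:
            entry.attrib.pop("id", None)
            unmatched.append(term)
    return unmatched
```

Keying on the source term keeps the sketch short, but it would misassign IDs if two entries share the same source term, so a real update path would likely need a stricter key.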

bcolsson commented 2 years ago

@flodolo / @mathjazz

Finished. Script can be found here.

If you have the bandwidth, I'd appreciate a quick sanity check, but I've got it working in basic tests importing into Smartling.

mathjazz commented 2 years ago

Thanks, @bcolsson! I'll have a look.

Since we all have Pontoon scripts in our personal repos, I suggest we identify ones that should rather live in a common place like https://github.com/mozilla-l10n/pontoon-scripts (not created yet) or even https://github.com/mozilla/pontoon (for some).

This script is definitely a good candidate to move to such a place.

flodolo commented 2 years ago

Great job 👍🏼

I left a few minor comments here. In general it seems like a good approach; I only wonder if the XMLCombiner class is too complex for our use case (not that it hurts, unless something breaks).

bcolsson commented 2 years ago

Closing as completed since we've got a working script (with review comments reflected) and a place to keep scripts like this: https://github.com/mozilla-l10n/pontoon-scripts.