Closed hsluoyz closed 4 years ago
My understanding is that the workflow software we're adopting will commit changes back to the project. Even if translation tweaks are modified outside of git, one of us still has to commit it, right?
What I mean is that direct modifications to the locale files are one of the most frequent tweaks made by end users. (We do have the custom locale plugin to help avoid this, but it's not universally used, nor very well polished.)
Ahh..... hmm. What about a cached file that could be cleared through the admin like template/css cache?
A PHP data cache would be fine, much as we already have for locale XML; the existing data reset tool would work for that as well. It would be a change in expectations, though. If we synchronize it with the move to XLIFF, it might be something we could tuck into a new workflow without needing a second round of disruptions...
Files don't necessarily have to be combined into one in order to load them automatically -- either at once or on-demand. For example, an index could link keys to files and they could be loaded when an unloaded key is requested.
Thanks Nate. I hadn't thought of this. :+1:
My understanding is that the workflow software we're adopting will commit changes back to the project. Even if translation tweaks are modified outside of git, one of us still has to commit it, right?
To clarify this: Yes, "commit back/forward to/from project" is one of the main goals the translation server needs to accomplish automatically, but... what else can be "done outside git"? I mean, changes will be made either in the translation server (by translators) or in git (by developers), won't they?
A PHP data cache would be fine, much as we already have for locale XML; the existing data reset tool would work for that as well. It would be a change in expectations, though. If we synchronize it with the move to XLIFF, it might be something we could tuck into a new workflow without needing a second round of disruptions...
Sorry... I'm completely lost here. @asmecher do you think you can simplify this for dummies? If it's not possible... no worries. I can live perfectly well without knowing about this. :-)
The remaining problems will probably be resolved by adopting crowdin et al.
Thanks @jonasraoni for your "two cents". They make sense to me but let's see what dev guys say. :-)
Just a comment about crowdin: I pointed out license issues here, which is why I encourage using weblate instead.
Finally... @asmecher, in parallel to the deep-dev discussion, do you think it's safe to go with "stage 1"?
If it's OK with you, @MarcRiera and I planned to start working on this during September, so I hope we can clarify those questions and do the weblate configuration/testing during this month.
If we can set up the server, plus the code you wrote to make OJS understand XLIFF... it seems feasible to announce the translation server at PKPBCN19, doesn't it?
do you think it's safe to go with "stage 1"?
I think it's safe to go ahead with the stage 1 proposal as Alec has described it (https://github.com/pkp/pkp-lib/issues/4779#issuecomment-524495563). All of our discussions are about how to improve things beyond stage 1.
do you think you can simplify this for dummies?
We're talking about how to automatically build a file that will tell us where to look for translations. So, for example, when the code hits __('my.locale.string'), the application will know which locale file to parse and load. That way we don't have to load every translation file every time (which is not performant), but we also don't have to manually load the correct one (which is prone to mistakes).
The solution we're discussing regarding a PHP data cache is similar to how CSS and Smarty (.tpl) files are built and cached, because rebuilding them for every page load is not performant.
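The index idea described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the application's PHP, and it is not the project's implementation: the class and function names are hypothetical, JSON files stand in for the real locale files, and the `##key##` marker for unknown keys is just a placeholder chosen for this sketch.

```python
import json
from pathlib import Path


def build_index(paths):
    """Scan every locale file once and record which file defines each key.

    In a real application this result would be cached (for example by a
    data cache that an admin can reset) instead of being rebuilt per request.
    """
    index = {}
    for path in paths:
        # Assumption: each stand-in locale file is a flat JSON object of key -> text.
        for key in json.loads(Path(path).read_text(encoding="utf-8")):
            index[key] = str(path)
    return index


class LazyLocale:
    """Translate keys, parsing a locale file only when one of its keys is requested."""

    def __init__(self, index):
        self.index = index      # locale key -> path of the file that defines it
        self.loaded = {}        # path -> translations parsed so far
        self.parse_count = 0    # number of files actually parsed

    def translate(self, key):
        path = self.index.get(key)
        if path is None:
            return f"##{key}##"  # hypothetical marker for an unknown key
        if path not in self.loaded:
            self.loaded[path] = json.loads(Path(path).read_text(encoding="utf-8"))
            self.parse_count += 1
        return self.loaded[path].get(key, f"##{key}##")
```

With this shape, requesting a key from one file never parses the others, and nobody has to remember to load the right file by hand; the cost is that the index must be rebuilt whenever locale files change.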
If it's OK with you, @MarcRiera and I planned to start working on this during September, so I hope we can clarify those questions and do the weblate configuration/testing during this month.
@marcbria, the plan outlined in the 2 stages removes the need for us to check whether Weblate works with symbolic keys -- I think we're OK on that front. But if you could run a quick Weblate test with e.g. the French XLIFF samples I've generated, that would be excellent -- the only thing we need to ensure is that Weblate will preserve the <unit id="..."> attribute while editing.
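One way to run that check is to compare the unit IDs of a file before and after it has passed through the editor. The sketch below is only an illustration under stated assumptions: it assumes XLIFF 2.0 documents (namespace `urn:oasis:names:tc:xliff:document:2.0`), the sample strings and IDs are invented, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# XLIFF 2.0 core namespace, in ElementTree's {namespace}tag notation.
XLIFF_NS = "{urn:oasis:names:tc:xliff:document:2.0}"


def unit_ids(xliff_text):
    """Return the id attribute of every <unit> element, in document order."""
    root = ET.fromstring(xliff_text)
    return [unit.get("id") for unit in root.iter(f"{XLIFF_NS}unit")]


def lost_unit_ids(before, after):
    """IDs present before editing that can no longer be found afterwards."""
    return set(unit_ids(before)) - set(unit_ids(after))


# Invented sample: two symbolic locale keys used as unit IDs.
BEFORE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en-US" trgLang="fr-CA">
  <file id="f1">
    <unit id="common.save">
      <segment><source>Save</source><target>Enregistrer</target></segment>
    </unit>
    <unit id="common.cancel">
      <segment><source>Cancel</source><target>Annuler</target></segment>
    </unit>
  </file>
</xliff>"""

# Simulate an edit that changes only a translation, not the IDs.
AFTER = BEFORE.replace("Enregistrer", "Sauvegarder")

print(lost_unit_ids(BEFORE, AFTER))
```

An empty result means every symbolic key survived the round trip; any ID in the result is a key the application would no longer be able to look up.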
We're talking about how to automatically build a file that will tell us where to look for translations. So, for example, when the code hits __('my.locale.string'), the application will know which locale file to parse and load. That way we don't have to load every translation file every time (which is not performant), but we also don't have to manually load the correct one (which is prone to mistakes).
Muuuch more clear now. Thanks @NateWr. ;-)
the plan outlined in the 2 stages removes the need for us to check whether Weblate works with symbolic keys -- I think we're OK on that front.
But are we completely sure about moving to PO (stage2)? Even nextcloud (IMHO one of the best php developments ever) is avoiding PO and is going to JSON.
And yes... XLIFF was originally intended for "transport", but in our case it would mean a "minimal" change from our native XML format; other projects started this way, and we can talk with the OASIS XLIFF fellow to ask them to encourage people to go this way.
If you three (@asmecher , @NateWr and @ctgraham) are sure about this I won't ask again, but I can't stop thinking we are moving in the wrong direction.
But if you could run a quick Weblate test with e.g. the French XLIFF samples I've generated, that would be excellent -- the only thing we need to ensure is that Weblate will preserve the <unit id="..."> attribute while editing.
Thank you both. We can do this test, but I think @MarcRiera did the job before and said the key is preserved.
About weblate, I was concerned about all other features we pointed as a requirement (push/pull, workflows, permissions/roles, glossaries, translation memories... that we wrote somewhere but I can't find it now. @mtub do you have a copy somewhere?)
If we can set up the server, plus the code you wrote to make OJS understand XLIFF... it seems feasible to announce the translation server at PKPBCN19, doesn't it?
You probably intentionally skipped answering this question? ;-)
I found the notes Marco took in Heidelberg: slides.html.txt
And here we made a comparison between weblate and transifex (which includes the requirements): https://docs.google.com/spreadsheets/d/1rSp350oJEEb6PYOfjpMzQnlTNGOH_UiJDpZU2GbXWbE/edit#gid=2143124171
If we can set up the server, plus the code you wrote to make OJS understand XLIFF... it seems feasible to announce the translation server at PKPBCN19, doesn't it?
You probably intentionally skipped answering this question? ;-)
Yup, I'm planning to include some XLIFF-compatible tweaks that interested parties can experiment with for the 3.2 release.
Gettext library feature add for supporting XLIFF unit IDs is now merged: https://github.com/oscarotero/Gettext/pull/221#event-2634582256
Another blocker, unfortunately :) https://github.com/oscarotero/Gettext/issues/224
Latest update:
I'm tinkering with both using Weblate because Weblate manages both monolingual (symbolic locale keys) and bilingual (main language in source code, mapping from there to secondary languages) modes for both file formats. (See https://docs.weblate.org/en/latest/formats.html for the list.)
Command line to batch-convert XLIFF:
for locale in `ls locale`; do for file in `fgrep -l locale.dtd locale/$locale/*.xml | cut -d "." -f 1`; do php lib/pkp/tools/xmlToXliff.php `echo $file.xml | sed -e "s/$locale/en_US/"` $file.xml $file.xlf; done; done
for locale in `ls lib/pkp/locale`; do for file in `fgrep -l locale.dtd lib/pkp/locale/$locale/*.xml | cut -d "." -f 1`; do php lib/pkp/tools/xmlToXliff.php `echo $file.xml | sed -e "s/$locale/en_US/"` $file.xml $file.xlf; done; done
I'm still favouring XLIFF because our XLIFF files are bog-standard, rather than stepping outside the spec, as monolingual PO files do (even if they're used in some projects in practice).
Yikes, it looks like Weblate may not support XLIFF 2.0!
Commands to batch-convert XML to PO:
for locale in `ls locale`; do for file in `fgrep -l locale.dtd locale/$locale/*.xml | cut -d "." -f 1`; do php lib/pkp/tools/xmlToPo.php $file.xml $file.po; done; done
for locale in `ls lib/pkp/locale`; do for file in `fgrep -l locale.dtd lib/pkp/locale/$locale/*.xml | cut -d "." -f 1`; do php lib/pkp/tools/xmlToPo.php $file.xml $file.po; done; done
A teaser :)
@asmecher this is impressive!! It's almost finished! Is it too crazy to think that we will be able to do the OJS 3.2 es_ES and ca_ES translations on weblate? :-)
I didn't find time to contact @MarcRiera and make the tests we promised. I will be a little more relaxed at the end of the month...
Cheers, m.
@marcbria wrote:
I didn't find time to contact @MarcRiera and make the tests we promised.
Because https://github.com/oscarotero/Gettext works with XLIFF 2.0, but Weblate seems to only work with XLIFF 1.2, I've chosen (at least for now) to focus on "monolingual PO" as our chosen format. So if you were planning to experiment with the sample XLIFF, I'd suggest holding off on that for now. Here are some sample PO files -- Weblate appears to work well with them in monolingual mode.
I missed this one: :-(
Yikes, it looks like Weblate may not support XLIFF 2.0!
So this other one made me think you were now focused on XLIFF and had succeeded with it:
I'm still favouring XLIFF because our XLIFF files are bog-standard, rather than stepping outside the spec, as monolingual PO files do (even if they're used in some projects in practice).
I just asked on the weblate GitHub whether they are planning to support XLIFF 2.0 anytime soon.
Otherwise, I'm unsure about the options we have here: a) move directly to PO. b) look for a different free software translation server. c) find how to downgrade to 1.2. d) ...
Looking into the differences between the two XLIFF specifications, the downgrade (c) will be complex. We looked hard but [1] didn't find any good free-software alternative (b) to do the job... so, does that mean we need to go with (a)?
Please Alec, let us know if we can help with something.
[1] mojito looks promising and supports xliff 2.0, but it's still very simple compared to weblate.
I just asked on the weblate GitHub whether they are planning to support XLIFF 2.0 anytime soon.
Here's an already-open issue for XLIFF 2.0 support in Weblate: https://github.com/WeblateOrg/weblate/issues/972
I'm OK to go with a) move directly to PO, as long as everyone understands that we're going to be using monolingual PO files rather than bilingual PO files. This is not how PO files were initially intended, but there are projects that use them this way, and Weblate includes support for it.
Please Alec, let us know if we can help with something.
Yes, if it's possible to start putting together a production-capable Weblate install for us to use, that would be very helpful :)
We have multiple options here:
Which do you like best? If we go with a docker approach and later decide to move from one place to another, the migration should be trivial.
Cheers, m.
Ok... I couldn't resist the temptation. Server with last weblate version is up and running at: http://revistes.uab.es:8081
I'm sending you the login credentials by mail, along with some notes about the docker configuration. We still need to set up the git push/pull feature (in confidence, I have no idea how it is supposed to work), but we can worry about that later, can't we?
BTW, if everything is as advanced as you show, I offer myself and my team as guinea pigs to make the es_ES and the ca_ES OJS 3.2 translations over the brand new server.
See you soon in Barcelona, m.
BTW, it looks like XLIFF 2.0 is not widely implemented and is not backwards compatible with XLIFF 1.2, so the PO solution is the more standard one.
I like XLIFF a lot (I'm still surprised it is used only as a "transport" format and only a few use it natively), but the fact is that only a few CAT tools support XLIFF 2.0, so IMHO it won't be a good idea to work with XLIFF 2.0 if our translators can't use their favourite external tools.
I missed one question you made in a former post:
I'm OK to go with a) move directly to PO, as long as everyone understands that we're going to be using monolingual PO files rather than bilingual PO files. This is not how PO files were initially intended, but there are projects that use them this way, and Weblate includes support for it.
I think we are fine with this (as you said, some projects work in this way), but let me ask @MarcRiera if it's a safe road.
@NateWr and I discussed how to stage this out and roughly decided:
@NateWr, for step 1, could you look at... https://github.com/pkp/ojs/pull/2479 https://github.com/pkp/pkp-lib/pull/5107
(Obviously I'll generate PRs for OMP and probably PPS once we're ready for a merge.)
This looks great, with a remarkably small impact on the codebase outside of the locale files. :+1:
One question I had was how editing will work during development. Will I modify the en_US po files myself, similar to how it's done now with the XML files? Or do these need to be generated from something?
Also, is there any tooling (po, gettext, weblate, etc) that will automatically identify changed/removed en_US strings, so I don't have to delete these from other locales when committing changes?
Translation round for 3.2 using weblate! @asmecher if we manage to do it before PKPBCN19 I will pay for all the beers you can drink after the sprint (not before, because during the sprint we still need your brain). ;-)
Any update?
Server installed. Working on configuration. Code updated. Testing soon. Goal? Translate OJS 3.2 with weblate. Follow this thread for detailed info: https://github.com/pkp/ojs/pull/2479
Notes to self: On using import_json to create translation components...
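<?php
/**
 * Build Weblate component definitions (as JSON) from a list of en_US .po files.
 * Usage: php tmp.php <repository URL> <file.po> [<file.po> ...]
 */
$output = [];
array_shift($argv); // Take PHP script filename off the top
$repo = array_shift($argv); // First real argument is the repository URL
foreach ($argv as $arg) {
    $pieces = explode('/', $arg);
    if (($index = array_search('plugins', $pieces)) !== false) {
        // Plugin locale file: build the slug from plugin category and name
        $slug = $pieces[$index + 1] . '-' . $pieces[$index + 2];
    } else {
        // Application locale file: use the filename without its extension
        $slug = str_replace('.po', '', array_pop($pieces));
    }
    $output[] = [
        'slug' => $slug,
        'name' => $slug,
        'file_format' => 'po-mono',
        // Weblate expects "*" in place of the language code in the file mask
        'filemask' => str_replace('en_US', '*', $arg),
        'vcs' => 'git',
        'repo' => $repo,
        'branch' => 'master',
        'template' => $arg, // en_US file is the monolingual base
        'license' => 'GNU General Public License v2',
    ];
}
echo json_encode($output);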
2. Execute this tool, saving the results to a JSON file:
git ls-files *.po | grep en_US | xargs php tmp.php https://github.com/pkp/ojs > components.json
3. In the Weblate install, paste this into a local file and run the Weblate tool on it:
weblate import_json /tmp/tmp.json --project "open-journal-systems" --ignore
4. Wait.
Also, is there any tooling (po, gettext, weblate, etc) that will automatically identify changed/removed en_US strings, so I don't have to delete these from other locales when committing changes?
The support for monolingual Gettext in Weblate includes the "needs editing" status (in addition to "untranslated" and "translated"). See right-most column in the table of "Translation types capabilities" https://docs.weblate.org/en/latest/formats.html#translation-types-capabilities
This is super important to keep translations consistent with changes in the base English version: https://docs.weblate.org/en/latest/workflows.html#translation-states
The only requirement is setting en_US as monolingual base language file in Weblate:
For correct use of monolingual files, Weblate requires access to a file containing complete list of strings to translate with their source - this file is called Monolingual base language file within Weblate, though the naming might vary in your application.
https://docs.weblate.org/en/latest/formats.html#bilingual-and-monolingual-formats
Any update? Is there any chance that we can start to translate the .PO files first? I think it is compatible with the later Weblate?
@veotax, what translation are you interested in working on?
Chinese (zh_CN). Is it ready to be translated now? And what project (ojs or pkp-lib) and what branch (master or stable-3_1_2) should I work on and send PR?
Can I copy the .po files from another folder like en_US to zh_CN and then translate the words? Or is there another process?
@veotax, excellent! The 3.1.2-x releases will continue to be in our old .xml format, but version 3.2 and onward will use monolingual .po files (supported by Weblate). I would recommend targeting 3.2 (due for release early next year) and using .po. I have converted only selected languages in the master branch to .po, but if you're ready to begin working with Chinese in that format, I can convert it as well. Just let me know! There is already an existing zh_CN translation, it just needs to be updated.
So I should use master branch.
What do you mean by "I can convert it as well"? Do you mean you have a tool/script to generate .po files from existing .xml files? If yes, please do it. The original zh_CN .xml files are already missing some words (untranslated words in the UI), so I think the generated .po files will be missing them too, right? So I need to translate them.
@veotax, I've just converted the .xml files over to .po for the zh_CN locale. (There's a tool for this in lib/pkp/tools/xmlToPo.php in the master branch, which will be released as OJS 3.2.)
For the .po files to work, you'll need a full checkout of the master branch, rather than an existing OJS 3.1.2.x release.
The commits with the file conversions are here: https://github.com/pkp/ojs/commit/831f4a386ef56ec68e407bd0eef42f108af64c5f https://github.com/pkp/pkp-lib/commit/57ccd97f7c42c9e31c8061be30a3921352a8f565
@asmecher: should emailTemplates.xml be converted to PO, too?
@fgnievinski, no, that's a different XML dialect; I'm still considering what best to do with that. It probably makes the most sense to convert it to .po as well, but we would need a mechanism to link email keys (e.g. NOTIFICATION) with email body, subject, and description for each language.
@asmecher check your mail and confirm weblate server is working, please. ;-)
@marcbria, check Slack :) Too many venues!
@asmecher still about emailTemplates.xml, how about replacing everything between <email_text key="NOTIFICATION"> and </email_text> with {translate key="email_text_key_NOTIFICATION"}? The HTML tags can be left inside the localization text; we translators are used to dealing with those.
@fgnievinski, we may well end up doing something like that. Thanks for the suggestion!
@asmecher is there any way to list all un-translated words together in the PO file? I found the converted zh_CN PO files (https://github.com/pkp/pkp-lib/commit/57ccd97f7c42c9e31c8061be30a3921352a8f565) don't contain all the words. Currently, I have to copy each untranslated keyword from web UI (sometimes uncopiable) to PO and translate it. It's too slow.
@veotax, Weblate will help with that (and I suspect other translation tools capable of working with monolingual PO files as well). They'll do that by fetching the full list of locale keys from the English locale files, then comparing them with your translation to determine what's missing.
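Until such a tool is available, the comparison described above can be approximated by hand. The following is a minimal Python sketch, not how Weblate itself does it: the PO reader is deliberately tiny (no plurals, no msgctxt), the sample entries are invented, and the function names are hypothetical.

```python
import ast


def _unquote(token):
    # PO strings use C-style escapes; literal_eval covers the common ones.
    return ast.literal_eval(token)


def parse_po(text):
    """Read a monolingual PO file into {msgid (locale key): msgstr}."""
    entries, fields, current = {}, {}, None

    def flush():
        if fields.get("msgid"):  # an empty msgid is the header entry: skip it
            entries[fields["msgid"]] = fields.get("msgstr", "")
        fields.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # comments and flags are ignored in this sketch
        if not line:
            flush()
            current = None
        elif line.startswith("msgid "):
            flush()
            current = "msgid"
            fields[current] = _unquote(line[len("msgid "):])
        elif line.startswith("msgstr "):
            current = "msgstr"
            fields[current] = _unquote(line[len("msgstr "):])
        elif line.startswith('"') and current:
            fields[current] += _unquote(line)  # continuation line
    flush()
    return entries


def missing_keys(base_po, translated_po):
    """Keys defined in the base (English) file with no translation yet."""
    translated = parse_po(translated_po)
    return [key for key in parse_po(base_po) if not translated.get(key)]


# Invented samples; real files would be read from locale/en_US and locale/zh_CN.
EN_PO = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "common.save"
msgstr "Save"

msgid "common.cancel"
msgstr "Cancel"

msgid "common.close"
msgstr "Close"
'''

ZH_PO = r'''
msgid "common.save"
msgstr "保存"

msgid "common.cancel"
msgstr ""
'''

print(missing_keys(EN_PO, ZH_PO))
```

Keys that are absent from the translation and keys present with an empty msgstr are both reported, since either way the translator still has work to do.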
Thanks. So before our official Weblate is online, can you recommend some tool (local or web-based) that I can use to start to translate painlessly?
@veotax, our XML-based translation toolset used to do this, and I'm sure Weblate does, but I haven't tried other tools.
If you are working over the OJS native xml format, the only tool is the OJS translation plugin.
If you are working in the new PO files, you have plenty of them. I suggest you two:
Here you have an article with a list of the most usual ones:
It's still early and some research needs to be done, but I'm planning to encourage my translators to work offline when they are working on big translations. Weblate will also be great, but when you are doing a looong job, the web lag will kill your patience. Desktop tools include more features, are faster, and when you finish you will (hopefully) be able to upload the results to weblate.
Converting email templates to the PO format!
PRs:
This preserves the old XML format (locale/en_US/emailTemplates.xml), but replaces the (localized) contents with {translate ...} calls, e.g.:
<email_text key="NOTIFICATION">
<subject>{translate key="emails.notification.subject"}</subject>
<body>{translate key="emails.notification.body"}</body>
<description>{translate key="emails.notification.description"}</description>
</email_text>
Then the translations themselves come from a new PO file, e.g. locale/en_US/emails.po for English.
There's a new conversion tool to help with this in lib/pkp/tools/xmlEmailsToPo.php. It generates the new PO file and changes the old files over to {translate ...} calls.
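To make the mechanism concrete, here is a rough Python sketch of how {translate key="..."} calls could be resolved against translations loaded from a PO file. It is an illustration only: in the application this substitution is handled by the PHP code, the function name and the sample strings are hypothetical, and the `##key##` marker for unknown keys is a placeholder chosen for this sketch.

```python
import re

# Matches calls of the form {translate key="some.locale.key"}.
TRANSLATE_CALL = re.compile(r'\{translate key="([^"]+)"\}')


def render(template, translations):
    """Replace every {translate key="..."} call with its translated text."""

    def lookup(match):
        key = match.group(1)
        # Unknown keys are left visible so missing translations are easy to spot.
        return translations.get(key, f"##{key}##")

    return TRANSLATE_CALL.sub(lookup, template)


# Invented English strings standing in for the contents of emails.po.
EMAILS_EN = {
    "emails.notification.subject": "New notification",
    "emails.notification.body": "You have a new notification.",
}

TEMPLATE = (
    '<subject>{translate key="emails.notification.subject"}</subject>\n'
    '<body>{translate key="emails.notification.body"}</body>'
)

print(render(TEMPLATE, EMAILS_EN))
```

The XML file then only has to be maintained once, while each language contributes nothing but its own PO file.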
@marcbria, could you take a quick look? Does this seem like a workable approach? If so, I can merge and set up a new "Emails" component in Weblate.
Currently, all translations are done in XML files, as mentioned in https://github.com/pkp/pkp-lib/issues/4029#issuecomment-417907420, which is very inefficient for translators, who have to translate or sync a few items across dozens of XML files.
Is there any chance to use a more advanced online translation platform like https://crowdin.com/ ? In crowdin, translators do all the translation in a web browser and don't need to track which words have not been translated yet. The translations are deployed automatically with a new git commit. Can we consider it?