mozilla / addons

☂ Umbrella repository for Mozilla Addons ✨

Integrate / Automate l10n process #2818

Closed EnTeQuAk closed 7 years ago

EnTeQuAk commented 8 years ago

Pontoon unfortunately doesn't handle any notifications for us so we somehow have to continue sending the mail to the mailing list. Maybe a cron-script running every Friday or so?

Anything I'm missing in the whole process?

EnTeQuAk commented 8 years ago

It looks like pontoon supports RSS: https://l10n.mozilla-community.org/webdashboard/?locale=fr&rss

diox commented 8 years ago

Random idea: skip last two steps, have it run automatically somewhere using a github hook, when new commits are pushed/merged.

EnTeQuAk commented 8 years ago

another random idea: get inspired by how https://greenkeeper.io/ works and does PR magic.

diox commented 8 years ago

At the very least we should compile .mo files more often, or just on deploy. Currently we need to wait for the extraction every week to show new translations that have been added during the week, so if a translator commits a translation right away using pontoon we still don't get it until the next extraction! That makes it impossible to test recent translations on dev for instance.
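A minimal sketch of what "compile on deploy" could look like, assuming the usual `<locale>/LC_MESSAGES/*.po` layout and that gettext's `msgfmt` is on the PATH. The function names and the mtime-based staleness check are illustrative assumptions, not what addons-server actually does.

```python
import subprocess
from pathlib import Path


def stale_catalogs(locale_dir):
    """Return (po, mo) path pairs whose .mo is missing or older than its .po."""
    pairs = []
    for po in sorted(Path(locale_dir).rglob("*.po")):
        mo = po.with_suffix(".mo")
        if not mo.exists() or mo.stat().st_mtime < po.stat().st_mtime:
            pairs.append((po, mo))
    return pairs


def compile_catalogs(locale_dir):
    """Compile every stale catalog with msgfmt; raises if msgfmt fails."""
    for po, mo in stale_catalogs(locale_dir):
        subprocess.run(["msgfmt", "-o", str(mo), str(po)], check=True)
```

Running `compile_catalogs("locale")` as a deploy step would pick up translations committed through Pontoon since the last extraction, without waiting for the weekly run.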

mstriemer commented 8 years ago

I think there might be a reason that we've historically extracted once a week. I have a hunch it's that some translators use git directly and do a git pull when the email goes out, translate everything and then push it up somehow. Lots of extractions might mess with that flow. I'm not sure if pontoon supports them or we used to get PRs and now nobody does this or what.

Might be worth asking Matjaz if the 1 week schedule is required.

The greenkeeper-style PRs sound interesting. If we had an idempotent make l10n command, that should be fairly easy in combination with the hub command.

EnTeQuAk commented 8 years ago

@mathjazz mind commenting on the comments above? Is it a problem to extract a lot more regularly or is that something that will mess up existing workflows for translators?

mathjazz commented 8 years ago

@EnTeQuAk I think everyone commits through Pontoon, because localizers don't have push access to this repo. We landed a pretty hefty sync optimization in Pontoon last week, so it's much faster now: https://pontoon.mozilla.org/sync/log/

Sadly, we expect AMO to be the slowest project to sync when a merge happens, because all files for all locales change and the main optimization focused on excluding unchanged files from the sync process. And AMO's PO files are huge. When can we expect the next extract/merge, so we can see how it performs?

Also, how regular is "more regularly"? On every developer commit that changes strings? Which could be dozens of times a day?

As for the translator workflow, I think it's actually ok. We're already doing it on MDN and I think we'll start practicing it even in Firefox once we switch to a single l10n repo (release, beta, aurora, nightly, esr). The only thing we need to make sure of is that for bigger, more important and more visible changes we notify localizers and give them some time to translate.

EnTeQuAk commented 8 years ago

@mathjazz I can start an extract later today so that we see how it performs.

Hmm, since every added or changed string can potentially affect all locales, I assume every locale file will change whenever a string changes, and I don't see a way around that. It should be possible, though, to exclude locale files that didn't really change (e.g. if no strings changed and only headers such as the extract date got updated; I'm pretty sure I can filter that). With header-only changes filtered out, I assume there won't be that many changes: looking through the commit log, there aren't that many commits that change translatable strings.

"More regularly": to start with some kind of automation, I'd say once a day. Extracting on every commit would result in many extractions unless we inspect the diff and filter for strings that actually changed, which could be hard, but I've never tried.

Well, we can still continue to send weekly mails with what's changed and the translated/untranslated stats if we start automating it more and more.
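The two filters discussed above (skipping header-only changes, and checking whether translatable strings actually changed) could be sketched roughly as follows. This is an assumption-laden illustration: it treats only the `POT-Creation-Date` and `PO-Revision-Date` headers as noise, uses a deliberately simplified PO parser that ignores `msgctxt` and plural forms, and all function names are made up for the example.

```python
import ast

# Header lines that change on every extract even when no string changed.
VOLATILE_HEADERS = ('"POT-Creation-Date:', '"PO-Revision-Date:')


def strip_volatile_headers(po_text):
    """Return the PO file's lines without the date headers."""
    return [
        line for line in po_text.splitlines()
        if not line.startswith(VOLATILE_HEADERS)
    ]


def has_real_changes(old_po_text, new_po_text):
    """True if the files differ in anything besides the date headers."""
    return strip_volatile_headers(old_po_text) != strip_volatile_headers(new_po_text)


def extract_msgids(po_text):
    """Collect msgids from PO/POT text (simplified: no msgctxt, no plurals)."""
    msgids = set()
    current = None  # parts of the msgid being read, or None between entries
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            # PO string literals are close enough to Python's for this sketch.
            current = [ast.literal_eval(line[len("msgid "):])]
        elif current is not None and line.startswith('"'):
            current.append(ast.literal_eval(line))  # continuation line
        elif current is not None:
            msgids.add("".join(current))
            current = None
    if current is not None:
        msgids.add("".join(current))
    msgids.discard("")  # the empty msgid is the header entry
    return msgids


def strings_changed(old_pot_text, new_pot_text):
    """True if the set of translatable strings differs between two extracts."""
    return extract_msgids(old_pot_text) != extract_msgids(new_pot_text)
```

An automated job could run a fresh extract, call `strings_changed` against the committed template, and only open a commit or PR when it returns `True`.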

mathjazz commented 8 years ago

Cool, let's wait for the next extract to happen and sync; then it'll be easier to draw any conclusions on the Pontoon side of things.

EnTeQuAk commented 8 years ago

Another thing to note: "but maybe we should try to call / customize manage.py makemessages to replace most of omg_new_l10n.sh ? I'm pretty sure it would have handled that by default." (from mozilla/addons-server#2728)

diox commented 8 years ago

More random notes:

EnTeQuAk commented 8 years ago

@mathjazz fyi, we just merged another extract to master and it looks like the sync took ~2 minutes (https://pontoon.mozilla.org/sync/log/4744/). That's not necessarily fast, but it appears to be in range with a few other bigger project syncs I see.

mathjazz commented 8 years ago

> @mathjazz fyi, we just merged another extract to master and it looks like the sync took ~2 minutes (https://pontoon.mozilla.org/sync/log/4744/) not necessarily fast though but it appears it's in range with a few other bigger project syncs I see.

That sync must have completed before you merged new changes. Let's wait for the next cycle that will start at the next full hour.

EnTeQuAk commented 8 years ago

oh, I messed up timings. alrighty :)

mathjazz commented 8 years ago

> Every extract we reformat strings to put them on one line, and it looks like every Pontoon commit does the opposite, wrapping them at some line length (probably 79 chars or so). That creates unnecessarily large diffs.

Yes, we should unify those, otherwise diffs aren't very useful. We use polib (http://polib.readthedocs.io/en/latest/_modules/polib.html) with the default wrapwidth of 78 characters. I checked most of our gettext projects (amo, sumo, input ...) and they all wrap lines at around 200 characters.
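To illustrate why mismatched wrap widths produce noisy diffs, here is a small standard-library sketch. `format_msgid` is a hypothetical, simplified serializer written for this example; it is not polib's or gettext's actual code, and real tools differ in details.

```python
import textwrap


def format_msgid(text, wrapwidth):
    """Serialize one msgid, wrapping it when it exceeds wrapwidth columns."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    if len(escaped) + len('msgid ""') <= wrapwidth:
        return ['msgid "%s"' % escaped]
    # Leave room for the two surrounding quotes on each continuation line.
    chunks = textwrap.wrap(
        escaped, wrapwidth - 2, drop_whitespace=False, break_long_words=False
    )
    return ['msgid ""'] + ['"%s"' % chunk for chunk in chunks]


text = "Add-ons let you customize Firefox " * 4

narrow = format_msgid(text, 78)   # one tool wraps at 78 columns
wide = format_msgid(text, 200)    # the other keeps the string on one line

print(len(narrow), "lines at 78 columns;", len(wide), "line at 200 columns")
```

The string is identical in both cases, yet the serialized lines differ, so every round trip between the two tools rewrites the entry. Agreeing on one width on both sides removes that churn.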

Could you let me know the full command (makemessages with all parameters) you use to extract strings?

mathjazz commented 8 years ago

47 minutes. That's not great, but also not as bad as I expected. We can definitely run this once a day.

EnTeQuAk commented 8 years ago

And this was a particularly large extract so with a bit of luck this is the worst case. Thanks for the feedback!

On Tue, May 24, 2016, at 02:59 PM, Matjaž Horvat wrote:

> 47 minutes. That's not great, but also not as bad as I expected. We can definitely run this once a day.

EnTeQuAk commented 8 years ago

@mathjazz sorry for the late reply. Regarding wrapping: we're currently extracting and compiling with our "omg_new_l10n" script. This might change, but for now the arguments are here: https://github.com/mozilla/addons-server/blob/master/locale/omg_new_l10n.sh#L23

mathjazz commented 8 years ago

> @mathjazz sorry for the late reply, regarding wrapping things we're currently extracting and compiling with our "omg_new_l10n" script, this might change but for now the arguments are here: https://github.com/mozilla/addons-server/blob/master/locale/omg_new_l10n.sh#L23

Thanks @EnTeQuAk. Pontoon should wrap PO file lines at 200 characters from now on: https://github.com/mozilla/pontoon/commit/c6ce50e8638ddb70d2b9f6b0264ba2f3faa34acd.

Let's wait for the next extract and the following Pontoon commits to see if that actually helps reduce the diff.

EnTeQuAk commented 8 years ago

Awesome, thanks a lot!

On Tue, May 31, 2016, at 09:11 AM, Matjaž Horvat wrote:

> Thanks @EnTeQuAk. Pontoon should wrap PO file lines at 200 characters from now on: https://github.com/mozilla/pontoon/commit/c6ce50e8638ddb70d2b9f6b0264ba2f3faa34acd. Let's wait for the next extract and the following Pontoon commits to see if that actually helps reducing the diff.

eviljeff commented 8 years ago

I'm clearing this milestone - please add one back when there is a commit that is likely to land.

EnTeQuAk commented 7 years ago

A few more things that came to mind:

EnTeQuAk commented 7 years ago

Another random note on this: https://bugzilla.mozilla.org/show_bug.cgi?id=1221552 (improved quality checks by Pontoon) is something to maybe follow and adapt once implemented or implement ourselves. Most of it (checking via dennis etc) is already implemented but urgently needs a refactor.

EnTeQuAk commented 7 years ago

Based on all the work done in https://github.com/mozilla/addons-server/pull/6243 I'm going to close this and open a separate issue about adding explicit automation. All that's basically needed is to configure Travis correctly, but let's run a few more extractions manually with the new script to see how things go and then turn on auto-mode.