compdemocracy / polis

:milky_way: Open Source AI for large scale open ended feedback
https://pol.is
GNU Affero General Public License v3.0

Continuous localization #320

Open patcon opened 4 years ago

patcon commented 4 years ago

Note: Previous experiments in continuous localization are deprecated. Links to the self-hosted translation app no longer work. A new tool will be needed if there is a desire to make the localization process simpler.


Re-ticketed from https://github.com/pol-is/polisServer/issues/80#issuecomment-502395758

Continuous localization is the process by which localized translations for all website text can be done in a lightweight external platform designed specifically to support a breadth of translators, even those without tech affinities.

The interface often allows people to see the progression of a language set toward completion, including steps like "incomplete: needs translation", "in process: needs review", or "complete: approved". Critically, it makes it really easy to see how a change to one string, or the addition of a new string, affects all the other language translations. People can check in and easily see if there's new work to do. Some systems include notifications to call translators back when there's new work to do.

These systems usually look for new strings in the source code. When translators work in the external i18n system, it auto-generates new files in standard translation formats and auto-commits the translations back into the source code. This can be done nightly, hourly, weekly, or whatever.
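To make the "what needs work" idea concrete, here is a minimal sketch in Python of how a tool could classify one locale's strings against the base locale. It is not how Weblate, Transifex, or Mojito actually implement this; the function name and the `translated_from` bookkeeping (the base text each translation was made from) are assumptions for illustration only.

```python
from typing import Dict, List


def strings_needing_work(
    base: Dict[str, str],
    translated: Dict[str, str],
    translated_from: Dict[str, str],
) -> Dict[str, List[str]]:
    """Classify one locale's keys against the base locale.

    base            -- key -> current base-locale (e.g. English) text
    translated      -- key -> translated text for this locale
    translated_from -- key -> base text the translation was made from
                       (hypothetical bookkeeping, assumed for this sketch)
    """
    # Keys in the base locale that have no translation yet.
    missing = [k for k in base if k not in translated]
    # Keys whose base text changed after the translation was written.
    stale = [
        k for k in base
        if k in translated and translated_from.get(k) != base[k]
    ]
    # Translations whose key no longer exists in the base locale.
    orphaned = [k for k in translated if k not in base]
    return {
        "missing": sorted(missing),
        "stale": sorted(stale),
        "orphaned": sorted(orphaned),
    }
```

A translator-facing UI could then show "missing" as needing translation and "stale" as needing review, while "orphaned" strings are the ones that would be dropped the next time the locale file is regenerated.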

Here's a screenshot of how one of the interfaces looks:

[Image: Screen Shot 2020-06-14 at 4 02 29 PM]

The tools I know about:

Tool       Founding Year  Open Source  Free hosted tier
Weblate    2012           yes          yes
Transifex  2014           no           yes
Mojito     2016           yes          no*

* can self-host on Heroku

tags: internationalization, translation, i18n

patcon commented 4 years ago

Here's an example of Weblate's interface, to explore the gist of how it works: https://l10n.elementary.io/projects/installer/installer/

patcon commented 4 years ago

Right now, I'm mainly exploring how Mojito works. There's an active support community on Gitter who are helping me get it working on Heroku: https://github.com/patcon/polis-translations

Once it's set up, we just need to give the DATABASE_URL value (from Heroku's MySQL db) to a GitHub Action, and then it should be able to run a script to pull the translations and re-commit them back into the git repo (or a branch of it).

patcon commented 4 years ago

The data might not persist there as I'm exploring how to sync data, but I succeeded in getting some strings parsed from our client-participation/js/strings/*.js type of files (with some minor reformatting to take advantage of their JS file-parser), and here's a screenshot of the Mojito interface at https://polis-translations.herokuapp.com/

[Image: Screen Shot 2020-06-16 at 2 06 12 PM]

With GitHub auth, anyone can sign in if they have a GitHub account. Other OAuth should be possible with some experimentation (e.g. using Twitter OAuth or maybe Google). There is only full translator access at this point -- no role-based restrictions within the Mojito interface.

patcon commented 4 years ago

Working on making the strings more processable by tools like Mojito. Doing that work here: https://github.com/pol-is/polisServer/compare/dev...patcon:320-continuous-localization

As part of that, I'm trying to figure out which strings are no longer referenced in the codebase. I created a quick script to make it more clear: https://gist.github.com/patcon/a1763d0a4ad901eed9c59c9554e9347c

The script basically cycles through each string key, and greps client-participation/ for s.{the key}. Keys with no result lines after them (result lines are the ones starting with ../..) are unused.

So this string howImportantLow is unused, and this string modSpam is used.

Some are less obviously unused, but are commented out, like this string commentErrorDuplicate.
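For anyone who'd rather not read the gist, here is a rough Python sketch of the same idea. It is not the gist's actual code: the function names are made up for illustration, and it assumes strings are always referenced as s.{the key} in .js files outside the strings directory.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List


def find_unused_keys(keys: Iterable[str], sources: Dict[str, str]) -> List[str]:
    """Return the keys that are never referenced as `s.<key>` in any source text.

    sources maps a file path to that file's contents.
    """
    unused = []
    for key in keys:
        # \b keeps `this.<key>` and longer keys sharing a prefix from matching.
        pattern = re.compile(r"\bs\." + re.escape(key) + r"\b")
        if not any(pattern.search(text) for text in sources.values()):
            unused.append(key)
    return unused


def load_sources(root: str, exclude_dir: str = "strings") -> Dict[str, str]:
    """Read every .js file under root, skipping the locale files themselves.

    The "strings" directory name is an assumption based on the
    client-participation/js/strings/ layout mentioned in this thread.
    """
    sources = {}
    for path in Path(root).rglob("*.js"):
        if exclude_dir in path.parts:
            continue
        sources[str(path)] = path.read_text(encoding="utf-8", errors="ignore")
    return sources
```

Like the grep approach, this still counts commented-out references as "used", so keys like commentErrorDuplicate need a manual look.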

These are all the strings I'm thinking can be removed:

@colinmegill Does it look alright? Any others you think can go easily? Happy to leave it at removing the obvious ones :)

(I'm not going to worry about the non-english files, since the translation system should deal with that later -- If the strings aren't in the english base locale file, they won't be regenerated in the non-english locale files)

patcon commented 4 years ago

OK, so here is the flow I'll likely propose for translations:

Note:

There will be two GitHub Actions workflows in use to coordinate with the translation management system (TMS) hosted on Heroku.

  1. .github/workflows/push-translations.yml: on every push to default branch:
    • a GitHub Action will run mojito push against the translation server; it looks at the base locale file en.js, and creates/deletes/updates the keys and base strings in the translation UI.
    • Any translator can log in with a GitHub account (anyone with a GitHub account can make changes to non-base locale strings)
    • Translators will be able to check the translation UI for new or changed strings that need work in their language, and submit changes
  2. .github/workflows/pull-translations.yml: on schedule every night (hour? 4h?) a github action will run
    • it will checkout the codebase, then execute mojito pull to pull any contributions from the translation server, regenerating all the non-base locale files (everything in js/strings/ except base en.js)
    • using the create-pull-request GitHub Action, any changes in the translations will be used to generate a new PR into a specific and consistent branchname we reserve for translations, e.g. auto/translations.
      • if there are no changes, there'll be no PR created
      • If we haven't yet merged a previous PR for translations, the github action will simply append commits.
      • optional: since translators log into the translation server with their github accounts, we can likely attribute them in commit messages or commit authors, e.g. Added translations for fr-FR and da-DK, courtesy of users: @someuser @someotheruser! Thanks!.
        • They'll get nice notifications, thanking them.
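
The optional attribution step could be as small as the Python sketch below. The function name is hypothetical, and how the list of contributing usernames would be obtained from the translation server is left open here; the sketch only formats the message in the shape given above.

```python
from typing import Iterable


def translation_commit_message(locales: Iterable[str], users: Iterable[str]) -> str:
    """Build a commit message crediting translators by GitHub username."""
    # De-duplicate while keeping the order in which items were given.
    locale_list = list(dict.fromkeys(locales))
    user_list = list(dict.fromkeys(users))
    if not locale_list:
        raise ValueError("at least one locale is required")

    if len(locale_list) == 1:
        joined = locale_list[0]
    else:
        joined = ", ".join(locale_list[:-1]) + " and " + locale_list[-1]

    message = f"Added translations for {joined}"
    if user_list:
        # @mentions are what trigger the GitHub notifications.
        mentions = " ".join(f"@{user}" for user in user_list)
        message += f", courtesy of users: {mentions}! Thanks!"
    return message
```

Calling it with ["fr-FR", "da-DK"] and ["someuser", "someotheruser"] gives the example message from the list above.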

In summary:

Eager for any and all feedback on the premise or any portion of this! cc @colinmegill @ballPointPenguin @urakagi @huulbaek @drewhart @virgile-dev @ricardopoppi

patcon commented 4 years ago

Oh, and from this, I hope we can start localizing strings for client-admin and client-report as well. A system like this will make it easier to manage updates with just a few point people supporting on translations ❤️ 🎉 🎉 🎉 🌮

patcon commented 4 years ago

The .github/workflows/push-translations.yml workflow seems to be working well.

I've added some demo translations for that new string in French and German. If things go well, then the .github/workflows/pull-translations.yml workflow should run on the hour. If it's successful, then a new PR should be created within patcon/polisServer repo, requesting the added translations be merged into 320-continuous-localization branch (for the purposes of this demo) 🤞

patcon commented 4 years ago

Yay! It's working! https://github.com/patcon/polisServer/pull/16 🎉

Candidate To Dos

EDIT: Opened PR in https://github.com/pol-is/polisServer/pull/345

patcon commented 4 years ago

Related Gitter convo (compensation, minnesota, indigenous localization): https://gitter.im/pol-is/polisDeployment?at=5eec0d49c223cc536a2141fa