Trustroots / trustroots

Travellers' community for sharing, hosting and getting people together.
https://www.trustroots.org
GNU Affero General Public License v3.0

Support multiple languages (i18n) #492

Closed nicksellen closed 3 years ago

nicksellen commented 7 years ago

Might get a lot more support if people can use the site in their native language.

angular-translate is a great, mature library for angularjs.

For managing the translations I just started trialling crowdin for a blog. Seems great! And they are into open source [1].

Another option is Transifex, which we use for foodsaving.world - see our dashboard. But they have gone a bit cold on open source projects: there used to be a whole page on their website [2], but now there is just a short note on the pricing page, with a few more criteria than just being open source:

We offer Transifex for free to Open Source projects that have no funding, revenue, and/or commercialization model.

[1] https://crowdin.com/page/open-source-project-setup-request
[2] https://www.transifex.com/customers/open-source/

nicksellen commented 7 years ago

Crowdin has direct github integration (creates PRs for you), for transifex you use a cli tool to push/pull translation files (or run an integration thing yourself on a server - much more cumbersome).

simison commented 7 years ago

Thanks, the basics of i18n setup are pretty straightforward indeed, but it causes quite a lot of extra work down the road, and extra work at this point is what we really need to avoid. :-)

Frankly it doesn't really bring important numbers (reply rates + member count) up so this won't be a priority at this point, but definitely something in the future! First we need to get usage stats onto a healthy basis.

simison commented 7 years ago

(Just writing these down for later)

Usable stats to study for decision making regarding i18n:

kalpaitch commented 7 years ago

Adding another :)

nicksellen commented 7 years ago

a lot of extra work down the road

For who? Translators themselves probably exist in the community, it's a great thing that non-developers can contribute.

Frankly it doesn't really bring important numbers (reply rates + member count) up

You suspect it won't, but of course you need to measure to know :p

The BeWelcome stats show most users are using it in countries where English is not the national language.

The stats can only show the current and past behaviour, not what would happen if translations were in place.

simison commented 7 years ago

a lot of extra work down the road For who? Translators themselves probably exist in the community, it's a great thing that non-developers can contribute.

For anyone managing the volunteering. Crowdsourced translations require replying to a lot of emails, instructing people, giving notice when new features/string changes are pushed out, as well as taking languages into account in the design process (ltr/rtl, different widths).

That was my experience with several big crowdsourced i18n projects. There are probably lots of things to do differently for the whole process to go smoother (or just finding someone to deal with all that).

My thinking goes more along the line that same amount of effort put to something else makes bigger impact, like references.

I like i18n and love to see projects like this translated to crazy languages, we'll do it eventually for sure. Also as far as feature requests we see in the feedback, i18n comes up quite rarely. References and something to do at the site if you're not hosting are mentioned frequently in feedback.

The stats can only show the current and past behaviour, not what would happen if translations were in place.

I think the most feasible mvp to gather some data would be to translate just the landing page.

Thanks for brainstorming next features! It's great.

nicksellen commented 7 years ago

For anyone managing the volunteering

That's what crowdin does :) And we can ignore rtl languages for some time I think...

same amount of effort put to something else makes bigger impact, like references

That assumes developers are generic resources to be deployed anywhere on the project. It might be quite feasible to get a new contributor who joins specifically to add translation support as it's understandable and a self-contained kind of task (experience: doing a talk about https://foodsharing.de at a meetup and having someone come up afterwards offering to translate it).

Could be a notable PR push too ("trustroots launches in Germany|France", etc...).

I'm not proposing doing any of those things though, so it can sit here until someone is :)

simison commented 7 years ago

That assumes developers are generic resources to be deployed anywhere on the project. It might be quite feasible to get a new contributor who joins specifically to add translation support as it's understandable

That's true, tho we should try to get everyone to work towards one vision and not implement stuff just because someone absolutely wants to implement it. ;-) That path leads to bloat.

guaka commented 5 years ago

I removed [Priority] Low - I feel accommodating people who don't speak English is very important for growth (plus helping with translation is a great way to participate for people who want to help grow Trustroots)

mrkvon commented 5 years ago

Recent discussions on slack and forum suggest that this will be implemented during the migration to react.

Also on forum.

So far I've seen the libraries react-i18next and react-intl. Does anybody have experience with either of these?

The latter seems to be more popular (github stars and npm weekly downloads), but the former one is close behind.

mrkvon commented 5 years ago

Seems to me that this is still far from being resolved. I'm reopening until we write followup issues.

nicksellen commented 5 years ago

Maybe the points on https://github.com/yunity/karrot-frontend/issues/1118 are useful to consider.

comradekingu commented 5 years ago

@nicksellen I find Crowdin to be a subpar tool, allowing no quick interface to see the relation between prior strings and their translation. Also taking away from its appeal is that it is closed source. That speaks volumes in terms of preference, or hypocrisy. I would suggest using Hosted Weblate, which is both libre software, and gratis for other libre software. Transifex is not even worth the time of day to complain about it, speaking as someone who has used it for hours, every day since 2013, and not since they last updated their EULA/terms of use.

nicksellen commented 5 years ago

@comradekingu Weblate is a great project :) We are quite interested to use it for karrot actually, so we can get a bit more control over our translation process, and like you we much prefer using open source options. We would likely self host it as this is not so tricky for us to setup and we have the server capacity.

I would caution against your open source purity argument a bit though, it's nice to try and encourage and support people to use open source alternatives, but there are many reasons why people choose to use non open source software and they are free to do so! (and often have good reasons to do so).

If we focus on the negatives of proprietary software then maybe we receive a more defensive reaction about the negatives of open source software too (often open source software is subpar, and harder to use/configure for the normal user - but we accept this for the increased freedom, but cannot force others, only encourage and support).

It's rarely as simple as proprietary=bad, open source=good. I'm sure you wouldn't go as far as to say Weblate solves everyone's problems with translation software ;)

So I would really appreciate less criticism like "That speaks volumes in terms of preference, or hypocrisy" and more "How can I share my enthusiasm for open source translation and encourage you to try it out!".

comradekingu commented 5 years ago

My enthusiasm one way or the other I would consider beside the point. Let's set aside how I frame/d the argument, and get to the meat of things. Fortunately Weblate has its merits in isolated terms both technically and ethically. The combination of the two I would say is greater than the sum of its parts. As far as a problem one could feasibly solve on Crowdin, but not on Weblate, I don't see what that would be. Crowdin does not share my preference for software freedom, and I am not saying it should, just making a point for those that do want to share in the line of thinking.

Crowdin, thy name is hypocrisy, and this is an excellent point to make. On one hand they speak of "open source", citing the OSI for their approval of licenses, that being the "Open Source Initiative". Curiously they also say "Open Source", which is not a good look for a localization platform in terms of consistency, but more importantly, this is the specific trademark that the OSI holds. Open Source in the OSI sense means libre software, but then we read point 4. of the Crowdin (the closed source software) terms for Open Source project inclusion:

You do not have Commercial products related to the Open Source project you are requesting a license for.

So what they are saying is that it has to be non-commercial, which is only a subset of libre software. That is not compatible with the term libre software / Free Software, or Open Source, and/or it is not a way to "be into open source".

So Crowdin lines up (cold) shoulder to (cold) shoulder with Transifex in this respect.

noahsmindfuck commented 5 years ago

possible volunteer for French translation: https://trustroots.zendesk.com/inbox/conversations/2792 The person is already a French translator at bewelcome.org

noahsmindfuck commented 5 years ago

possible Portuguese translator: https://trustroots.zendesk.com/inbox/conversations/2832

mrkvon commented 4 years ago

@nicksellen wrote elsewhere:

If we extract to json then convert the json to gettext, we presumably cannot make use of any of the extra features that gettext provides (looking at https://docs.weblate.org/en/weblate-3.10.3/formats.html#translation-types-capabilities for the extra features).

Perhaps the backend can write .po files directly, and potentially include extra context? Comments and source location seem potentially useful, at least this is something that we ran into with karrot, where the strings sometimes lack context, and the translation ends up not quite right.

This is an issue which we can run into, so the comment is linked here not to get lost.
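A minimal sketch of what writing .po entries directly, with extra context, could look like. It only illustrates the gettext entry layout (`#.` for extracted comments, `#:` for source location, `msgctxt` for context); the helper names, the field names and the file path are assumptions made for this example, not part of the Trustroots code.

```javascript
// Hypothetical helper: escape a string for use inside a .po quoted value.
function escapePo(text) {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
}

// Hypothetical helper: serialise one extracted string as a gettext .po entry.
// The input field names (comment, reference, context) are illustrative only.
function toPoEntry({ msgid, msgstr = "", comment, reference, context }) {
  const lines = [];
  if (comment) lines.push(`#. ${comment}`); // extracted comment for translators
  if (reference) lines.push(`#: ${reference}`); // source location
  if (context) lines.push(`msgctxt "${escapePo(context)}"`);
  lines.push(`msgid "${escapePo(msgid)}"`);
  lines.push(`msgstr "${escapePo(msgstr)}"`);
  return lines.join("\n");
}

const entry = toPoEntry({
  msgid: "Login",
  comment: "Button label (verb), not the noun",
  reference: "modules/users/client/Login.js:12", // made-up path for illustration
  context: "button",
});
console.log(entry);
```

The comment and source location are the pieces that would be lost if strings were first extracted to plain JSON and only then converted to gettext.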

simison commented 4 years ago

Convo about translation process and tools: https://trustroots.slack.com/archives/C08SENA9Z/p1583915629082800

Summary here:

the current direction is for the text to end up as .po files in the git repo (there are a few related open PRs about i18n stuff --> https://github.com/Trustroots/trustroots/pulls?q=is%3Apr+is%3Aopen+label%3Ai18n)

Tools we're looking into:

Thoughts on technical aspects

Mikael

Generating pot files on commit is an OK first step (like @mrkvon implemented in https://github.com/Trustroots/trustroots/pull/1289 and https://github.com/Trustroots/trustroots/pull/1291; it's simple enough for starters). In the longer term I'd be inclined to separate it from PRs so that there is less to commit.

Thoughts about the process, a couple experiences:

Martin:

In my experience there would normally be three roles involved: the project manager, the translator and the reviewer. I’ll give a short description below, hopefully not making it too much of a novel.

The PM makes sure that the files to translate get onto the localisation portal, sets priorities (if any), handles file updates and any/all administrative issues that might pop up. Often they also take care of context/clarification issues (for example, the word “login” can both be a verb and a noun in English, meaning that context is needed if it is in a lone segment). The PM does not need to speak the language being translated, so one person can take this role for all languages.

The translator has a fairly straightforward job. After opening the file there are a number of segments (strings) that need to be translated, divided into columns, source (original language) to the left, and the target (translated language, to be filled in) to the right.

There are two main tools used, normally neatly incorporated in the user interface. First there is the termbase (TB), which works like a live glossary. If the source contains a direct or partial hit for a TB entry, the translation for the same shows up so it can be used for consistency. Secondly there is the translation memory (TM), which contains all previously translated strings. This can both be used for reference searching and for automating the translation. This last part means that if there are two segments that are exact matches, the TM will auto-populate the second segment, and thus the translator only needs to translate it once. If there is a partial match (some words changed, such as if an old file has been updated a little), the TM will suggest the closest match while the formatting shows which words differ.

As the translator works, a bar on the project page shows the progress, and when it gets to 100 % the proofreader can take over. If the translation is well made, the proofreader/reviewer only needs to read through the text to make sure that it flows well, that no misunderstandings have occurred, and that there are no small typos. Generally speaking they would also be “responsible” for maintaining a conversation with the translator, for example to discuss what terms to use, what to avoid and so on and so forth. This conversation tends to go both ways at the end of the day, and there is nothing that says that you cannot have two people shifting between translator/proofreader, it is just better to have a second pair of eyes to go through a text as it will almost always improve it.

In the case of Trustroots I guess that the TB will not be all too relevant, so there will be no need to keep it “clean”. As for the integration between the localisation tools and the project, that is all left to the PM (or somebody they forward the files to). As a translator/proofreader you basically log in, do a very straightforward task (that is, no multitasking, little organisation, all focus on the one job) and then call it a day.

So, a bit to read, but hopefully that would give an idea of the workflow after the technical part is sorted.
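The translation memory behaviour Martin describes (exact matches auto-populate, partial matches are suggested) can be sketched roughly as follows. This is only an illustration: the word-overlap score, the 0.5 threshold and all function names are assumptions, not how any particular localisation tool works.

```javascript
// Split a segment into a set of lowercased, whitespace-separated words.
function words(text) {
  return new Set(text.toLowerCase().split(/\s+/).filter(Boolean));
}

// Word-overlap (Jaccard) similarity between two segments, from 0 to 1.
function similarity(a, b) {
  const wa = words(a);
  const wb = words(b);
  const shared = [...wa].filter((w) => wb.has(w)).length;
  const union = new Set([...wa, ...wb]).size;
  return union === 0 ? 0 : shared / union;
}

// memory: Map of source segment -> previously accepted translation.
function lookup(memory, source, threshold = 0.5) {
  if (memory.has(source)) {
    // Exact match: the target segment can be auto-populated.
    return { type: "exact", translation: memory.get(source) };
  }
  let best = null;
  for (const [known, translation] of memory) {
    const score = similarity(source, known);
    if (score >= threshold && (best === null || score > best.score)) {
      // Partial match: offered as a suggestion for the translator to adapt.
      best = { type: "fuzzy", source: known, translation, score };
    }
  }
  return best; // null when nothing in the memory is close enough
}

const memory = new Map([["Wait a moment...", "Warte einen Moment ..."]]);
console.log(lookup(memory, "Wait a moment..."));
console.log(lookup(memory, "Please wait a moment..."));
console.log(lookup(memory, "Hosting offers"));
```

Real tools typically use more refined matching (for example edit distance) and highlight the differing words; the sketch only shows the exact/partial split.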

Mikael:

In volunteer projects I've often kept it more simple, but it has quality implications:

  • no PM, strings just get updated in a tool that several people have access to. Context is provided in code as comments, so nobody checks if it's there.
  • translators (many) just translate, then someone native in that language "accepts" or proofreads.
  • in small languages just one person does it all
  • code wise, each new change or deploy wipes out old translations and pushes English strings to production right away. Keeps it simple but creates an experience where English strings appear in the middle of a translated page, which is a bad experience for users.

The process Martin outlined is more ideal for sure. With the addition of gated feature rollouts so that we have time to get things translated, or some other strategy to ensure no untranslated sections go live.
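One possible way to make sure no untranslated sections go live is a completeness check before a language (or feature) is enabled. A rough sketch, assuming flat JSON translation files where a missing or empty value means "not translated yet"; the function names and the "Sign up" string are made up for the example.

```javascript
// List the source keys that have no usable translation yet.
function untranslatedKeys(source, translation) {
  return Object.keys(source).filter((key) => {
    const value = translation[key];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Gate: only treat a language as ready when nothing is missing.
function readyToShip(source, translation) {
  return untranslatedKeys(source, translation).length === 0;
}

const en = {
  "Wait a moment...": "Wait a moment...",
  "Sign up": "Sign up", // made-up string for illustration
};
const de = {
  "Wait a moment...": "Warte einen Moment ...",
  "Sign up": "",
};

console.log(untranslatedKeys(en, de)); // keys still waiting for a translator
console.log(readyToShip(en, de));
```

A check like this could run in CI or at deploy time; where exactly it belongs is left open here.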

Martin:

Of course, the roles can be joined/divided in any which way really, but knowing them separately could help in creating a good workflow. For localisation I think that it will be easier to find volunteers if the instructions are simply Follow this link > Open the files > Translate/review, meaning that the underlying structure of before/after would probably be best off if organised to allow this. Such a solution also makes sure that people who work with implementation (taking the “PM” role) do not need to know the languages whatsoever. I’m thinking I am more or less saying the same thing as Mikael, but if considered properly I think that getting an “ideal” structure for localisation in a smaller project such as this would not be all too hard. (That said by a person who knows relatively little of programming and what happens with the texts once localised, so take it with a grain of salt..)

Akronix commented 4 years ago

Thank you for the update @simison.

Should we set a deadline after which we apply to the other options (transifex / crowdin)? @nicksellen what about the other self-hosting option? Do you think you will look into it before the hackweek?

comradekingu commented 4 years ago

Transifex: Roles are fundamentally flawed as the only option for quality, because no translation manager has the capacity to hand-pick people from every language to ensure everyone meets certain expectations. Even then, that isn't ideal. In terms of overview, preventing malice is next to impossible on Transifex, and nobody has the kind of (extra) time to even try. Is not a friend of libre software. Spies on the user.

Crowdin: The voting system is similarly fundamentally a broken idea, meaning good changes just sit around waiting for a third person to acknowledge that the first person had no clue. Can't actually see source strings next to translations either, so getting a grip on what is going on is next to impossible. Many restrictions and requirements to get hosting without paying. Spies on the user.

Weblate: The best overview, however limited it is. The best checks, and many editors and views. Can do roles and adjusted voting, however it is still fundamentally broken, because all you need is to make n+1 accounts to be malicious. Does not have all the bells and whistles of Pootle, but is more actively maintained, and better for novice users.

Happy to see https://hosted.weblate.org/projects/trustroots/ up and running. Just need something to translate :)

Akronix commented 4 years ago

Happy to see https://hosted.weblate.org/projects/trustroots/ up and running. Just need something to translate :)

@comradekingu does that mean that we have a hosted instance on weblate? They didn't reply to my request or any of my emails...

simison commented 4 years ago

Was testing Poeditor.com a little bit — it's also free and unlimited for open source projects. Seems to have very good formats support and what seems like pretty good automation with pulling/pushing strings from/to git repo.

Good to keep in mind in case we prefer one over the other.

comradekingu commented 4 years ago

@Akronix Where did you send the e-mails? Let's try to ping @orangesunny.

Poeditor.com is only a webservice, subject to the terms poeditor.com wants to have at any given time. One of which is having Google analytics running right now. Closed source as-a-service-only with bulk data collection hasn't panned out before, I don't see why it would in the future.

simison commented 4 years ago

@comradekingu thanks! I find your input here valuable. Poeditor is just another tool to get the job done. We use plenty of similar ones to get tasks done and lower the burden on us. ;-) I feel like the important part of the equation is always whether they force vendor lock-in on you or whether you could switch to something else practically overnight. Did you mean to suggest that vendor lock-in is the case with them? I generally like when companies are very supportive of open source even if they aren't full-on open source themselves. ;-)

simison commented 4 years ago

Thanks for the summaries @comradekingu ! 🙌

Weblate: The best overview, however limited it is. The best checks, and many editors and views. Can do roles and adjusted voting, however it is still fundamentally broken, because all you need is to make n+1 accounts to be malicious. Does not have all the bells and whistles of Pootle, but is more actively maintained, and better for novice users.

Curious about the "fundamentally broken" part. Can you elaborate a little more? How would someone be malicious? Which bells'n'whistles do you see as the biggest ones missing? What makes it better for novices?

Akronix commented 4 years ago

I first sent the request to weblate through the form indicated on the website. After getting no reply, I forwarded the request to this email: info@weblate.org @comradekingu I sent you an invitation to our slack chat. So we can follow the conversation there maybe?

orangesunny commented 4 years ago

Hi all, I got all your emails, my bad, I am sorry.

But! :) We migrated Hosted to the new datacenter last weekend and everything is faster now, me included. You will be set up in a few minutes.

Thanks for your amazing project and happy translating!

orangesunny commented 4 years ago

I think this can be closed as it’s up and running: https://hosted.weblate.org/projects/trustroots/

@Akronix, please check the settings and alerts in the project, so you are 100% happy with the workflow. If anything, drop us a mail again ;)

comradekingu commented 4 years ago

There are no strings in added languages. If you add https://hosted.weblate.org/user/kingu/ as a moderator I can have a look.

orangesunny commented 4 years ago

@comradekingu Thanks for the check!

This time, we need help from the devs @Akronix @mrkvon @simison @nicksellen as most of the files are empty like this one https://github.com/Trustroots/trustroots/blob/master/public/locales/en/translation.json

mrkvon commented 4 years ago

@orangesunny Thank you! For having us on hosted.weblate.org and for your support in getting us up and running.

most of the files are empty

Indeed. We have infrastructure in place for extracting translations, but we're still figuring details how everything should work together.

Translators, please be patient with us! :bowing_woman: We're getting there!

orangesunny commented 4 years ago

@orangesunny Thank you! For having us on hosted.weblate.org and for your support in getting us up and running.

Happy to do it!

Indeed. We have infrastructure in place for extracting translations, but we're still figuring details how everything should work together.

🤞 You may also find some help in our docs, like here: https://docs.weblate.org/en/latest/formats.html#translation-types-capabilities

Translators, please be patient with us! 🙇‍♀ We're getting there!

Keep calm, stay safe at home, and stretch your fingers. The translation is coming! 🤓

mrkvon commented 4 years ago

Currently, we have an issue with our i18next config in hosted.weblate. See #1373#discussion_r410402108.

We use natural translation keys, so . is not a special character, but the end of a sentence. For this reason, our ./config/client/i18n.js says:

    nsSeparator: false,
    keySeparator: false,

however weblate assumes that we use . as key separator, so it does weird things like

    "Wait a moment": {
        "": {
            "": {
                "": "Warte einen Moment ..."
            }
        }
    },

I haven't found the related config on the website or in weblate documentation.

Maybe @orangesunny or somebody else may know whether this setting is available in weblate?
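The nesting above is what one would expect if every . in a key is treated as a separator. A small sketch that reproduces the shape, assuming the original natural key was "Wait a moment..." (inferred from the output, not confirmed in the thread); `nestByDot` is a made-up function for illustration and is not Weblate's actual code.

```javascript
// Build a nested object from flat keys, treating "." as the key separator.
function nestByDot(flat) {
  const nested = {};
  for (const [key, value] of Object.entries(flat)) {
    const parts = key.split("."); // "Wait a moment..." -> ["Wait a moment", "", "", ""]
    let node = nested;
    for (const part of parts.slice(0, -1)) {
      if (typeof node[part] !== "object" || node[part] === null) {
        node[part] = {};
      }
      node = node[part];
    }
    node[parts[parts.length - 1]] = value;
  }
  return nested;
}

const result = nestByDot({ "Wait a moment...": "Warte einen Moment ..." });
console.log(JSON.stringify(result, null, 4));
```

Each of the three trailing dots produces one empty-string level, which matches the three nested "" keys in the file Weblate wrote.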

comradekingu commented 4 years ago

One might interject that the particular string in question should use an ellipsis (…) like what would be the duplicated one in public/locales/de/core.json

mrkvon commented 4 years ago

@comradekingu I agree, the string in question may need to use ellipsis. We still need a way to specify custom (or no) key and namespace separators. Even a single . (e.g. at the end of a sentence) breaks things without such configuration option.

nicksellen commented 4 years ago

Yes, would be great to know if weblate would (or already does) support that way of using keys.

Alternatively, we could switch away from natural translation keys... I wonder what the impact of switching at this point would be?

nijel commented 4 years ago

Unfortunately there is no setting for this, see https://github.com/translate/translate/issues/3976, https://github.com/translate/translate/issues/3857 and https://github.com/translate/translate/issues/3819.

nicksellen commented 4 years ago

I think we are not nesting though and https://github.com/translate/translate/issues/3976#issuecomment-614537041 says:

In case you are not using nesting at all, choose "JSON file" instead of "JSON nested structure file" in Weblate.

So maybe this works...
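Whether "not nesting at all" actually holds for the existing files can be checked mechanically. A minimal sketch, assuming each translation file parses to a plain object; `usesNesting` is a hypothetical helper name.

```javascript
// True if any top-level value is itself an object, i.e. the file is nested.
function usesNesting(translations) {
  return Object.values(translations).some(
    (value) => value !== null && typeof value === "object"
  );
}

const flatFile = { "Wait a moment...": "Warte einen Moment ..." };
const nestedFile = { "Wait a moment": { "": { "": { "": "Warte einen Moment ..." } } } };

console.log(usesNesting(flatFile));
console.log(usesNesting(nestedFile));
```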

nicksellen commented 4 years ago

We have "i18next JSON file" currently though, not "JSON file" - I wonder what the implication of that is...

simison commented 3 years ago

I'll close this up as we have a few languages already live on the site, and more cooking up.

Some follow-up tasks left in https://github.com/Trustroots/trustroots/issues/1485, notably server-side and RTL.