Closed MortenHofft closed 3 years ago
We are able to do so by forking this repository and making pull requests though, no?
We could (not this repo, though, but this monorepo: https://github.com/gbif/gbif-web/tree/master/packages/react-components), but it probably isn't the best way to get people involved. We have used Crowdin for GBIF.org and other projects, and my impression is that it works well. It has a lot of nice features you wouldn't get with pull requests, such as an easy way to discuss and share context, machine suggestions, and previews for messageformat strings with variables in them. Perhaps most importantly, there is no broken syntax and no scary process for those who are less tech-savvy.
Thinking about how best to reuse enumeration value translations (Preserved Specimen etc.), which are already translated into several languages for GBIF.org and need to be consistent between the two.
The rest of the HP UI (Search, Basis of Record) could be a second Crowdin project. Crowdin should make suggestions where the text is already translated for GBIF.org, but it will require the translator to review and accept those suggestions.
(We could add the hosted portal translation keys to the GBIF.org translation project in Crowdin (it seems to be possible to connect two Git repositories), but I think this would be confusing to the existing GBIF.org translators.)
(For some of these enumerations, the best place to put translations would be the XML definitions of the vocabulary/thesaurus. That would make them reusable in the IPT and by anyone else.)
VertNet is looking for guidance about how best to help provide the original content in English that can go to Crowdin.
The flow we have for GBIF.org - I imagine we do something similar
I've updated this issue to be specific to the translation flow, as we already have another issue for removing/replacing placeholder texts. The two tasks are independent, and fixing placeholder texts should really be done before any translation starts.
I created a pull request with English text for as many terms as I could identify.
Some thoughts on how this could/should be implemented. @thomasstjerne and @MattBlissett do you have any thoughts on how this should be done?
I can see these options:
Both approaches could work, but the second allows us to load them as needed. This could matter for performance if the site has many translations (say one translation is 50 kB and we then load 10 of them, which adds up to 500 kB).
Common to both is that it loads a single file. Our translation library expects a single json with all strings so loading them as one seems the simplest and should perform better as well.
I too think it makes sense to have this project alongside the "gbif.org -ui" Crowdin project (and its English counterpart). The intention is that the two will align over time, and all the enumerations are already in that project.
I find it easier to work with multiple files than with one huge file. Secondly, the existing enum translations from the "gbif.org" project are individual files, so we need to stitch them together. I imagine a build step doing that.
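A minimal sketch of what such a build step could look like. It assumes one directory per locale holding several flat key-to-string JSON files; the directory layout and the function names `merge_translations` and `stitch_locale` are hypothetical and not taken from the gbif-web build.

```python
import json
from pathlib import Path


def merge_translations(parts):
    """Merge several flat key -> string dicts into one, rejecting duplicate keys.

    `parts` maps a source name (e.g. a file name) to its strings.
    """
    merged = {}
    for name, strings in parts.items():
        for key, value in strings.items():
            if key in merged:
                raise ValueError(f"duplicate key {key!r} in {name}")
            merged[key] = value
    return merged


def stitch_locale(source_dir, locale, out_file):
    """Combine every <source_dir>/<locale>/*.json file into a single JSON file."""
    locale_dir = Path(source_dir) / locale
    parts = {
        path.name: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(locale_dir.glob("*.json"))
    }
    merged = merge_translations(parts)
    Path(out_file).write_text(
        json.dumps(merged, ensure_ascii=False, indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return merged
```

Failing on duplicate keys is a design choice for the sketch: it surfaces a clash between a UI file and an enum file at build time rather than letting one silently overwrite the other.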
For gbif.org we have everything as part of the translation file. E.g. all languages and all GrSciColl collection disciplines (just to mention 2 enums that are rarely used).
Instead of including everything, we could also select those that we use most frequently and load the rest from a new endpoint. E.g. /translation/languages/abk would return Abkhazian, similar to how we load dataset titles and scientific names.
Or we could explore the option of loading all the values for an enum, but not doing so until the enum is used. So the GrSciColl collection disciplines would not be loaded until you visit a GrSciColl institution page.
I'm honestly not sure how to do so technically or whether it is a huge task, but I like the idea of keeping the core translation file smaller by focusing on the UI elements and then loading enums for data presentation asynchronously. I imagine that this is a fairly simple thing to do.
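To make the idea concrete, here is a rough, language-agnostic sketch of deferring enum translations until first use. The class `LazyEnumTranslations` and the injected `fetch_enum` callable are hypothetical; in the real UI the fetch would be an asynchronous HTTP request, which this sketch replaces with a plain function call.

```python
class LazyEnumTranslations:
    """Keep UI strings in memory and load each enum's values on first use."""

    def __init__(self, core, fetch_enum):
        self._core = dict(core)        # small core file with UI strings
        self._fetch_enum = fetch_enum  # hypothetical loader: enum name -> {value: label}
        self._enums = {}               # cache of enums loaded so far

    def translate_ui(self, key):
        # Fall back to the key itself when a UI string is missing.
        return self._core.get(key, key)

    def translate_enum(self, enum_name, value):
        # Fetch the whole enum only the first time one of its values is needed.
        if enum_name not in self._enums:
            self._enums[enum_name] = self._fetch_enum(enum_name)
        # Fall back to the raw value when no translation exists.
        return self._enums[enum_name].get(value, value)
```

With this shape, a page that never shows GrSciColl disciplines never triggers the fetch for that enum, and a page that does show them fetches the enum once.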
It is easy to worry too much about bandwidth and about loading a bloated translation file.
It might be worth remembering that the "giant" translation file for gbif.org is 150 kB unzipped and 50 kB zipped. The header image alone on gbif.org is 200 kB, and we load 1.4 MB of images for our home page alone. Individual map tiles are as big as 152 kB. If we can avoid blocking the first render, then perhaps we shouldn't worry about a 10 kB vs 50 kB translation file.
This has progressed and is now in the staging environment. The current state is:
fr: https://some.url/translations/fr.json?v=1230987243. That means that the small mapping file will be loaded on all requests, but the large translation file can be cached. And we can do so without breaking the cache for the library. The downside is that all users must fetch the translationMap when the page is loaded. The benefit is that the library and the translations can be cached individually (meaning the French users do not need to fetch everything again just because a German translation is updated).
_data/languages.yml.
We still need to figure out what the best flow is for translators and how we can support/guide the effort. I will update the issue with more information when that is in place.
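For illustration, a small sketch of how a per-language version parameter keeps caches independent. It assumes the `v` value is derived from a hash of each translation file's content, which is only one possible scheme and not necessarily what the deployed translationMap uses; `content_version` and `build_translation_map` are hypothetical names.

```python
import hashlib
import json


def content_version(strings):
    """Derive a short, stable version token from a translation's content."""
    payload = json.dumps(strings, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:10]


def build_translation_map(base_url, translations):
    """Map each locale to a URL whose ?v= value changes only with that locale's content."""
    return {
        locale: f"{base_url}/{locale}.json?v={content_version(strings)}"
        for locale, strings in sorted(translations.items())
    }
```

Under this assumption, updating the German strings changes only the `de` entry in the map, so a browser that has already cached the French file keeps using it.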
@daiesco you've asked about this recently. You will see that your site now appears partly translated.
Thank you @MortenHofft, we will be moving forward with the translations in the Crowdin project.
I recently translated most of the relevant terms into Dutch, but I have not yet started translating the country names. Surely there must be a faster and less typo-prone way to do that? E.g. use ISO codes to import the translations? For example: https://nl.wikipedia.org/wiki/ISO_3166-1
@langeveldNMR yes. You can upload translations to Crowdin. If you find a source and format it, then there is a button to do so. https://crowdin.com/project/gbif-portal/nl You can see what format they should be uploaded in if you download the file first.
It is difficult for me to evaluate the quality of the data sources in various languages, so I leave that to the translator.
Please also note that Wikipedia uses short/informal names like "Bolivia" and "Noord-Korea" rather than "Bolivia, Plurinationale Staat" (or however that would be in Dutch) and "Korea, Democratische Volksrepubliek".
Many country names are in the Crowdin global dictionary, so you could also click the "Save" icon next to the suggestion and go through fairly quickly, one click for each country. (Still not perfect though; e.g. there are three different suggestions for Kyrgyzstan.)
Thanks for the information. I used the formal Dutch names kept at https://namen.taalunie.org/landen and matched those with the country codes provided in the json file. Easy and rather quick to do like this.
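For anyone wanting to repeat this for another language, the matching step can be sketched roughly as below. It assumes the downloaded file is a flat JSON object keyed by ISO 3166-1 alpha-2 country code and that the name list has been turned into a dict keyed by upper-case code; both are assumptions, so check the actual downloaded file first. The function name `fill_country_names` is hypothetical.

```python
def fill_country_names(source_strings, names_by_code):
    """Match country codes from the source file against a code -> name table.

    Returns the translated entries and a sorted list of codes with no match,
    so that the gaps can be translated by hand.
    """
    translated = {}
    missing = []
    for code in source_strings:
        name = names_by_code.get(code.upper())
        if name:
            translated[code] = name
        else:
            missing.append(code)
    return translated, sorted(missing)
```

Returning the unmatched codes separately makes it easy to see which countries still need a manual translation before uploading.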
We have decided on a process for translations and implemented it. I will close this issue. There will no doubt be issues related to translations in the future. Feel free to open a new issue.
I recently noted that the Dutch translations I prepared are already available through https://hp-nhm-rotterdam.gbif-staging.org/nl/data.html but are not yet deployed on the portal itself https://specimens.hetnatuurhistorisch.nl/nl/data.html (I did do a new release recently, but nothing changed). Should they first be approved on crowdin or is some other action necessary?
I have to merge it into the web project and deploy it to master. So it isn't you but me who should monitor translations more carefully. I will redeploy it now, so it should be available in 5 minutes or so.
Just to make sure I am not missing anything: I did a new release recently, but nothing changed. On crowdin https://crowdin.com/project/gbif-portal/nl the translations are still blue (not green). Please let me know if there is any other action required from our side.
I cannot see any recent changes in the Dutch translations, but I might have missed them. Could you give me an example translation that I can check? And could you specify what "nothing changed" means: is it in staging but not in prod, or does it not show anywhere?
I have not added any new translations recently, and they are still showing in staging but not in prod.
Thanks @langeveldNMR - could you please provide an example. That will make it easier for me to find the cause.
btw there are 2 types of translations:
It concerns the first type you list. Attached are four screenshots.
In the Dutch (NL) staging environment the data widgets are translated. However, they are not in the Dutch production environment, where they are identical to the English production environment.
ahhhh - there are no translations at all. I understood your message as "a few of the latest translations are missing". I see that I hadn't added NL as a supported language in prod. That is done now. Your translations are visible in your prod environment. Sorry about that - I misunderstood the problem.
There is currently no process set up to edit labels (English or other languages).
~Secondly, most of the texts in the file are placeholders like "Fill in some content here" or, even worse, are missing and hence just show as filter.hostKey.description~ Removing placeholder values has an issue of its own: https://github.com/gbif/hosted-portals/issues/134