pkp / pkp-lib

The library used by PKP's applications OJS, OMP and OPS, open source software for scholarly publishing.
https://pkp.sfu.ca
GNU General Public License v3.0

Align UI locales with Weblate locales #9707

Open bozana opened 7 months ago

bozana commented 7 months ago

In the CRAFT OA project, as part of issue https://github.com/pkp/pkp-lib/issues/9425, the submission locales will be separated from the UI locales. The decision was made to use the Weblate locales (see https://github.com/WeblateOrg/language-data/blob/main/languages.csv) for the submission locales and to align the UI locales with them as well.

The CRAFT OA project has identified the following mapping between the current UI locales and the Weblate locales:

'be@cyrillic' => 'be',
'bs' => 'bs_Latn',
'fr_FR' => 'fr',
'nb' => 'nb_NO',
'sr@cyrillic' => 'sr_Cyrl',
'sr@latin' => 'sr_Latn',
'uz@cyrillic' => 'uz',
'uz@latin' => 'uz_Latn',
'zh_CN' => 'zh_Hans',
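As a rough illustration of how such a mapping could be applied, here is a minimal Python sketch (pkp-lib itself is PHP, so this is only a language-neutral illustration, not the project's implementation). The dictionary is copied from the mapping above; the helper name `to_weblate_locale` is hypothetical, and the fallback of returning unlisted locales unchanged is an assumption.

```python
# Mapping quoted from the comment above: current UI locale -> Weblate locale.
LEGACY_TO_WEBLATE = {
    "be@cyrillic": "be",
    "bs": "bs_Latn",
    "fr_FR": "fr",
    "nb": "nb_NO",
    "sr@cyrillic": "sr_Cyrl",
    "sr@latin": "sr_Latn",
    "uz@cyrillic": "uz",
    "uz@latin": "uz_Latn",
    "zh_CN": "zh_Hans",
}


def to_weblate_locale(locale: str) -> str:
    """Return the Weblate code for a current UI locale (hypothetical helper).

    Assumption: locales that are not listed in the mapping stay unchanged.
    """
    return LEGACY_TO_WEBLATE.get(locale, locale)


print(to_weblate_locale("sr@latin"))  # sr_Latn
print(to_weblate_locale("de"))        # de (not in the mapping, returned as-is)
```

A plain lookup with a pass-through default keeps the renaming step idempotent: running it again on already converted codes such as `sr_Latn` leaves them untouched.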

jonasraoni commented 7 months ago

@bozana I think the languages.csv might be the same list that you can extract from ResourceBundle::getLocales('').

I think it makes sense, I raised this concern when the locales were merged, because we would lose the "country" of the submission (@marcbria).

ajnyga commented 7 months ago

@jonasraoni I do not think it is the same, see this comparison: https://docs.google.com/spreadsheets/d/1EFs2cr7Tw2lwR_JVIQHcnqXg91tVdJna8NMSBdLVW_Q/edit?usp=sharing

For example, Belarusian is listed with script variants in Weblate, whereas the script variants are missing from ResourceBundle::getLocales(''). On the other hand, the Weblate list lacks entries such as es_ES (only es is mentioned), while the ResourceBundle::getLocales('') list includes them.
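The kind of comparison described above can be sketched as a pair of set differences. This is a minimal Python illustration, not the script behind the spreadsheet; the function name `compare_locale_lists` is hypothetical, and the two sample lists are tiny made-up excerpts that only mirror the examples mentioned in this comment (`be_Latn` in particular is an assumed script-variant code), not the real contents of either list.

```python
def compare_locale_lists(resource_bundle_locales, weblate_locales):
    """Split two locale lists into codes unique to each side and codes shared by both."""
    rb = set(resource_bundle_locales)
    wl = set(weblate_locales)
    return {
        "only_in_resource_bundle": sorted(rb - wl),
        "only_in_weblate": sorted(wl - rb),
        "in_both": sorted(rb & wl),
    }


# Illustrative samples only (assumptions), NOT the real lists.
resource_bundle_sample = ["be", "es", "es_ES", "fr"]
weblate_sample = ["be", "be_Latn", "es", "fr"]

result = compare_locale_lists(resource_bundle_sample, weblate_sample)
print(result["only_in_resource_bundle"])  # ['es_ES']
print(result["only_in_weblate"])          # ['be_Latn']
print(result["in_both"])                  # ['be', 'es', 'fr']
```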

jonasraoni commented 7 months ago

The spreadsheet is a good basis for making a decision! :) I personally support using an official variant, even if it's missing one thing or another, as it's more likely to fit external systems.

ajnyga commented 6 months ago

Here is a comparison of ResourceBundle::getLocales('') and the Weblate languages.csv. The differences were far bigger than I expected. https://craft-test.online/languageComparison/

Also, in case this is easier to read: https://www.diffchecker.com/0HTZe7UH/

ajnyga commented 6 months ago

I think this comparison https://craft-test.online/languageComparison/comparison3.php gives a fairly good idea of the differences between ResourceBundle::getLocales('') and Weblate languages.csv:

For me this is a clear indication that the Weblate list would work better here, although I do understand @jonasraoni's comment about using an official variant. The important thing here is that the locales in the Weblate list are formed according to a standard.

Ideally we could try to include the missing country-specific locales OR consider hosting our own languages.csv list.

marcbria commented 6 months ago

First, thank you AJ for taking the time to go through all this and give us easy-to-digest summaries. Your patience and generosity with your time are commendable.

We've discussed it in various places, but I'll put my position on record in this thread. In short, I am convinced that whatever standard is chosen, we must guarantee three things:

The reasons? To promote equality between the different languages, to avoid representing them from a colonialist point of view, to encourage and facilitate the task of translators and to have total autonomy to decide, as a project, on a topic as relevant as the localisation of PKP applications.

That said, could we use the Weblate list as a starting point and create our own with the changes we consider appropriate?

In this sense, I suggest eliminating any reference to codes without region and (at least in the interface) I would always use the regionalised code (es -> es_ES) instead.

The proposal I am making should be accompanied by developments in line with this:

ajnyga commented 6 months ago

Just to underline:

marcbria commented 6 months ago

Apologies. I was catching up on this thread (whose title talks about UI) and I forgot to go into the metadata issue.

I don't really have a clear opinion on this part, though. Short answer: in metadata we should allow both?

I'm reasoning out loud, so if I say something stupid, let me know.

As most upstreams do not take regionality into account, I suspect that for metadata it is not so important to define it and, if the admin so wishes (I think it is something the Editor should not be able to change), we should allow bare language codes (e.g. "fr" without a region code).

But I understand that if some admin considers that it is relevant for the journal to indicate the region, from a perspective respectful of linguistic diversity, the tool should allow the region to be indicated?

In any case, I wouldn't ask about this with every submission and it should be a global parameter, to be defined once during the installation (or to be modified later by the admin... but VERY carefully).

In this sense, the code-lang selector demo you made some months ago (accompanied by a short explanation of the real impact of the decision they are making) sounds like a great solution to me, as it lets you go into as much detail as you require.

Does it make sense to you?

bozana commented 2 days ago

I am now starting to work on this issue. We decided to use Weblate locales for the submission and metadata locales (see issue ...) as well as for the rest of the system. As far as I can see, we have used the sokil library both to get the translated locale display names and to convert between different ISO codes. I think we can now use the PHP intl functions (e.g. locale_get_display_name) to get the translated locale display names, so there is no need to use sokil for this any more. However, we will still use the sokil library to convert locales into different ISO codes (mostly used in third-party services). Tagging @jonasraoni here for his opinion, because he worked on the current Locale* implementation and may see or know something I haven't seen yet :-)