Open bozana opened 7 months ago
@bozana I think the languages.csv might be the same as what you can extract from ResourceBundle::getLocales('').
I think it makes sense, I raised this concern when the locales were merged, because we would lose the "country" of the submission (@marcbria).
@jonasraoni I do not think it is the same, see this comparison: https://docs.google.com/spreadsheets/d/1EFs2cr7Tw2lwR_JVIQHcnqXg91tVdJna8NMSBdLVW_Q/edit?usp=sharing
For example, Belarusian is listed with script variants in Weblate, while in ResourceBundle::getLocales('') the script variants are missing. On the other hand, the Weblate list is missing things like es_ES (only es is mentioned), but the ResourceBundle::getLocales('') list includes it.
The spreadsheet is a good basis for making a decision! :) I personally support using an official variant, even if it's missing one thing or another, as it's more likely to fit external systems.
Here is a comparison of ResourceBundle::getLocales('') and Weblate languages.csv. The differences were far bigger than I expected. https://craft-test.online/languageComparison/
Also if this is easier to read https://www.diffchecker.com/0HTZe7UH/
I think this comparison https://craft-test.online/languageComparison/comparison3.php gives a fairly good idea of the differences between ResourceBundle::getLocales('') and Weblate languages.csv:
ResourceBundle::getLocales('') is missing a lot of locales (462) which do not even have a close alternative. This applies especially to three-letter locales.
languages.csv is missing even more locales (528), BUT most of these have an alternative.
fi exists but fi_FI does not.
dav, agq, sbp, yav
es and fr lack the country-specific locales es_ES and fr_FR, and these are maybe cases where it would make sense to have the ability to specify a country variant.
For me this is a clear indication that the Weblate list would work better here, although I do understand @jonasraoni's comment about using an official variant. The important thing here is that the Weblate list locales are formed according to a standard.
Ideally we could try to include the missing country-specific locales OR consider hosting our own languages.csv list.
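The kind of comparison summarised above can be sketched roughly as follows. This is only an illustration, not the script behind the linked comparison pages: the function names are hypothetical, the sample sets are made up for the example (`xxx` is a placeholder code), and "close alternative" is assumed here to mean "another code with the same base language exists in the target list".

```python
def base_language(code: str) -> str:
    """Return the language part of a locale code, e.g. 'es_ES' -> 'es', 'sr@latin' -> 'sr'."""
    for separator in ("_", "@", "-"):
        code = code.split(separator, 1)[0]
    return code


def missing_from(target: set[str], other: set[str]) -> tuple[list[str], list[str]]:
    """List the codes in `other` that `target` lacks.

    Returns two sorted lists: codes for which `target` has a same-language
    alternative, and codes for which it has none (assumed definition).
    """
    target_languages = {base_language(code) for code in target}
    with_alternative, without_alternative = [], []
    for code in sorted(other - target):
        if base_language(code) in target_languages:
            with_alternative.append(code)
        else:
            without_alternative.append(code)
    return with_alternative, without_alternative


# Illustrative sample data only, NOT extracted from either real list.
icu_sample = {"be", "es", "es_ES", "fi", "fi_FI"}
weblate_sample = {"be", "be_Latn", "es", "fi", "xxx"}

# Codes the Weblate-style sample lacks: all have a base-language alternative.
print(missing_from(weblate_sample, icu_sample))
# Codes the ICU-style sample lacks: a script variant and a code with no alternative.
print(missing_from(icu_sample, weblate_sample))
```

With real data, the two input sets would come from the output of ResourceBundle::getLocales('') and from the first column of languages.csv respectively.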
First, thank you AJ for taking the time to go through all this and give us easy-to-digest summaries. Your patience and generosity with your time are commendable.
We've discussed it in various places, but I'll put my position on record in this thread. In short, I am convinced that whatever standard is chosen, we must guarantee three things:
The reasons? To promote equality between the different languages, to avoid representing them from a colonialist point of view, to encourage and facilitate the task of translators and to have total autonomy to decide, as a project, on a topic as relevant as the localisation of PKP applications.
That said, could we use the Weblate list as a starting point and create our own with the changes we consider appropriate?
In this sense, I suggest eliminating any reference to codes without region and (at least in the interface) I would always use the regionalised code (es -> es_ES) instead.
The proposal I am making should be accompanied by developments in line with this:
Just to underline:
Apologies. I was catching up on this thread (whose title talks about UI) and I forgot to go into the metadata issue.
Although I don't really have a clear opinion on this part. Short answer: In metadata we should allow both?
I reason out loud and if I say something stupid, you let me know.
As most upstreams do not take regionality into account, I suspect that for metadata it is not so important to define it and, if the admin so wishes (I think it is something the Editor should not be able to change), we should allow languages (i.e. "fr" without region code).
But I understand that if some admin considers that it is relevant for the journal to indicate the region, from a perspective respectful of linguistic diversity, the tool should allow the region to be indicated?
In any case, I wouldn't ask about this with every submission and it should be a global parameter, to be defined once during the installation (or to be modified later by the admin... but VERY carefully).
In this sense, the code-lang selector demo you made some months ago (accompanied by a short explanation of the real impact of the decision they are making) sounds like a great solution to me, as it lets you go into as much detail as you require.
Does it make sense to you?
I am now starting to work on this issue. We decided to use Weblate locales for the submission and metadata locales (see issue ...) as well as for the rest of the system. As far as I can see, we have used the sokil library to get the translated locale display names, as well as for conversion between different ISO codes. I think that we can now use the PHP intl functions (e.g. locale_get_display_name) to get the translated locale display names, so there is no need to use sokil for this any more. However, we will still use the sokil library to convert locales into different ISO codes (mostly used in third-party services). Tagging @jonasraoni here for his opinion, because he worked on the current Locale* implementation and may see or know something I haven't seen yet :-)
In the CRAFT OA project, in this issue https://github.com/pkp/pkp-lib/issues/9425, the submission locales will be separated from the UI locales. The decision was made to use Weblate locales (see https://github.com/WeblateOrg/language-data/blob/main/languages.csv) for submission locales and to also align the UI locales.
The CRAFT OA project has identified the following mapping between the current UI locales and Weblate locales:
'be@cyrillic' => 'be',
'bs' => 'bs_Latn',
'fr_FR' => 'fr',
'nb' => 'nb_NO',
'sr@cyrillic' => 'sr_Cyrl',
'sr@latin' => 'sr_Latn',
'uz@cyrillic' => 'uz',
'uz@latin' => 'uz_Latn',
'zh_CN' => 'zh_Hans',
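As a minimal sketch of how such a mapping might be applied during a migration: the lookup below copies the pairs quoted above, and the function name `to_weblate_locale` is hypothetical. It assumes that any locale not listed keeps its current code, which the thread does not state explicitly, and the list quoted above may not be complete.

```python
# Pairs copied from the mapping quoted above; the list may be incomplete.
LEGACY_TO_WEBLATE = {
    "be@cyrillic": "be",
    "bs": "bs_Latn",
    "fr_FR": "fr",
    "nb": "nb_NO",
    "sr@cyrillic": "sr_Cyrl",
    "sr@latin": "sr_Latn",
    "uz@cyrillic": "uz",
    "uz@latin": "uz_Latn",
    "zh_CN": "zh_Hans",
}


def to_weblate_locale(locale: str) -> str:
    """Return the Weblate code for a current UI locale.

    Assumption: locales without an entry in the mapping are left unchanged.
    """
    return LEGACY_TO_WEBLATE.get(locale, locale)


for legacy in ("sr@latin", "zh_CN", "de"):
    print(legacy, "->", to_weblate_locale(legacy))
```

Keeping the mapping as plain data makes it easy to review against the Weblate list and to extend if further mismatches turn up.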