WDscholia / scholia

Wikidata-based scholarly profiles
https://scholia.toolforge.org

Internationalization (i18n) and localization (l10n) #663

Open fnielsen opened 5 years ago

fnielsen commented 5 years ago

Internationalization (i18n) and localization (l10n). Flask has flask-jsonlocale and Flask-Babel. Flask-Babel seems more mature.

It is unclear how efficient flask-jsonlocale is. It loads the JSON translation files on each translation call, and it has no cache.
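
For illustration, the missing cache could be approximated in plain Python with `functools.lru_cache`. This is a minimal sketch and not flask-jsonlocale's code; the `i18n/<lang>.json` layout and the names `load_messages` and `translate` are assumptions.

```python
import json
from functools import lru_cache
from pathlib import Path

# Hypothetical layout: one JSON file per language, e.g. i18n/en.json.
I18N_DIR = Path("i18n")


@lru_cache(maxsize=None)
def load_messages(lang, directory=I18N_DIR):
    """Read and parse <directory>/<lang>.json once; repeated calls hit the cache."""
    path = Path(directory) / f"{lang}.json"
    if not path.is_file():
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def translate(key, lang, directory=I18N_DIR):
    """Look up a message key, falling back to the key itself if it is missing."""
    return load_messages(lang, directory).get(key, key)
```

With this shape the file system is touched once per language and directory rather than once per translated string; the trade-off is that edited translation files are only picked up after `load_messages.cache_clear()` or a restart.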

fnielsen commented 5 years ago

The issue with efficiency of flask-jsonlocale has been posed at https://github.com/urbanecm/flask-jsonlocale/issues/1

egonw commented 5 years ago

Does either of the two support .po files (i.e. gettext)? If so, we can take advantage of https://translations.launchpad.net/

GerardMeijssen commented 5 years ago

Please check with Siebrand if flask-jsonlocale or Flask-Babel are supported at Translatewiki.net

Daniel-Mietchen commented 4 years ago

There are two e-scholarship applications to work on this task:

Daniel-Mietchen commented 3 years ago

The Kinyarwanda version is now available via https://github.com/jenzzly/scholia/tree/kinyarwanda .

Here is what it looks like for the Scholia homepage: [image: Screenshot_2021-02-15 Scholia] and here is the /publisher/Q233358 page (i.e. PLOS), with essentially no localization except for the navigation menu on the top: [image: Screenshot_2021-02-15 Scholia(1)]

Another thing that I am looking into is how to

Daniel-Mietchen commented 3 years ago

Also pinging @amire80 who might have ideas on how to address this and how to integrate it with existing internationalization workflows like translatewiki.

fnielsen commented 3 years ago

I experimented with this on Ordia during a hackathon. No approach was fully satisfactory, because we have text coming in from SPARQL, both headers and values. The approach I thought was best was the MediaWiki one, which should be supported by translatewiki: https://github.com/wikimedia/jquery.i18n

amire80 commented 3 years ago

Hi! Thanks for the ping, @Daniel-Mietchen :)

I'm not familiar with your project, but pretty much all open source projects are welcome at translatewiki, especially if they are related to the Wikimedia world in any way. You can read more about getting started here: https://translatewiki.net/wiki/Translating:New_project

It looks like you are a web app, so perhaps you can use the https://github.com/wikimedia/banana-i18n localization library. It is used by several Wikimedia-related tools, it will be especially easy to integrate with translatewiki, and also easy for the current volunteer translators to start translating without having to learn any new syntax.

(Also, as general advice, avoid Gettext and PO if you can... they are pretty outdated and inconvenient. translatewiki supports it, but they are too strongly oriented at files and source code, and don't work as nicely with translating through a web interface.)

GerardMeijssen commented 3 years ago

Hoi, This is SO cool.. I take it you make use of translatewiki.net ... It is where all the internationalisations and localisations for the WMF happen.

Do you have a plan to get the attention of Scholia in other languages... Of particular importance are Russian and Chinese. Thanks, GerardM


jenzzly commented 3 years ago

Thank you for pointing us in the right direction. Has anyone already implemented something similar with banana-i18n? If so, we would love to see your repository.

fnielsen commented 3 years ago

https://translatewiki.net/

jdcaballerov commented 3 years ago

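I'm starting to work on this feature using banana and would like to know how you plan to select the language. We have several options:

  • Cookies (not recommended): A cookie is set and changed by the user using a menu or some other mechanism.
  • Language domain extension: scholia.pt
  • Language in subdomain (wiki uses this): en.scholia.toolforge.org, es.scholia.toolforge.org
  • Language in URL path: scholia.toolforge.org/es, scholia.toolforge.org/en

What do you have in mind for language switching?

  • A sidebar as on Wikipedia, a menu on top with a dropdown.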
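
To make the language-selection question concrete, here is a rough, framework-independent sketch of one possible precedence: URL path first, then a cookie, then the browser's `Accept-Language` header, then a default. The supported language list, the cookie name `lang`, and all function names are assumptions for illustration; the subdomain option is not covered.

```python
SUPPORTED = ("en", "es", "rw")  # hypothetical set of translated languages
DEFAULT = "en"


def language_from_path(path, supported=SUPPORTED):
    """Return the language if the first path segment is a supported code."""
    first = path.strip("/").split("/", 1)[0].lower()
    return first if first in supported else None


def language_from_header(accept_language, supported=SUPPORTED):
    """Pick the supported language with the highest q-value in Accept-Language."""
    candidates = []
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        quality = 1.0
        if params.strip().startswith("q="):
            try:
                quality = float(params.strip()[2:])
            except ValueError:
                continue  # skip entries with a malformed q-value
        primary = tag.strip().split("-", 1)[0].lower()
        if primary in supported:
            candidates.append((quality, primary))
    # max() keeps the first of several equally weighted entries.
    return max(candidates, key=lambda c: c[0])[1] if candidates else None


def pick_language(path="/", cookies=None, accept_language=""):
    """Resolve the language: URL path, then cookie, then header, then default."""
    cookie_lang = (cookies or {}).get("lang")
    return (
        language_from_path(path)
        or (cookie_lang if cookie_lang in SUPPORTED else None)
        or language_from_header(accept_language)
        or DEFAULT
    )
```

For example, `pick_language("/es/topic/Q1", {"lang": "rw"}, "en")` resolves to `"es"` because the path wins, while the same call with the path `/topic/Q1` resolves to `"rw"` from the cookie.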

GerardMeijssen commented 3 years ago

Hoi, I seriously wonder why. Translatewiki.net is where all Wikimedia projects are localised. Where there are some 300 languages being localised. What does any other project bring us? Thanks, GerardM

On Tue, 3 Aug 2021 at 18:12, jdcaballerov @.***> wrote:

I'm starting to work in this feature using banana and will like to know how you plan to select the language ? We have several options:

  • Cookies (not recommended): A cookie is set and changed by the user using a menu or some other mechanism.
  • Language domain extension: scholia.pt
  • Language in subdomain (wiki uses this): en.scholia.toolforge.org, es.scholia.toolforge.org
  • Language in url path: scholia.toolforge.org/es, scholia.toolforge.org/en

What do you have in mind for language switching ?

  • Side bar as wikipedia, a menu on top with a dropdown.


jdcaballerov commented 3 years ago

@GerardMeijssen @Daniel-Mietchen @fnielsen @egonw I am not suggesting to upload the language strings to a different service to get translated, but asking for the desired implementation details of the internationalization at the source code.

As far as I know, translatewiki is a place where translators pick up strings to translate, review, etc. [image: Screenshot from 2021-08-03 14-14-14]

What I'm asking about is the desired implementation details. Let's say I'm going to start supporting internationalization in the file scholia/app/templates/topic.html

[image: Screenshot from 2021-08-03 14-16-02]

As you can see, several strings are hardcoded in the template, the first being <h1 id=h1>Topic</h1>. To translate those strings into a selected language one needs to:

  1. have a mechanism to tell the Flask server which language to use (URL, cookie, subdomain, etc.)
  2. load a language file (this is what is translated by the translatewiki.net community, as I understand it)
  3. replace each string with its translation for the selected language and locale.

To replace the strings, one possible solution in Flask is the banana-i18n Python package, which is compatible with the message format discussed above. At the template level we would then have something like this for the example at hand:

<h1 id=h1>{{ _('topic-header') }}</h1>

The same replacement is needed for every string in the templates, JavaScript, etc.

Please correct me if I am misunderstanding the issue or missing something.

fnielsen commented 3 years ago

@GerardMeijssen I don't understand your post. The issue is not whether to use translatewiki.net. We plan to do that. The only thing that comes to my mind is a suggestion to use Wikidata lexemes.

fnielsen commented 3 years ago

(For my own record: When I experimented with i18n in https://github.com/fnielsen/ordia/ during a Wikimedia Hackathon, I created branches:

fnielsen commented 3 years ago

One of the options I considered was jquery.i18n. If I am correct, the translation then happens in JavaScript rather than in Jinja2/Flask/Python.

But I suppose that this will not solve the issue of how we are going to mark which items should be translated. jquery.i18n uses the id attribute as far as I can see.

fnielsen commented 3 years ago

Yet another problem is whether and how we are going to change the SPARQL results. The values returned can be changed by adding a language specification to the SPARQL query. However, the column headers might be more difficult.
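
A sketch of what such a language specification could look like, assuming the queries use the Wikidata Query Service label service and that a `{languages}` placeholder is put into the query text. The placeholder, the helper name, and the sample query are assumptions, not existing Scholia code.

```python
def with_label_language(query_template, lang, fallback="en"):
    """Fill the label-service language list, e.g. "rw,en", into a query template."""
    # Reject anything that is not a plain language code before it reaches SPARQL.
    if not lang.replace("-", "").isalnum():
        raise ValueError("unexpected language code: {!r}".format(lang))
    languages = lang if lang == fallback else "{},{}".format(lang, fallback)
    return query_template.replace("{languages}", languages)


# Hypothetical query; only the wikibase:label SERVICE clause matters here.
QUERY_TEMPLATE = """
SELECT ?author ?authorLabel WHERE {
  ?work wdt:P50 ?author .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "{languages}" . }
}
LIMIT 10
"""

query = with_label_language(QUERY_TEMPLATE, "rw")
```

This only affects the `?authorLabel` values. The column header is still derived from the SPARQL variable name, which is why the headers need a separate mechanism.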

fnielsen commented 3 years ago

I see there are quite a few architectural choices that we have to make, and that might also be the reason why I have not gone into the lion's den before and have only attempted some experiments in Ordia.

I suppose we can hook into the JavaScript function that we use with the dataTables library to handle translation of the SPARQL result part. The _( ) function does not seem able to handle that. So a JavaScript and id attribute approach would probably be better?

fnielsen commented 3 years ago

If we go by jQuery.i18n/id, then we already have a lot of ids set up on the pages that I hope would be useful. I see that the <h1 id="h1"> would be a problem, but e.g. "recently-published-works-header" could be used for the jQuery.i18n translation, I hope. I am hoping that @jdcaballerov could experiment and tell us whether there are any showstoppers for that approach.

jdcaballerov commented 3 years ago

Internationalization PLAN under development

1) Choose a mechanism to select the language
2) Translate static content
3) Translate content from Wikidata


1) We've been exploring the use of a cookie that is only set when the language changes; otherwise the language defaults to en. Once it is set, we define a global in Flask so that the language code is available in the templates.

image

This is then made available in every template by setting a JS variable in base.html:

 var langCode = "{{g.lang_code}}"

We think this solution is the most flexible and complies with all of the requirements.

2) Static content in the templates

This requires a fair amount of manual or semi-automated work (a regex to add data-i18n="message-key") in many places.

We can use jquery-i18n data attributes.

There are 3 cases: a) the h1, b) divs with no id, c) divs with id.

a) The h1 is replaced in scholia/app/templates/base.html

and is currently hardcoded as follows: image We can get the language from the global and test whether there is content, otherwise fall back to en.

b) Divs with no id: this requires manual work (see the h2 elements). Example from scholia/app/templates/authors.html:

{% block page_content %}

<h1 id="h1">Authors</h1>

<div id="intro"></div>

<table class="table table-hover" id="list-of-authors"></table>

<h2>List of jointly authored works</h2>

<table class="table table-hover" id="list-of-jointly-authored-works"></table>

<h2>Number of works per publication year</h2>

c) Divs with id: use a regex to get the id string and add the HTML attribute data-i18n="message-key-from-id-string".

3) This has to be addressed on a query-by-query basis, adding the language and testing. We are starting with printer.html as suggested.
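
The regex step in case c) above could be sketched as follows. This is a rough draft under two assumptions: the id is used unchanged as the message key, and only `h1`-`h6` and `div` tags with a double-quoted `id` are rewritten. `add_i18n_attribute` is a hypothetical helper name.

```python
import re

# Match an opening h1-h6 or div tag that carries a double-quoted id attribute.
TAG_WITH_ID = re.compile(r'<(h[1-6]|div)\b([^>]*?\s)id="([^"]+)"([^>]*)>')


def add_i18n_attribute(html):
    """Add data-i18n="<id>" to matching tags that do not already have one."""

    def replace(match):
        tag, before, element_id, after = match.groups()
        if "data-i18n" in before or "data-i18n" in after:
            return match.group(0)  # already annotated, leave untouched
        return '<{}{}id="{}" data-i18n="{}"{}>'.format(
            tag, before, element_id, element_id, after
        )

    return TAG_WITH_ID.sub(replace, html)
```

Running it twice gives the same result as running it once, so it can be re-applied after new templates are added. Elements without an id, such as the plain `<h2>` headings in case b), are left unchanged and still need manual keys.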

Nikerabbit commented 3 years ago

I haven't read carefully all the comments in this task, so please ignore my comments if they are not relevant: Per my understanding jquery.i18n is deprecated in favor of banana-i18n (which doesn't depend on jquery but offers almost the same set of features).

As far as I can see, you have correctly identified the challenges, such as how to select language, how to synchronize language selection between front end and back end, differences between user interface and data/content translation.

For language selection we (Wikimedia) have jquery.uls. It's a bit old but it is built to handle hundreds of languages. This might be useful if you want to be able to choose between all the languages available in Wikidata even if you don't have the tool's user interface translated into that language.

A good practice is to sanitize translations like other untrusted input. This means escaping them appropriately in templates to avoid injection of HTML. Should something more complicated be needed, some kind of parsed mark-up layer can be used. I believe banana-i18n (and jquery.i18n) provide some support for links and basic mark-up. Anyway, more importantly they provide support for handling plurals, so you can show 1 car, 2 cars, etc. correctly in all languages where the rules are different from English.
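
To illustrate both points, here is a small plain-Python sketch that picks a plural variant and escapes the translation before it reaches the page. It is not how banana-i18n or jquery.i18n implement this: the plural rules are simplified, hand-written rules for non-negative integers, and the catalog and `render_plural` are hypothetical.

```python
from html import escape

# Simplified, hand-written plural rules for non-negative integers (assumption:
# a real library would take these from CLDR data instead).
PLURAL_RULES = {
    "en": lambda n: "one" if n == 1 else "other",
    "ru": lambda n: (
        "one" if n % 10 == 1 and n % 100 != 11
        else "few" if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14
        else "many"
    ),
}

# Hypothetical catalog: one message variant per plural category.
MESSAGES = {
    "en": {"cars": {"one": "{count} car", "other": "{count} cars"}},
    "ru": {"cars": {"one": "{count} автомобиль",
                    "few": "{count} автомобиля",
                    "many": "{count} автомобилей"}},
}


def render_plural(key, lang, count):
    """Pick the plural variant for count and return HTML-escaped text."""
    variant = MESSAGES[lang][key][PLURAL_RULES[lang](count)]
    # Treat the translation as untrusted input: escape it before substitution.
    return escape(variant).format(count=count)
```

A translation that contains markup, such as `<b>{count}</b> cars`, comes out with the tags escaped rather than interpreted, which is the behaviour described above.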

Happy to help if you have questions related to translatewiki.net or i18n in general.