sebsauvage / Shaarli

The personal, minimalist, super-fast, no-database delicious clone.
http://sebsauvage.net/wiki/doku.php?id=php:shaarli

Going multilingual #18

Open ghost opened 11 years ago

ghost commented 11 years ago

I guess the procedure would be: 1) Have a languages/ folder with "mylanguage.php" files and make Shaarli use them. 2) Use a translation platform so that even people unfamiliar with GitHub or coding can translate if they want to. Of course that's optional, but I found it useful when translating Flattr, for example. I don't have a favorite method for maintaining a software translation in many languages, so I have none to suggest here, but at least it's on the list.

sebsauvage commented 11 years ago

Right. Some people suggested .po files, even hosted on Launchpad, but I have not decided on a translation system.

ghost commented 11 years ago

In my experience, translations on Launchpad are a pain to deal with (no easy access to the final .po/.pot files, slow interface). Maybe you should give Transifex a try. Here is a project I used to work on; just register and see for yourself whether it fits your needs. A self-hosted alternative would be Weblate.

tontof commented 11 years ago

For my own scripts I have also looked into how to use .po/.mo files, but since I wanted to keep a single file for the end user, "compiling" them was difficult. I found this PHP script, which converts a .po file into a binary .mo file and may be helpful: https://github.com/josscrowcroft/php.mo/blob/master/php-mo.php

For now, I think I will use simple arrays as explained here, which seems to be the simplest option for my needs: http://commons.oreilly.com/wiki/index.php/PHP_Cookbook/Internationalization_and_Localization#Localizing_Text_Messages
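For readers unfamiliar with the array approach mentioned above, here is a minimal sketch of what it could look like. The `$catalogs` array, the `t()` helper and the sample French strings are illustrative assumptions; they are not taken from Shaarli or from the linked cookbook page.

```php
<?php
// Hypothetical catalogs: one array per language, keyed by the English source string.
$catalogs = array(
    'fr' => array(
        'Are you sure you want to delete this link?' => 'Voulez-vous vraiment supprimer ce lien ?',
        'Page generated in %s seconds' => 'Page générée en %s secondes',
    ),
);

// t() is an illustrative helper, not a Shaarli function.
// It returns the translation when one exists and the source string otherwise.
function t($text, $lang, array $catalogs)
{
    if (isset($catalogs[$lang][$text])) {
        return $catalogs[$lang][$text];
    }
    return $text;
}

// A translated string that contains a variable.
echo sprintf(t('Page generated in %s seconds', 'fr', $catalogs), '0.0123'), "\n";
// No German catalog exists, so the English source string is returned unchanged.
echo t('Are you sure you want to delete this link?', 'de', $catalogs), "\n";
```

Everything stays in plain PHP, so the catalog could live in the same single file as the application. The trade-off is that the format is not one that translation platforms import directly.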

e2jk commented 11 years ago

Please, please, don't reinvent the wheel by using some kind of array system. This will prevent you from using translation tools such as Transifex or Launchpad, which will make life more difficult for your translators (most of whom are not necessarily developers). Using standard formats (such as gettext's .po) makes it easier for non-developers to contribute.

I collaborate on both Transifex and Launchpad. Transifex is more user-friendly, but one nice feature of Launchpad is that translations from other projects are also available, which makes it faster to translate common sentences and words. Although I'm not a huge fan of Launchpad, I want to correct what nodiscc mentioned: exporting .po and .pot files is absolutely possible. You click one link, and 2 minutes later you have a tarball in your email with everything you need. This help page explains the process: https://help.launchpad.net/Translations/YourProject/Exports#Requesting_a_one-off_download You can also have Launchpad automatically commit changes to a bzr branch, which is helpful if you ever need to revert vandalism... Info is at the top of the previous link.

I sent a patch in September 2011 (based on Shaarli 0.0.9 beta!) that added gettext support (and translated just one string). I assume the patch won't apply cleanly on version 40, but the principle hasn't changed. My patch came with a rather long email, so I'll put that in a separate post below. Hope this can be helpful!

e2jk commented 11 years ago

Sorry, the original mail was written in French. Code below.

Date: 19 September 2011 20:17
Subject: Internationalisation of Shaarli

Hi Seb,

I've already posted a comment on the Shaarli page of your wiki [0] about putting Shaarli under version control. Right now I'm playing around with your code to internationalize it, and I've reached the point where the two sentences at the bottom are translatable: "The personal, minimalist, super-fast, no-database delicious clone. By sebsauvage.net" and "Who gives a shit that this page was generated in %s seconds?". As a test, everything works fine. For now I have hard-coded the language to French. I've started hacking something together to detect the visitor's language automatically (based on an example found on the net), but that still has to be reconciled with the list of available languages. In short, I haven't exactly worn myself out yet ;)

The reason I'm sending you this mail is to ask your opinion on the path you intend to follow. As I see it, you have 2 possibilities for internationalizing Shaarli:

If you want to use GNU Gettext, it would give a file layout like this:

./index.php
./i18n
./i18n/shaarli.pot
./i18n/fr.po
./i18n/fr/
./i18n/fr/LC_MESSAGES
./i18n/fr/LC_MESSAGES/messages.mo

So, to sum up: next to the index.php script there would be an i18n folder containing the translation template (the .pot file), one file per language (the .po file, the source of the translations), plus one xx/LC_MESSAGES folder per language containing a messages.mo file, which is the compiled result of the .po source file.

In your repository (the source code), only the .pot and .po files are needed. They are not needed in the .zip you publish for each new release; there it is enough to include the xx/LC_MESSAGES/messages.mo files.

This approach would therefore mean that your users have to upload the ./i18n folder (you can rename it if you like) next to index.php. So you would no longer have a single file, but in my opinion you have a lot to gain from using the standard tools rather than inventing your own format...

To give you an idea, I'm attaching a zip with my modifications. The 2 translatable strings also show you how to handle a string that contains a variable. If you add new strings to translate, run this command to update your .pot file (the template) and replace the French translation source file (careful, you will lose all existing translations at that point):

xgettext --default-domain=shaarli -o i18n/shaarli.pot index.php && cp i18n/shaarli.pot i18n/fr.po

Open the file ./i18n/fr.po and translate the strings. Once the translation is finished, compile the .mo file:

msgfmt -o i18n/fr/LC_MESSAGES/messages.mo i18n/fr.po

Reload the page, and you will see your new string translated!

If you decide to go with Gettext, I could write you a small script that updates the template and the source files while keeping what has already been translated. I've already done that for a Django project of mine, in Python [1]. It would need to be adapted to be more specific to your project. Speaking of which, I've opened a question on Launchpad to ask whether others have already had to develop such a script.

But anyway, I don't want to go too far, in case you don't want to have more than one file ;) I've attached a patch so you can better see what I added/changed (I was able to do that thanks to the bzr branch ;) )

Right, this mail is already a thousand times too long! Tell me what you think. See you, Emilien

P.S.: You'll have to excuse me, but halfway through the mail I got fed up with correcting the accents (I use a QWERTY keyboard)...

[0] http://sebsauvage.net/wiki/doku.php?id=php:shaarli#comment_e0f627dfc9884f4a264b2ebcaf50c8d3
[1] http://bazaar.launchpad.net/~itt-devs/issuetrackertracker/main/view/head:/po/translations-manager.py

Unfortunately I can't attach my patch and tarball with translation, so I've pasted it below:

EDIT: pasting the patch file didn't look good... EDIT 2: putting it between triple backticks does the trick:

emilien@bohr:~/devel/shaarli$ bzr cdiff
=== modified file 'index.php'
--- index.php   2011-09-16 22:24:10 +0000
+++ index.php   2011-09-19 17:50:37 +0000
@@ -1,4 +1,5 @@
 <?php
+
 // Shaarli 0.0.9 beta - Shaare your links...
 // The personal, minimalist, super-fast, no-database delicious clone. By sebsauvage.net
 // http://sebsauvage.net/wiki/doku.php?id=php:shaarli
@@ -18,6 +19,41 @@
 checkphpversion();

 // -----------------------------------------------------------------------------------------------
+// Translation stuff
+//Based on http://www.thefutureoftheweb.com/blog/use-accept-language-header
+$langs = array();
+
+if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
+    // break up string into pieces (languages and q factors)
+    preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', $_SERVER['HTTP_ACCEPT_LANGUAGE'], $lang_parse);
+    if (count($lang_parse[1])) {
+        // create a list like "en" => 0.8
+        $langs = array_combine($lang_parse[1], $lang_parse[4]);
+        // set default to 1 for any without q factor
+        foreach ($langs as $lang => $val) {
+            if ($val === '') $langs[$lang] = 1;
+        }
+        // sort list based on value    
+        arsort($langs, SORT_NUMERIC);
+    }
+}
+
+echo "<pre>";
+var_dump($langs);
+echo "</pre>";
+
+foreach ($langs as $lang => $val) {
+    // Detect for which of the wanted languages we have a translation
+}
+
+// Not using setlocale, see http://nl.php.net/manual/en/ref.gettext.php#68853
+$locale = 'fr'; // Pretend this came from the Accept-Language header
+$locale_dir = './i18n'; // your .po and .mo files should be at $locale_dir/$locale/LC_MESSAGES/messages.{po,mo}
+putenv("LANGUAGE=$locale");
+bindtextdomain('messages', $locale_dir);
+textdomain('messages');
+
+// -----------------------------------------------------------------------------------------------
 // Program config (touch at your own risks !)
 error_reporting(E_ALL^E_WARNING);  // See all error except warnings.
 //error_reporting(-1); // See all errors (for debugging only)
@@ -1078,7 +1114,14 @@

 HTML;
     $exectime = round(microtime(true)-$STARTTIME,4);
-    echo '<div id="footer"><b><a href="http://sebsauvage.net/wiki/doku.php?id=php:shaarli">Shaarli '.shaarli_version.'</a></b> - The personal, minimalist, super-fast, no-database delicious clone. By sebsauvage.net<br>Who gives a shit that this page was generated in '.$exectime.' seconds&nbsp;?</div>';
+    //echo '<div id="footer"><b><a href="http://sebsauvage.net/wiki/doku.php?id=php:shaarli">Shaarli '.shaarli_version.'</a></b> - The personal, minimalist, super-fast, no-database delicious clone. By sebsauvage.net<br>Who gives a shit that this page was generated in '.$exectime.' seconds&nbsp;?</div>';
+
+    echo '<div id="footer"><b><a href="http://sebsauvage.net/wiki/doku.php?id=php:shaarli">Shaarli '.shaarli_version.'</a></b> - ';
+    echo _("The personal, minimalist, super-fast, no-database delicious clone. By sebsauvage.net");
+    echo '<br>';
+    echo sprintf(_("Who gives a shit that this page was generated in %s seconds?"), $exectime);
+    echo '</div>';
+
     if (isLoggedIn()) echo '<script language="JavaScript">function confirmDeleteLink() { var agree=confirm("Are you sure you want to delete this link ?"); if (agree) return true ; else return false ; }</script>';
     echo '</body></html>';
 }
@@ -1125,4 +1168,4 @@
 if (!isset($_SESSION['LINKS_PER_PAGE'])) $_SESSION['LINKS_PER_PAGE']=LINKS_PER_PAGE;
 if (startswith($_SERVER["QUERY_STRING"],'do=rss')) { showRSS(); exit; }
 renderPage();
-?>
\ No newline at end of file
+?>
e2jk commented 11 years ago

Looks like pasting the diff file uncovered some hidden "feature" in GitHub

[Screenshot: PrtScr capture]

Just contact me and I'll forward the original email, with the patch and sample .pot/.po/.mo files.

Leomaradan commented 11 years ago

For translation, I wrote myself a small module (given how minor it is, it doesn't have a repository yet; it is used here, for example: https://github.com/Leomaradan/URLess).

It is an i18n class with a getText function, plus one .ini file per language/region.

It handles IETF-style language tags, for example fr-CH for Swiss French. If the file fr-ch.ini (the language code, but in lowercase) does not exist, it looks for fr.ini. Finally, if fr.ini is not found, it uses a configurable default language (and if that language file doesn't exist either... well, too bad :p)
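The lookup order described above (fr-ch.ini, then fr.ini, then a configured default) could be sketched roughly as follows. This is not the actual module from URLess; the function names `iniCandidates()` and `loadIniCatalog()` are assumptions made for illustration. Only `parse_ini_file()` is a real PHP function.

```php
<?php
// Candidate file names for a tag such as "fr-CH": "fr-ch.ini", then "fr.ini",
// then the configured default language. Hypothetical helper.
function iniCandidates($tag, $default)
{
    $tag = strtolower($tag);
    $candidates = array($tag . '.ini');
    $dash = strpos($tag, '-');
    if ($dash !== false) {
        // Regional tag: also try the base language.
        $candidates[] = substr($tag, 0, $dash) . '.ini';
    }
    $candidates[] = strtolower($default) . '.ini';
    return array_values(array_unique($candidates));
}

// Load the first candidate that exists in $dir. An empty array means no
// language file was found, so strings would stay untranslated.
function loadIniCatalog($dir, $tag, $default)
{
    foreach (iniCandidates($tag, $default) as $file) {
        $path = $dir . '/' . $file;
        if (is_file($path)) {
            $parsed = parse_ini_file($path);
            return $parsed === false ? array() : $parsed;
        }
    }
    return array();
}

print_r(iniCandidates('fr-CH', 'en'));
```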

The advantage of .ini is how easy it is to handle. You can give the file to be translated even to non-coders, no specific tool is needed to generate it, etc.

Worth considering, but it seems to me a relatively simple solution to use, one that could fit the spirit of Shaarli.

e2jk commented 11 years ago

This is exactly what I mean by reinventing the wheel... Existing standards have been pounded upon by a large number of people much smarter than we are ;)

Just as an example: does this .ini solution handle plurals, for instance? ("Il y a un lien avec le tag blabla" versus "Il y a 5 liens avec le tag blabla")
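For comparison, gettext handles this case through `ngettext()`, which picks the right form for a given count according to the plural rule declared in the catalog's Plural-Forms header. The sketch below is only an illustration: the `linksWithTag()` helper and the English source strings are assumptions, and the fallback definition exists solely so the example runs where the gettext extension is not installed.

```php
<?php
// Fallback so the sketch runs without the gettext extension. It applies the
// English rule (singular only when n == 1), which is also what ngettext()
// returns when no translation catalog is loaded.
if (!function_exists('ngettext')) {
    function ngettext($singular, $plural, $n)
    {
        return $n == 1 ? $singular : $plural;
    }
}

// Hypothetical helper: build the "links with tag" sentence for any count.
function linksWithTag($n, $tag)
{
    $format = ngettext(
        'There is %d link with the tag %s',
        'There are %d links with the tag %s',
        $n
    );
    return sprintf($format, $n, $tag);
}

echo linksWithTag(1, 'blabla'), "\n";
echo linksWithTag(5, 'blabla'), "\n";
```

With a French catalog loaded, the same call would return the matching French singular or plural form, with no change to the calling code.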

Also, if you use an online platform for translation, your non-coders don't even need to download a file, find it on their machine, edit it, send it again via email, etc. With an online platform you're opening yourself to "drive-by" contributions, one of the strengths of open development...

Using standard/existing tools is the same reason why Shaarli is using a templating engine, PHP's session/cookie handling, etc. Of course, everything could be written from scratch, but then you have to fish out the bugs, maintain the code in the long run, etc.

Leomaradan commented 11 years ago

IMO, an online platform is a bad idea: it makes Shaarli dependent on external tools.

Right now, Shaarli has little code and only vital, simple features, and that's why it is good. For example, the template system, RainTPL, is simple and not as complete as Smarty, but it does its job.

Is a complex i18n system really needed? Or could a basic solution with literal translation be used?

My .ini solution provides only direct translation (including with variables), but that may be sufficient.

(For your example, using parentheses is possible: "Il y a X lien(s) avec le tag blabla" ;) )

e2jk commented 11 years ago

An online tool is just an extra, easier-to-use platform, not a dependency. You can still send the .po file to somebody else, it's just plain text (one line in English, the following line in the other language), which you can edit in special programs (like poEdit), or even just with a plain-old text editor. So with a standard .po file, you have all the features of the .ini approach, with the benefit of not having to create your internationalization framework from scratch...

Look at my patch a few posts up: the first few lines are language detection (which you'd have to do as well with .ini), and the real "framework" code is just 3 lines:

putenv("LANGUAGE=$locale");
bindtextdomain('messages', $locale_dir);
textdomain('messages');

Then you wrap your strings with _( and ) (also needed with .ini), and you're done.
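The one piece the patch leaves open is the empty foreach loop that should match the visitor's preferred languages against the translations that actually exist. A possible way to fill it in is sketched below, under the assumption that compiled catalogs live at $locale_dir/xx/LC_MESSAGES/messages.mo as in the patch. `availableLocales()` and `pickLocale()` are hypothetical names, not part of the patch or of Shaarli.

```php
<?php
// List the languages that have a compiled catalog under $localeDir,
// i.e. every "xx" for which $localeDir/xx/LC_MESSAGES/messages.mo exists.
function availableLocales($localeDir)
{
    $files = glob($localeDir . '/*/LC_MESSAGES/messages.mo');
    if ($files === false) {
        $files = array();
    }
    $locales = array();
    foreach ($files as $file) {
        // Two levels up from messages.mo is the language folder.
        $locales[] = strtolower(basename(dirname(dirname($file))));
    }
    return $locales;
}

// Pick the first preferred language that is available. $langs maps language
// tags to q-values and is assumed to be sorted by preference already, as the
// Accept-Language parsing in the patch does with arsort().
function pickLocale(array $langs, array $available, $default)
{
    foreach ($langs as $lang => $q) {
        $lang = strtolower($lang);
        if (in_array($lang, $available, true)) {
            return $lang;
        }
        // Fall back from a regional tag such as "fr-be" to its base language.
        $parts = explode('-', $lang);
        if (in_array($parts[0], $available, true)) {
            return $parts[0];
        }
    }
    return $default;
}

// Example with a hard-coded preference list and catalog list.
echo pickLocale(array('fr-BE' => 0.9, 'en' => 0.8), array('fr', 'de'), 'en'), "\n"; // prints "fr"
```

The result would then replace the hard-coded `$locale = 'fr';` before the `putenv()`, `bindtextdomain()` and `textdomain()` calls.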

e2jk commented 11 years ago

Just to make sure we're all on the same page when discussing .po vs .ini, here is an extract of an actual .po file:

#: ../nautilus_image_manipulator/NautilusImageManipulatorDialog.py:109
msgid "Please enter some text to append to the filename."
msgstr "Veuillez entrer le texte à ajouter au nom du fichier."

Hopefully this makes it clear that .po files are also editable in a normal text editor, for contributors that would not want to use online services. For those that want, this allows for direct import in any number of online editors.

Leomaradan commented 11 years ago

RainTPL 3 implements a basic internationalization system, with a dictionary.

http://www.raintpl.com/Forum/Development-Forum/Rain-TPL-3/?t=139

nodiscc commented 9 years ago

The https://github.com/shaarli/Shaarli community fork will likely use gettext as an internationalization/localization system.

@Leomaradan The current Shaarli codebase doesn't work against RainTPL 3 unfortunately; Pull Request welcome!