MichaelChirico / potools

Tools for working with translations in R
https://michaelchirico.github.io/potools/

Understanding how does the package work #33

Closed llrs closed 3 years ago

llrs commented 3 years ago

Cool package, and nice presentation at rstudio::global! I've been wanting to learn how to provide my packages in other languages for some time.

I tried it and was surprised that stop("error here") is converted to stop(domain = NA, gettext("error-here")). The README says that stop calls are translatable, and from reading the docs it seems the message could be translated automatically. Then why is gettext needed? Running tools::update_pkg_po(".") already created a file (pkg.pot) containing the error messages from the stop calls I provided, such as:

msgid ""
msgstr ""
"Project-Id-Version: senadoRES 0.1.0\n"
"POT-Creation-Date: 2021-02-04 00:31\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Number should be a numeric value above 0"
msgstr ""

My guess is that providing an es_ES.po file, plus the one generated under inst/po/es_ES/LC_MESSAGES/R-pkg.mo, would be enough to translate the package to the desired language?

Also, I didn't understand what the languages argument of translate_package should be: just a string specifying which language I want to translate to? In which format? Is "ES" enough for Spanish, or should I provide something else like "es_ES" to differentiate it from "ar_ES"?

MichaelChirico commented 3 years ago

First, thanks so much and congrats on filing the first issue for the package! :tada: (besides ones I filed for myself)

stop("error here") indeed shouldn't be converted -- it sounds like a bug. Could you share your source? By e-mail is also fine if you'd prefer. I'm not reproducing:

cp -R tests/testthat/test_packages/r_msg /tmp/
cd /tmp
echo "stop('error here')" >> r_msg/R/foo.R
R
> potools::translate_package("r_msg") # (or with languages='es')
# runs fine, doesn't prompt the need for `gettext`

My guess is that providing an es_ES.po file, plus the one generated under inst/po/es_ES/LC_MESSAGES/R-pkg.mo, would be enough to translate the package to the desired language?

Yes, that sounds right.

You can test by running LANGUAGE=es_ES R to start R in Spanish and trying to trigger the error. You should be able to run gettext("Number should be a numeric value above 0", domain="R-pkg") to get the message directly too, though I'm a bit fuzzier there.

If that's not working, running tools::update_pkg_po(".") in your package directory can help ensure things are in the right place in case they're not yet.

Also, I didn't understand what the languages argument of translate_package should be: just a string specifying which language I want to translate to? In which format? Is "ES" enough for Spanish, or should I provide something else like "es_ES" to differentiate it from "ar_ES"?

This is a good question I should document a bit better. The relevant documentation is https://cran.r-project.org/doc/manuals/R-admin.html#Localization-of-messages

Translations are looked for by domain according to the currently specified language, as specifically as possible, so for example an Austrian (‘de_AT’) translation catalogue will be used in preference to a generic German one (‘de’) for an Austrian user. However, if a specific translation catalogue exists but does not contain a translation, the less specific catalogues are consulted. For example, R has catalogues for ‘en_GB’ that translate the Americanisms (e.g., ‘gray’) in the standard messages into English. Two other examples: there are catalogues for ‘es’, which is Spanish as written in Spain and these will by default also be used in Spanish-speaking Latin American countries, and also for ‘pt_BR’, which are used for Brazilian locales but not for locales specifying Portugal.

So I think the right approach is to make an es domain as a default and then, if you want to add some regional specificity, do so with es_XX domains like base R has done for American/British English.
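
The "specific first, then generic" order from the quoted passage can be sketched in shell. This is only an illustration of that rule: catalogue_candidates is a hypothetical helper made up for this sketch (it is not part of R, gettext, or potools), and it says nothing about what happens when only a different regional catalogue exists.

```shell
# Hypothetical helper (not part of R or potools): print the catalogue names
# that the quoted passage describes, from most to least specific.
catalogue_candidates() {
  locale="${1%%.*}"       # drop an encoding suffix such as ".UTF-8"
  locale="${locale%%@*}"  # drop a modifier such as "@euro"
  echo "$locale"
  case "$locale" in
    *_*) echo "${locale%%_*}" ;;  # regional locale: fall back to the bare language
  esac
}

catalogue_candidates de_AT   # prints de_AT, then de
catalogue_candidates es      # prints only es
```

Under this reading, shipping a generic es catalogue covers every es_XX user, and a regional catalogue only needs the messages that differ.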

It's not clear to me from the documentation whether, if a user is running in es_ES and the .mo file is es_AR (and there are no es or es_ES files), they'll be shown any translation, i.e., whether the system is "smart" enough to pull from a different Spanish domain when neither the current specific domain (es_ES) nor the generic domain (es) is available, but a "neighbor" specific domain (es_AR) is.

MichaelChirico commented 3 years ago

OK I just tested it out on a basic package and it seems the system is "smart" enough to check "neighboring" domains.

  1. I made an es_AR.po file for the package, installed the translations & the package
  2. I tested the translations are working by running LANGUAGE=es_AR R and triggering the message
  3. I tested it works in es: LANGUAGE=es R, trigger the message
  4. I tested it works in es_ES: LANGUAGE=es_ES R, trigger the message
  5. I tested it doesn't work in other domains: LANGUAGE=zh_CN R, trigger the message, it comes in English
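
The steps above could be repeated with a small shell loop. This is a rough sketch, not the commands actually used in the test: try_languages is a hypothetical helper, and the package name pkg and function f() in the example call are placeholders for whatever triggers the translated message.

```shell
# Run one command under several LANGUAGE settings and label each output.
# "try_languages" is a hypothetical helper written for this sketch.
try_languages() {
  cmd="$1"
  shift
  for lang in "$@"; do
    printf '%s: ' "$lang"
    LANGUAGE="$lang" sh -c "$cmd"
  done
}

# Example call, assuming a package "pkg" whose function f() signals the
# translated error (both names are placeholders):
#   try_languages "Rscript -e 'pkg::f(-1)'" es_AR es es_ES zh_CN
```

Each command starts in a fresh process, so LANGUAGE is already set when it launches.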

llrs commented 3 years ago

The comment about adding the domain to the error messages comes from the first section of the README, which gave me the impression that I should add gettextf. I think moving some content from the README to a vignette would be very helpful.

The package I have provided translations for is at https://github.com/ropenspain/senadoRES/; see the PR https://github.com/rOpenSpain/senadoRES/pull/8, although the translation commits are not clearly separated in the branches :(

When you say you set LANGUAGE to something, how do you do it? I tested it on my computer and had it tested on other computers, and didn't manage to switch between languages: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=18055 Perhaps my problem is that I don't set it before starting R, but via Sys.setenv?

Many thanks for demystifying the translation of the packages. Hope to provide more friendly error messages to our users.

MichaelChirico commented 3 years ago

this is a bit tough and seems like something that's hard to nail down depending on the operating system.

the R-admin manual makes some mention of caching issues like what you mention in your bug.

what I've found works for me (ubuntu and Mac) is to start a new R session with the LANGUAGE environment variable set as desired. we can do this temporarily by defining it in line:

LANGUAGE=es R

starts an interactive R session with the LANGUAGE set, that should work to get Spanish messages
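
As a small illustration of why the inline form is temporary, here is a sketch for a POSIX shell. It uses sh as a stand-in for R so that it runs without R installed; the scoping of the variable is the same for any command started this way.

```shell
# An assignment placed before a command applies to that command only.
LANGUAGE=es sh -c 'echo "child process sees LANGUAGE=$LANGUAGE"'

# The calling shell keeps whatever value it had before (possibly none).
echo "calling shell has LANGUAGE=${LANGUAGE:-<unset>}"
```

The first line reports LANGUAGE=es for the child process. The second reports whatever the calling shell already had, so later commands are not affected.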

llrs commented 3 years ago

Great! many thanks for the fast answers