blairdrummond closed this issue 3 years ago
The i18n of RStudio can only be partial. The menus and UI text are still in English, but the inside (the text and the error messages) can be in French if the environment variable is changed to French.
This is the code to add the French Locale and set it.
RUN echo "fr_CA.UTF-8 UTF-8" > /etc/locale.gen && \
locale-gen
# Configure environment
ENV CONDA_DIR=/opt/conda \
LC_ALL=fr_CA.UTF-8 \
LANG=fr_CA.UTF-8 \
LANGUAGE=fr_CA.UTF-8
The next step is to find out how to detect the language and set the environment variable at runtime. The current idea is to create a 'transparent layer', similar to the remote-desktop dashboard but without a UI, that checks the browser language before the session opens for the user.
To have RStudio in French, LANG needs to be set up with the correct locale.
The approach decided on is to “take the active language in the KF UI and automatically submit it as part of the "New Server" payload, and make the controller pass that locale as an env var (as you suggest) to all notebooks it launches. Then any container can find locale information at a known location and do with it as it pleases.” This means the locale is sent when a new notebook is created.
This is equivalent to testing with docker run -e LANG=fr_CA.UTF-8 imageTag, which overrides whatever value of that environment variable is set in the Dockerfile.
The changes therefore need to be applied in multiple places, and some things need to be verified:
Kubeflow-container to add the French locale (TODO: decide where to add locale: in the base file, or in r-studio file).
Jupyter-api to add the language detection (with a controllable UI). Whatever the setting in those will inject:
R-Studio (image) only needs LANG. The R-studio in remote desktop needs both LANG and LANGUAGE. Other applications might be impacted when changing the locales. (to be investigated)
From what I gathered, the way to have environment variables would be through the PodDefault (see https://www.kubeflow.org/docs/notebooks/setup/ step 12)
The short answer for this issue is to set the environment variable LANG to the wanted language. For this to work, the locale for that language also needs to be available (e.g. fr_CA.UTF-8).
This issue is split into two parts:
Note: The locales are now added as part of the Dockerfile, see https://github.com/StatCan/kubeflow-containers/blob/d2b7863936af5e42ae2d4f342d1524887c1703db/docker-bits/0_Spark.Dockerfile#L8
Some other components in kubeflow-container may need to do similar things. i18n might be related to LANG, LANGUAGE and LC_ALL env variables.
Started work on internationalizing the menus and commands. Command names/labels are defined by XML in src/org/rstudio/studio/client/workbench/commands/Commands.cmd.xml, which is then used by GWT in src/org/rstudio/core/Core.gwt.xml to generate Java classes at compile(?) time. Not sure where these generated classes go yet, but we should be able to modify this code to make the text getters use internationalization.
The commands defined in the Commands.cmd.xml file are used via deferred binding to create the Java classes that actually use the commands in menus (for, say, the menu dropdown lists). An example of part of one of these XML files is:
<commands>
  <!-- id: not internationalizable (the command's id name; never shown in the UI, only used to access the command) -->
  <!-- menuLabel: internationalizable -->
  <!-- rebindable: not internationalizable -->
  <cmd id="newPythonDoc"
       menuLabel="_Python File"
       desc="Create a new Python file"
       rebindable="false"/>
  ...
</commands>
The generation of code for these classes is called for in ./src/gwt/src/org/rstudio/core/Core.gwt.xml via:
<generate-with class="org.rstudio.core.rebind.command.CommandBundleGenerator" >
<when-type-assignable class="org.rstudio.core.client.command.CommandBundle"/>
</generate-with>
This process invokes CommandBundleGenerator.generate(), which scans the XML to create Java classes for everything defined.
To internationalize this, we must:
Modify CommandBundleGenerator.generate() to invoke i18n references to strings (via GWT's i18n tooling), or to directly use internationalized strings (e.g. create generatedFile_en.java, generatedFile_fr.java, etc., based on some available internationalization data).
Update the generate, emitConstructor and emitCommandInitializers methods to include the required I18N imports (com.google.gwt.core.client.GWT and org.rstudio.studio.client.workbench.commands.CommandConstants), include the Constants in the class definition, etc.
Define a constants interface, for example:
import com.google.gwt.i18n.client.Constants;

public interface CommandConstants extends Constants {
    @DefaultStringValue("_Python File")
    String newPythonDocMenuLabel(); // <- where menuLabel is a property of newPythonDoc
    @DefaultStringValue("Create a new Python script")
    String newPythonDocDesc();
}
This should be automatically generated, because if we add a new command to the XML file, we don't want to also have to add the command to the interface file (avoiding this is the whole point of managing these commands through the XML file). Note that the interface includes a default translation value (which I think is required?), which will be used if a corresponding _locale.properties file is not available. These defaults should be set automatically from the XML file, again to reduce duplication.
Create an _en.properties file, which will also be built off the XML.
_XXX.properties files can then be generated via translation (using _en.properties as a template?).
I have successfully modified the generators to use i18n with a hard-coded interface file, but I'm not sure yet how best to automatically generate the constants interface or the properties files. The big questions are how to invoke GWT's generation mechanism properly and where the files are placed once they're generated.
(note: this describes commands, but I think other things are similarly generated using this file (shortcuts, others))
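The automatic generation discussed above could be sketched roughly as follows. This is only an illustration in plain Java, not the project's actual generator: the class name CommandConstantsSketch, its helper methods, and the choice of menuLabel and desc as the translatable attributes are assumptions made for the example, and the properties output does no escaping of special characters.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: derives i18n keys and English defaults from a commands XML.
public class CommandConstantsSketch {
    // Assumption: only these attributes carry user-visible text.
    static final String[] TRANSLATABLE = {"menuLabel", "desc"};

    /** Builds a key such as newPythonDocMenuLabel from (newPythonDoc, menuLabel). */
    public static String key(String id, String attr) {
        return id + Character.toUpperCase(attr.charAt(0)) + attr.substring(1);
    }

    /** Collects key -> English text for each translatable attribute of each cmd element. */
    public static Map<String, String> extract(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Map<String, String> out = new LinkedHashMap<>();
            NodeList cmds = root.getElementsByTagName("cmd");
            for (int i = 0; i < cmds.getLength(); i++) {
                Element cmd = (Element) cmds.item(i);
                for (String attr : TRANSLATABLE) {
                    if (cmd.hasAttribute(attr)) {
                        out.put(key(cmd.getAttribute("id"), attr), cmd.getAttribute(attr));
                    }
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse commands XML", e);
        }
    }

    /** Escapes backslashes and double quotes for use inside a Java string literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Emits the source text of a GWT-style Constants interface with English defaults. */
    public static String toInterface(Map<String, String> strings) {
        StringBuilder sb = new StringBuilder("public interface CommandConstants extends Constants {\n");
        for (Map.Entry<String, String> e : strings.entrySet()) {
            sb.append("    @DefaultStringValue(\"").append(escape(e.getValue())).append("\")\n");
            sb.append("    String ").append(e.getKey()).append("();\n");
        }
        return sb.append("}\n").toString();
    }

    /** Emits the body of an _en.properties file (one "key = text" line per string). */
    public static String toProperties(Map<String, String> strings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : strings.entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<commands><cmd id=\"newPythonDoc\" menuLabel=\"_Python File\" "
            + "desc=\"Create a new Python file\" rebindable=\"false\"/></commands>";
        Map<String, String> strings = extract(xml);
        System.out.print(toInterface(strings));
        System.out.print(toProperties(strings));
    }
}
```

Generated command classes would then presumably obtain the strings through GWT.create(CommandConstants.class), which is how GWT Constants interfaces are normally instantiated; where the generated files should live in the RStudio build remains the open question noted above.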
As discussed here, JSON user prefs and state are built from a JSON schema file. This schema and the resulting files are used in various locations in the UI (e.g. the Options dialog, the Command Palette) and need to be translated as well.
The developer flow around changing these is to:
Maybe I should update this workflow to output files for multiple languages(?). We could build translatable files from the default language version, and maybe use a git diff or similar to identify which keys are modified and need changing (so we don't completely delete the translated files every time). This could also be done in the cmd.xml workflow.
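One way the "identify which keys are modified" idea could work, sketched without git: compare the previous and current English strings key by key, keep existing translations for unchanged keys, and fall back to the English text for new or changed keys so they can be flagged for translation. The class TranslationMergeSketch, its method names and the sample French strings are all hypothetical, and plain maps stand in for the .properties files.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: preserves existing translations when the English source changes.
public class TranslationMergeSketch {
    /** Keys that are new, or whose English text differs from the previous version. */
    public static Set<String> keysNeedingTranslation(Map<String, String> oldEn,
                                                     Map<String, String> newEn) {
        Set<String> stale = new TreeSet<>();
        for (Map.Entry<String, String> e : newEn.entrySet()) {
            if (!e.getValue().equals(oldEn.get(e.getKey()))) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }

    /**
     * Builds the updated translation: unchanged keys keep their translated text,
     * new or changed keys are seeded with the current English text, and keys
     * that no longer exist in the English source are dropped.
     */
    public static Map<String, String> merge(Map<String, String> oldEn,
                                            Map<String, String> newEn,
                                            Map<String, String> translated) {
        Set<String> stale = keysNeedingTranslation(oldEn, newEn);
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : newEn.entrySet()) {
            String key = e.getKey();
            boolean keep = !stale.contains(key) && translated.containsKey(key);
            out.put(key, keep ? translated.get(key) : e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> oldEn = Map.of(
            "newPythonDocMenuLabel", "_Python File",
            "newPythonDocDesc", "Create a new Python script");
        Map<String, String> newEn = Map.of(
            "newPythonDocMenuLabel", "_Python File",
            "newPythonDocDesc", "Create a new Python file");
        // Illustrative French strings, not taken from any real translation file.
        Map<String, String> fr = Map.of(
            "newPythonDocMenuLabel", "Fichier _Python",
            "newPythonDocDesc", "Nouveau script Python");

        System.out.println("Needs translation: " + keysNeedingTranslation(oldEn, newEn));
        System.out.println("Merged: " + merge(oldEn, newEn, fr));
    }
}
```

Seeding changed keys with English keeps the UI usable until a translator catches up, at the cost of temporarily mixing languages; a real workflow might instead mark such entries explicitly.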
Need to look into what content the .cpp/.hpp files contain. No idea how to internationalize those if I have to... But if just the Java classes need it, that could be handled via a resource bundle setting the name of the right XML file(?).
Easiest way forward appears to be handling building of the interface/property files for any metadata file translation (eg: XML/JSON files) the same way as the project currently uses the JSON file to build the actual user properties. We will add scripts that translate the XML/JSON files to interface/property files, using the English text in the XML/JSON files to seed the default text in the interface and English version of the property files. The property files can then be translated as needed. Typical development flow would then be:
xml-json_to_interface-property script
Update of progress/general work summary documented here: https://github.com/ca-scribner/rstudio/pull/1#issuecomment-819840878
11000 new lines and counting in the PR haha. Although a lot of that is automatically generated through scripts
Brief summary of progress:
[...] .json file and are described in metadata rather than code; real Java code is then generated from them by script. Scripts have been extended to support i18n
[...] enumReadable in their defining metadata, which defines the human-readable text that goes along with an enumerator. These are useful for the Global Options menus where dropdowns are used (e.g. Autosave mode enums of [backup, nothing] have readable versions of ["Backup unsaved changes", "Do nothing"])
Next steps:
Big outstanding items:
Refactoring this into an epic tracked by Statcan/daaas/510. Closing this issue to claim the work already completed (fleshing out the task, doing some of the updates, etc). Future work will be tracked in separate issues
Look into Bilingualism options for RStudio