pulsar-edit / pulsar

A Community-led Hyper-Hackable Text Editor
https://pulsar-edit.dev

[core] i18n: Fresh Start #1074

Open confused-Techie opened 3 months ago

confused-Techie commented 3 months ago

This PR builds off of the fantastic work by @meadowsys in #715 to attempt getting Pulsar up and running with translation support.

While some aspects of this PR are inspired by or directly borrowed from Meadows' work, much of what helped was the initial research already done, which let me implement many of those ideas directly.

This PR now represents a completely functional implementation of translation support for Pulsar and community packages. To dive a little deeper, it's important to cover each way text appears in Pulsar and how we translate it:

Getting Started

To implement translations in a community project (or in Pulsar itself), simply add a locales folder in the root of the project. This folder should contain your translation files, named like so: package-name.locale.json|cson.

When your project is initialized, these files will be read automatically, just like menus are. In the case of Pulsar, the i18n.initialize() function reads Pulsar's locales files automatically.

As the files are read, the loader looks at the locale of each file and finds the ones that may apply to the user. Because during load we have no awareness of how complete each file is, we want to load as many as possible while still avoiding wasted work. So we look at all the languages the user may fall back to at some point, and include any locales that are on that list.

From here, all loaded files are available via i18n.strings (although this should not be accessed directly). This key-value store is then used to match all translations, which are accessible via a keypath like pulsar.context-menu.core:undo.
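The load-time filter described above could be sketched roughly as follows. This is only an illustration, not the PR's actual code: `parseLocaleFileName` and `shouldLoad` are hypothetical helpers, and the file-name pattern is assumed from the package-name.locale.json|cson convention.

```javascript
// Hypothetical helper: split "package-name.locale.json|cson" into its parts.
// Returns null when the file name does not follow the convention.
function parseLocaleFileName(fileName) {
  const match = /^(.+)\.([A-Za-z0-9-]+)\.(json|cson)$/.exec(fileName);
  if (!match) return null;
  return { packageName: match[1], locale: match[2], extension: match[3] };
}

// Hypothetical helper: only load a file if its locale is one the user
// could ever fall back to.
function shouldLoad(fileName, fallbackList) {
  const parsed = parseLocaleFileName(fileName);
  return parsed !== null && fallbackList.includes(parsed.locale);
}

// Example data, made up for illustration.
const fallbackList = ["es-MX", "es", "en"];
const files = ["pulsar.en.cson", "pulsar.es.json", "pulsar.ar.cson", "README.md"];
const toLoad = files.filter((name) => shouldLoad(name, fallbackList));
```

Here `pulsar.ar.cson` is skipped because `ar` is not on the user's list, and `README.md` is skipped because it does not match the naming convention.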

Translating Strings

Each translated string has the full support of the ICU Message Syntax, which allows plurals, replacements, and much more. This is all provided by Intl-MessageFormat.

Keep in mind that strings needing additional properties, such as replacement values, can only be translated via the i18n API, as extra properties cannot be passed via LocaleLabels.

Methods of Translation

There are a few different ways to get a string translated, so let's take a look at all of them.

LocaleLabel

In some cases, it's impossible to access the i18n API to translate a string, such as in the files in your menus directory. Since these are cson|json files, they cannot run JS code. To translate these items we use what I'm dubbing a LocaleLabel, which is simply a keyPath surrounded by % that corresponds to a string accessible to the i18n API, such as one stored in your locales directory.

For example, let's say the contents of ./locales/pulsar.en.cson looked like:

'pulsar': {
  'context-menu': {
    'core:undo': 'Undo'
  }
}

And I wanted this string to appear from ./menus/win32.cson:

'context-menu': 
  'atom-text-editor, .overlayer': [
    {label: '%pulsar.context-menu.core:undo%', command: 'core:undo'}
  ]

The above will successfully translate the label of this context menu item when it appears for the user.
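As a rough sketch of how a LocaleLabel might be recognized and resolved, assuming a label counts as a LocaleLabel only when the whole string is wrapped in %. The names `resolveLabel`, `demoStrings`, and `demoTranslate` are hypothetical stand-ins for illustration, not Pulsar's implementation.

```javascript
// Assumed shape: a LocaleLabel is a keypath wrapped in "%" on both sides.
const LOCALE_LABEL = /^%(.+)%$/;

// Hypothetical helper: translate a menu label if it is a LocaleLabel,
// otherwise return it unchanged so plain labels keep working.
function resolveLabel(label, translate) {
  const match = LOCALE_LABEL.exec(label);
  if (!match) return label;
  const translated = translate(match[1]);
  // Fall back to the raw label when no translation exists.
  return translated ?? label;
}

// Stand-in for the real i18n lookup, for illustration only.
const demoStrings = { "pulsar.context-menu.core:undo": "Undo" };
const demoTranslate = (keyPath) => demoStrings[keyPath] ?? null;

const undoLabel = resolveLabel("%pulsar.context-menu.core:undo%", demoTranslate);
const plainLabel = resolveLabel("Copy", demoTranslate);
```

Returning non-matching labels untouched is what would keep existing, untranslated menu files working as before.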

The LocaleLabel method of translation is supported in:

API

For all other cases of translation, where we have access to the i18n API, we have more freedom.

Let's say we have ./locales/pulsar.en.cson:

'pulsar': {
  'ui': {
    'myString': 'Hello World'
  }
}

The simplest way to translate this would be:

const str = atom.i18n.t("pulsar.ui.myString");

And if we needed to pass any replacements or extra parameters to Intl-MessageFormat, we would do so like this:

const str = atom.i18n.t("pulsar.ui.myString", opts);

But let's say we wanted easy access to our namespace, or our package's namespace:

const t = atom.i18n.getT("pulsar");

const str = t.t("ui.myString");

This saves us from having to type the full API call, as well as the name of the package, dozens or hundreds of times.
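Conceptually, a namespaced translator like this only needs to prefix the keypath before delegating. The sketch below is a guess at the idea, not the PR's code: `makeNamespacedT` is a hypothetical name, and `fakeTranslate` stands in for atom.i18n.t.

```javascript
// Hypothetical sketch: wrap a translate function so callers can omit the namespace.
function makeNamespacedT(namespace, translate) {
  return {
    // Prefix the keypath with the namespace, then delegate to the real translator.
    t: (keyPath, opts) => translate(`${namespace}.${keyPath}`, opts),
  };
}

// Stand-in translator for illustration; it simply echoes the full keypath.
const fakeTranslate = (keyPath) => `translated:${keyPath}`;

const namespaced = makeNamespacedT("pulsar", fakeTranslate);
const result = namespaced.t("ui.myString");
```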

How does a user control translations?

To control translations you simply have two settings:

These two values are used (along with the hardcoded default fallback of en) to construct a list of languages to display to a user. This list is created in accordance with the RFC 4647 "Lookup" fallback pattern.

What this means is that for every single entry, we continuously fall back to less and less specific locales of that language before moving on to the next option.

For example:

Our priority list of locales would be:

[
  'es-MX',
  'es',
  'zh-Hant-CN',
  'zh-Hant',
  'zh',
  'ja-JP',
  'ja',
  'en'
]
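A minimal sketch of how such a list could be built, following the RFC 4647 lookup truncation rule. The input preferences `es-MX`, `zh-Hant-CN`, and `ja-JP` are inferred from the list above and are an assumption, as are the helper names `expandLocale` and `buildFallbackList`; this is not the PR's actual implementation.

```javascript
// Expand one language tag into itself plus progressively shorter forms.
// Per the RFC 4647 lookup rule, drop the last subtag at each step, and also
// drop a single-character subtag if one is left dangling at the end.
function expandLocale(tag) {
  const result = [];
  let subtags = tag.split("-");
  while (subtags.length > 0) {
    result.push(subtags.join("-"));
    subtags = subtags.slice(0, -1);
    if (subtags.length > 0 && subtags[subtags.length - 1].length === 1) {
      subtags = subtags.slice(0, -1);
    }
  }
  return result;
}

// Hypothetical helper: expand every preferred tag in order, append the
// default locale last, and skip anything already on the list.
function buildFallbackList(preferred, defaultLocale = "en") {
  const list = [];
  for (const tag of [...preferred, defaultLocale]) {
    for (const candidate of expandLocale(tag)) {
      if (!list.includes(candidate)) list.push(candidate);
    }
  }
  return list;
}

// Assumed user preferences, in priority order.
const priority = buildFallbackList(["es-MX", "zh-Hant-CN", "ja-JP"]);
```

With those assumed inputs, `priority` matches the eight-entry list shown above.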

How is this priority list used?

When we load data from ./locales, we only ever load a locale that is present on the list above. If a user had that same list but there was a ./locales/pulsar.ar.cson, it would never be loaded at all, because the user could never reach that language through this fallback list.

When we ask for an individual translation of a string, whether via the API or a LocaleLabel, we first find the string within i18n.strings via its keypath, then iterate through the fallback list and return the first match we get. This means that partial translation is completely supported for every single string, provided that an en locale translation is always available. This is an important point: whenever there are translations, the base locale that must be 100% translated should be en, not en-US or en-GB or anything else.
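The lookup described above might look something like the sketch below. The shape of the store is an assumption made for illustration (nested keys ending in a map of locale to message); the real i18n.strings may be organized differently, and `lookup` is a hypothetical name.

```javascript
// Assumed shape for illustration: nested keys ending in a { locale: message } map.
const strings = {
  pulsar: {
    "context-menu": {
      "core:undo": { en: "Undo", es: "Deshacer" },
      "core:redo": { en: "Redo" },
    },
  },
};

// Walk the keypath first, then return the first locale on the fallback
// list that has a message for that key. Returns null when nothing matches.
function lookup(store, keyPath, fallbackList) {
  let node = store;
  for (const key of keyPath.split(".")) {
    if (node === null || typeof node !== "object" || !(key in node)) return null;
    node = node[key];
  }
  if (node === null || typeof node !== "object") return null;
  for (const locale of fallbackList) {
    if (typeof node[locale] === "string") return node[locale];
  }
  return null;
}

const userLocales = ["es-MX", "es", "en"];
const undo = lookup(strings, "pulsar.context-menu.core:undo", userLocales);
const redo = lookup(strings, "pulsar.context-menu.core:redo", userLocales);
```

In this made-up data, `undo` resolves through `es` while `redo` falls all the way back to `en`, which is why a complete `en` file keeps partially translated packages usable.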


From the above you can see I've tried to cover every use case and make it as easy as possible to partially translate and start small.

These changes are 100% backwards compatible, and support having a single string translated or the entire application.

savetheclocktower commented 3 months ago

Is there any urgency to get this into 1.120, or are you cool with letting it sit for a while? I think it looks fine at first glance, but I'd love to have a few weeks to play around with it if you're open to that.

confused-Techie commented 3 months ago

@savetheclocktower I'm not against this one waiting around a bit. Obviously it'd be awesome to get it in sooner rather than later, but a month seems completely reasonable.

confused-Techie commented 3 months ago

@meadowsys Good call on importing translations after the fact.

After making this PR I took a look and realized how much has already been translated, and I was worried about that work going to waste.

But you are right, pulsar-edit/i18n-intermediate-sync still exists, and we could just put all of this stuff into a new file, then during the step where file structure is modified we could combine the files and move any translated keys to other keys in the final format.

If (hopefully when) we get this merged, I'll get started over there to work all that out.