leolabs / json-autotranslate

Translate a folder of JSON files containing translations into multiple languages.
MIT License

json-autotranslate

This tool allows you to translate a locale folder containing multiple JSON files into multiple languages using Google Translate, DeepL (free/pro), Azure Translator, Amazon Translate, or manually. You can either use the translation keys (natural translation) or their values (key-based translation) as a source for translations.

If some of the strings have already been translated, they won't be translated again. This improves performance and ensures that you won't accidentally lose existing translations.
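The skip logic can be pictured with a small TypeScript sketch. It is not the tool's actual implementation: `findUntranslatedKeys` and the flat `Translations` shape are hypothetical names chosen for illustration, and it assumes a string counts as translated as soon as its key exists in the target file.

```typescript
// Flat map of translation key -> string value (hypothetical shape).
type Translations = { [key: string]: string };

// Returns the keys that exist in the source but not yet in the target,
// i.e. the only strings that would need to be sent to a translation service.
function findUntranslatedKeys(source: Translations, target: Translations): string[] {
  return Object.keys(source).filter(function (key) {
    return !Object.prototype.hasOwnProperty.call(target, key);
  });
}

const sourceStrings: Translations = { LOGIN: "Login", LOGOUT: "Logout", HELP: "Help" };
const existingGerman: Translations = { LOGIN: "Anmelden" };

// LOGIN is already translated, so only the other two keys remain.
console.log(findUntranslatedKeys(sourceStrings, existingGerman)); // ["LOGOUT", "HELP"]
```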

Interpolations (ICU: {name}, i18next: {{name}}, sprintf: %s) are replaced by placeholders (e.g. <0 />) before being passed to the translation service, so their structure doesn't get mangled by the translation.
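As a rough illustration of the placeholder idea, the sketch below masks simple ICU-style interpolations and restores them afterwards. The function names and the regular expressions are assumptions made for this example, not the tool's real matcher code, and nested ICU constructs such as plurals are deliberately not handled.

```typescript
interface MaskedText {
  clean: string;          // text with numbered placeholders, safe to translate
  replacements: string[]; // original interpolations, indexed by placeholder number
}

// Replaces each simple {name} interpolation with a placeholder like <0 />.
function maskInterpolations(input: string): MaskedText {
  const replacements: string[] = [];
  const clean = input.replace(/\{[^{}]+\}/g, function (match: string) {
    replacements.push(match);
    return "<" + (replacements.length - 1) + " />";
  });
  return { clean: clean, replacements: replacements };
}

// Puts the original interpolations back into the translated text.
// Unknown placeholder numbers are left untouched.
function restoreInterpolations(translated: string, replacements: string[]): string {
  return translated.replace(/<(\d+) \/>/g, function (match: string, index: string) {
    const original = replacements[Number(index)];
    return original === undefined ? match : original;
  });
}

const masked = maskInterpolations("{email} is not a valid email address.");
console.log(masked.clean); // "<0 /> is not a valid email address."
console.log(restoreInterpolations("<0 /> ist keine E-Mail-Adresse.", masked.replacements));
// "{email} ist keine E-Mail-Adresse."
```

Because the placeholders are numbered, a translation service is free to reorder them in the target sentence without losing track of which interpolation belongs where.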

Installation

$ yarn add json-autotranslate
# or
$ npm i -S json-autotranslate

Running json-autotranslate

$ yarn json-autotranslate
# or
$ npx json-autotranslate

Usage Examples

Translate natural language source files located in the locales directory using Google Translate and delete existing keys in translated JSON files that are no longer used.

$ yarn json-autotranslate -i locales -d -c service-account.json

Manually translate key-based source files located in the locales directory.

$ yarn json-autotranslate -i locales -s manual

Directory Structure

You can specify your locales/i18n directory structure using the --directory-structure option.

Default

locales
├── de
├── en
│   ├── login.json
│   └── register.json
├── fr
└── it

If you don't specify another source language, this tool will translate all files located in the en directory into all other languages that exist as directories. A single language directory (e.g. en) should only contain JSON files. Sub-directories and other files will be ignored.

Ngx-translate

i18n
├── de.json
├── en.json
├── fr.json
└── it.json

If you don't specify another source language, this tool will translate en.json into all other languages that exist as files. The i18n directory should only contain JSON files. Sub-directories and other files will be ignored.
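To make the rule for the ngx-translate layout concrete, here is a minimal sketch of how target languages could be derived from a directory listing. `listTargetLanguages` is a hypothetical helper, and it works on a plain array of file names rather than reading the file system.

```typescript
// Derives the target languages from the file names of an ngx-translate style
// directory. Anything that is not a <language>.json file is skipped, and the
// source language itself is excluded.
function listTargetLanguages(fileNames: string[], sourceLanguage: string): string[] {
  const languages: string[] = [];
  for (const fileName of fileNames) {
    const match = /^(.+)\.json$/.exec(fileName);
    const language = match === null ? undefined : match[1];
    if (language !== undefined && language !== sourceLanguage) {
      languages.push(language);
    }
  }
  return languages;
}

const i18nFiles = ["de.json", "en.json", "fr.json", "it.json", "notes.txt"];
console.log(listTargetLanguages(i18nFiles, "en")); // ["de", "fr", "it"]
```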

File Structure

There are two ways that json-autotranslate can interpret files: natural language and key-based.

If you don't specify a file structure type, json-autotranslate will automatically determine the type on a per-file basis. In most cases, this is sufficient.

Natural Language

This is the default way that this tool will interpret your source files. The keys will be used as the basis of translations. If one or more of the values in your source files don't match their respective key, you'll see a warning as this could indicate an inconsistency in your translations. You can fix those inconsistencies by passing the --fix-inconsistencies flag.

{
  "Your username doesn't exist.": "Your username doesn't exist.",
  "{email} is not a valid email address.": "{email} is not a valid email address."
}
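The consistency check described above boils down to comparing each key with its value. The following sketch shows one way to express that; the helper names are hypothetical, and the fix mirrors the documented behaviour of --fix-inconsistencies (setting the value to the key).

```typescript
type Translations = { [key: string]: string };

// Returns every key whose value differs from the key itself.
function findInconsistencies(source: Translations): string[] {
  return Object.keys(source).filter(function (key) {
    return source[key] !== key;
  });
}

// Returns a copy in which every value has been set to its key.
function fixInconsistencies(source: Translations): Translations {
  const fixed: Translations = {};
  for (const key of Object.keys(source)) {
    fixed[key] = key;
  }
  return fixed;
}

const naturalSource: Translations = {
  "Your username doesn't exist.": "Your username doesn't exist.",
  "Forgot password?": "Forgot your password?"
};

console.log(findInconsistencies(naturalSource)); // ["Forgot password?"]
console.log(findInconsistencies(fixInconsistencies(naturalSource))); // []
```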

Key-Based

If you use the key-based option (--type key-based), this tool will use the source file's values as the basis of translations. Keys can be nested, and the structure will be transferred over to the translated files as well.

{
  "ERRORS": {
    "USERNAME": "Your username doesn't exist.",
    "EMAIL": "{email} is not a valid email address."
  },
  "LOGIN": "Login",
  "FORGOT_PASSWORD": "Forgot password?"
}
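One way to reason about nested key-based files is to view them as a flat list of paths and strings. The sketch below does that for the example above. The type names, the helper names, and the "." path separator are assumptions for illustration; the tool may represent nesting differently internally.

```typescript
type NestedTranslations = { [key: string]: string | NestedTranslations };
type FlatTranslations = { [path: string]: string };

// Walks the tree and writes every string value into `out` under its full path.
function collectStrings(tree: NestedTranslations, prefix: string, out: FlatTranslations): void {
  for (const key of Object.keys(tree)) {
    const value = tree[key];
    const path = prefix === "" ? key : prefix + "." + key;
    if (typeof value === "string") {
      out[path] = value;
    } else if (value !== undefined) {
      collectStrings(value, path, out);
    }
  }
}

// Flattens a nested translation object into path -> string pairs.
function flattenTranslations(tree: NestedTranslations): FlatTranslations {
  const out: FlatTranslations = {};
  collectStrings(tree, "", out);
  return out;
}

const keyBasedSource: NestedTranslations = {
  ERRORS: {
    USERNAME: "Your username doesn't exist.",
    EMAIL: "{email} is not a valid email address."
  },
  LOGIN: "Login"
};

console.log(Object.keys(flattenTranslations(keyBasedSource)));
// ["ERRORS.USERNAME", "ERRORS.EMAIL", "LOGIN"]
```

Note that with this separator, keys that themselves contain a dot would be ambiguous, which is one reason a real implementation might track paths as arrays instead.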

Available Services

As of this release, json-autotranslate offers five services: Google Translate, DeepL, Azure Translator Text, Amazon Translate, and Manual.

You can select a service using the -s or --service option. If you specify the --list-services flag, json-autotranslate will output a list of all available services.

Google Translate

To use this tool with Google Translate, you need to obtain valid credentials from Google. Follow these steps to get them:

  1. Select or create a Cloud Platform project
  2. Enable billing for your project (optional, I think)
  3. [Enable the Google Cloud Translation API][enable_api]
  4. Set up authentication with a service account so you can access the API from your local workstation

[enable_api]: https://console.cloud.google.com/flows/enableapi?apiid=translate.googleapis.com

You can specify the location of your downloaded JSON key file using the -c or --config option.

DeepL

To use this tool with DeepL, you need to obtain an API key from their website. If you don't have a Developer account yet, you can create one there.

DeepL Pro charges a fixed monthly price plus a variable fee for every 500 translated characters.

DeepL Free is limited to 500,000 characters translated per month.

After you have completed your sign-up, you can pass the API key to json-autotranslate using the -c or --config option.

The value of the --config argument is a comma-separated string with the following parts: appKey,formality,batchSize.
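A config string of this shape could be split as in the sketch below. `parseDeepLConfig` and `DeepLConfig` are hypothetical names, the treatment of empty parts as "not set" is an assumption, and the formality value is passed through without validation because the accepted values are not part of this sketch.

```typescript
interface DeepLConfig {
  appKey: string;
  formality: string | undefined; // undefined when the part is omitted or empty
  batchSize: number;
}

// Splits "appKey,formality,batchSize" into its parts.
// Assumption: missing batchSize falls back to the documented default of 1000.
function parseDeepLConfig(config: string): DeepLConfig {
  const parts = config.split(",");
  const appKey = (parts[0] || "").trim();
  if (appKey === "") {
    throw new Error("The DeepL config must start with an API key.");
  }
  const formality = (parts[1] || "").trim();
  const rawBatchSize = (parts[2] || "").trim();
  const batchSize = rawBatchSize === "" ? 1000 : Number(rawBatchSize);
  if (!(batchSize > 0) || Math.floor(batchSize) !== batchSize) {
    throw new Error("batchSize must be a positive integer.");
  }
  return {
    appKey: appKey,
    formality: formality === "" ? undefined : formality,
    batchSize: batchSize
  };
}

console.log(parseDeepLConfig("my-api-key").batchSize);       // 1000
console.log(parseDeepLConfig("my-api-key,,250").batchSize);  // 250
```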

The formality argument currently only works for target languages "DE" (German), "FR" (French), "IT" (Italian), "ES" (Spanish), "NL" (Dutch), "PL" (Polish), "PT-PT", "PT-BR" (Portuguese) and "RU" (Russian). Possible options are:

To improve performance and avoid DeepL rate limiting, json-autotranslate batches multiple tokens into a single translation request. By default, the batchSize is set to 1000, meaning that 1000 tokens are translated at once. This can be controlled by adjusting the value in the --config parameter. The default was chosen because DeepL does not allow the body of a request to be larger than 128 KiB (128 · 1024 bytes). Based on experimentation, even with long tokens, this limit is not reached.
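Batching itself is simple chunking. The generic helper below is a minimal sketch of that idea, not the tool's code; `toBatches` is a hypothetical name and the sketch does not measure request body size.

```typescript
// Splits a list into consecutive chunks of at most `batchSize` items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!(batchSize >= 1) || Math.floor(batchSize) !== batchSize) {
    throw new Error("batchSize must be a positive integer.");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// 2500 strings with the default batch size result in three requests.
const tokens: string[] = [];
for (let i = 0; i < 2500; i++) {
  tokens.push("string " + i);
}
const batches = toBatches(tokens, 1000);
console.log(batches.map(function (batch) { return batch.length; })); // [1000, 1000, 500]
```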

Reference

Azure Translator Text

To use this tool with Azure's Translator Text, you need to obtain an API key from their website. Sign up for an Azure account if you don't have one already and create a new translator instance. You'll get an API key soon after that, which you can pass to json-autotranslate using the -c or --config flag.

Unless you configure a global translator instance, you will need to provide a region by adding it to the config string after the API key, separated by a comma: --config apiKey,region. As of this version, the following regions are available:

australiaeast, brazilsouth, canadacentral, centralindia, centralus, centraluseuap, eastasia, eastus, eastus2, francecentral, japaneast, japanwest, koreacentral, northcentralus, northeurope, southcentralus, southeastasia, uksouth, westcentralus, westeurope, westus, westus2, and southafricanorth

Reference

As of now, the first 2M characters of translation per month are free. After that, you'll have to pay $10 per 1M characters that you translate. See their pricing page for details.

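Amazon Translate

To use this tool with Amazon Translate, your AWS credentials need permission to call the actions listed in the following IAM policy: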

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "comprehend:DetectDominantLanguage",
                "translate:TranslateText"
            ],
            "Resource": "*"
        }
    ]
}

You can provide a path to the JSON configuration file via the --config flag. You may define any properties from TranslateClientConfig and they will be passed as the first argument to the Translate constructor. At a minimum, this must include the AWS region.
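The minimum requirement on that file can be checked with a few lines of TypeScript. This is only a sketch of the idea: `parseAwsConfig` and `AwsTranslateConfig` are hypothetical names, the function takes the file contents as a string instead of reading from disk, and the region used in the example is just a sample value.

```typescript
interface AwsTranslateConfig {
  region: string;
  [option: string]: unknown; // any further client options are passed through
}

// Parses the contents of the JSON config file and verifies that a region is set.
function parseAwsConfig(json: string): AwsTranslateConfig {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("The config file must contain a JSON object.");
  }
  const config = parsed as { [option: string]: unknown };
  const region = config["region"];
  if (typeof region !== "string" || region === "") {
    throw new Error("The config file must include the AWS region.");
  }
  const result: AwsTranslateConfig = { region: region };
  for (const key of Object.keys(config)) {
    result[key] = config[key];
  }
  return result;
}

// Smallest config file that satisfies the requirement above.
console.log(parseAwsConfig('{ "region": "eu-west-1" }').region); // "eu-west-1"
```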

Amazon Translate offers a free tier, but is paid after that. See their pricing page for details.

Manual

This service doesn't require any configuration. You will be prompted to translate the source strings manually in the console.

Available Matchers

Matchers are used to replace interpolations with placeholders before they are sent to the translation service. This ensures that interpolations don't get scrambled in the process. As of this release, json-autotranslate offers four matchers for different styles of interpolation, including ICU ({name}), i18next ({{name}}), and sprintf (%s).

You can select a matcher using the -m or --matcher option. If you specify the --list-matchers flag, json-autotranslate will output a list of all available matchers.

Available Options

Options:
  -i, --input <inputDir>                         the directory containing language directories (default: ".")
  --cache <cacheDir>                             set the cache directory (default: ".json-autotranslate-cache")
  -l, --source-language <sourceLang>             specify the source language (default: "en")
  -t, --type <key-based|natural|auto>            specify the file structure type (default: "auto")
  -a, --with-arrays                              enables support for arrays in files, but removes support for keys named 0, 1, 2, etc.
  -s, --service <service>                        selects the service to be used for translation (default: "google-translate")
  -g, --glossaries [glossariesDir]               set the glossaries folder to be used by DeepL. Keep empty for automatic determination of matching glossary
  -a, --appName <appName>                        specify the name of your app to distinguish DeepL glossaries (if sharing an API key between multiple projects) (default: "json-autotranslate")
  --context <context>                            set the context that is used by DeepL for translations
  --list-services                                outputs a list of available services
  -m, --matcher <matcher>                        selects the matcher to be used for interpolations (default: "icu")
  --list-matchers                                outputs a list of available matchers
  -c, --config <value>                           supply a config parameter (e.g. path to key file) to the translation service
  -f, --fix-inconsistencies                      automatically fixes inconsistent key-value pairs by setting the value to the key
  -d, --delete-unused-strings                    deletes strings in translation files that don't exist in the template
  --directory-structure <default|ngx-translate>  the locale directory structure
  --decode-escapes                               decodes escaped HTML entities like &#39; into normal UTF-8 characters
  -h, --help                                     display help for command
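To illustrate what the --decode-escapes option listed above refers to, the sketch below decodes decimal numeric entities such as &#39;. It is an assumption-laden simplification rather than the tool's implementation: `decodeNumericEntities` is a hypothetical helper, named entities such as &amp; are not handled, and code points outside the Basic Multilingual Plane are not supported.

```typescript
// Replaces decimal numeric HTML entities (e.g. &#39;) with the character they encode.
function decodeNumericEntities(text: string): string {
  return text.replace(/&#(\d+);/g, function (_match: string, code: string) {
    return String.fromCharCode(Number(code));
  });
}

console.log(decodeNumericEntities("Your username doesn&#39;t exist."));
// "Your username doesn't exist."
```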

Contributing

If you'd like to contribute to this project, please feel free to open a pull request.