tj / commander.js

node.js command-line interfaces made easy
MIT License
26.69k stars 1.69k forks source link

Add localization method to Command #1801

Open gylove1994 opened 2 years ago

gylove1994 commented 2 years ago

Even though we can use .configureOutput() to configure the output to change the help header and error, but it is not convenient for user to change it entirely ( include error message and help header and etc. ) for localization.

I want to add a method to Command like program.localization('zh_CN') to change the output template words and error message could be Simplified Chinese entirely.

If it is passable to add , I will try to make a PR then.

Related Issues: #128 #774 #1608

Edit from @shadowspawn to readers: please đź‘Ť this comment if you would like localisation support or support for customising strings added to Commander. You don't need to read all the comments!

shadowspawn commented 2 years ago

PR related to #128 is #131

More related issues: #486 #1498

I think there were multiple requests to set version and help description for localisation, but could only find one comment:

FYI: @haochuan9421

shadowspawn commented 2 years ago

I am interested in localisation, but I think this is a hard problem that will require a lot of code changes and make the code more difficult to write and maintain. So only a maybe, and worth discussing before investing a lot of time in a PR. Somewhat to my surprise, #128 did not get many likes or comments despite being open for six years. (If any readers are interested in localised strings for Commander, please đź‘Ť the opening comment here so we know there is interest!)

I do not have experience with localisation in JavaScript.

1) I have looked at it casually in the past, and wasn't sure how to handle building error messages. The code uses the convenient template literals a lot, like:

    const message = `error: too many arguments${forSubcommand}. Expected ${expected} argument${s} but got ${receivedArgs.length}.`;

Have you got an idea of how to handle these? (The parameters, not the plural. That is a separate issue on its own.)

In other languages I have used formatting support that allows numbered parameters so translations may change the order of the substituted parameters, like say "too many arguments %1. Expected %2". But I have not found anything native in Node.js that does this.

2) If we add localisation, then we want people to be able to share them!

We did add an organisation recently, so now have somewhere that we could host localisation projects: https://github.com/commander-js

[Edit: I currently think we might ship localisations as part of Commander rather than separate packages.]

gylove1994 commented 2 years ago

Sorry for my slow reply.

It does need to change a lot of code to achieve it, so I will be waiting to handle the next step after the discussion result come out.

For the issue which how to handle error messages, I found some discussions here: javascript-equivalent-to-printf-string-format

shadowspawn commented 2 years ago

The keys for the localised strings could be the original text, or an identifier.

{ 
   "Hello, world!": "¡Hola Mundo!" // english original as key
}
{
   helloWorld: "¡Hola Mundo!" // identifier as key
}

I slightly prefer the english original as the key, so can see the text and parameters all at once when used. Although for the errors, we do already have the code like commander.missingArgument.

Reference: https://www.transifex.com/blog/2015/naming-string-identifiers-best-practice/

shadowspawn commented 2 years ago

Numbers and plurals are tricky in general, but I don't think there are many in Commander, perhaps just one:

    const message = `error: too many arguments${forSubcommand}. Expected ${expected} argument${s} but got ${receivedArgs.length}.`;

For interest, the Unicode CLDR has tags for: zero, one,two, few, many, and other.

shadowspawn commented 2 years ago

For interest, Yargs uses original string as key, and uses plural tags for (just) one and other:

https://github.com/yargs/yargs/blob/main/locales/en.json

shadowspawn commented 2 years ago

Trying out numbered and named parameters:

const templated = `error: missing required argument '${name}'`;
const numbered = localise("error: missing required argument '{0}'", name);
const named = localise("error: missing required argument '{name}'", { name });

// or with code as key, and numbered
const coded = localise('commander.missingArgument', name);

[Edit, added] Tagged Template per https://github.com/tj/commander.js/issues/1801#issuecomment-1250608660

const taggedTemplate = localise`error: missing required argument '${name}'`;
shadowspawn commented 2 years ago

There are lots of interpolation conventions!

This guide includes three different examples:

We have simple message strings and don't need extra characters to avoid false matches, so {username} should be fine.

shadowspawn commented 2 years ago

This is not quite all the strings, but most of them.

Help

suggestSimilar

version and help

commander.invalidArgument for argument

commander.invalidArgument for Option

commander.missingArgument

commander.optionMissingArgument

commander.missingMandatoryOptionValue

commander.conflictingOption

commander.unknownOption

commander.excessArguments

commander.unknownCommand


Is it worth localising the author errors too? I guess so.

throw new Error

gylove1994 commented 2 years ago

I think I found an appropriate way to deal with interpolation using Tagged templates.

The advantage of using tagged templates:

It works by adding a tag to the original template literals:

    const message = localization`error: unknown command '${unknownName}'${suggestion}`;

And all done.

Here have a demo in typescript: https://stackblitz.com/edit/typescript-qmax2v?devToolsHeight=33&file=index.ts

shadowspawn commented 2 years ago

Ah, I have seen Tagged Templates in past, but didn't understand how it would work here. The implementation is a bit subtle and the calling convention is a little unfamiliar at first, but does keep the code very similar using template to construct the string. Interesting.

I think the localisation keys will have to use numbers, I think the names get lost in the process. e.g. error: missing required argument {1}

An overview and example: https://javascript.plainenglish.io/template-literals-and-a-practical-use-of-tagged-templates-58526d525d72

shadowspawn commented 2 years ago

At least initially, would the localisation be an opt-in behaviour, or automatic?

I can see a few advantages to opt-in:

shadowspawn commented 2 years ago

A reminder to myself. Localisation for the error messages will mean there is a better solution than overriding the private routines like optionMissingArgument to change the text, so a good time to eliminate them or at least rename to start with a leading underscore so clearer they are private.

gylove1994 commented 2 years ago

Sorry for my slow reply.

I am not quite sure I understood your thinking.

Do you want localisation can follow the user's PC locale preference automatically?

I think it's a good idea, I will try to find a way to achieve it.

By the way, I change the keys to use numbers like yours e.g. error: missing required argument {1} and I hosted a live demo to show how it works here: demo.

shadowspawn commented 2 years ago

Do you want localisation can follow the user's PC locale preference automatically?

I was suggesting we do not do automatic. Require author to select localisation.

But I am interested in how automatic might work! For reference, Sindre Sorhus writes lots of small utility packages and has one for os-locale. My quick look was that without spawning a command, it just uses environment variables, which I suspect does not work in native Windows shell.

Two problems I can see with automatic locale which do not occur if author sets locale to match their code:

shadowspawn commented 2 years ago

I hosted a live demo to show how it works here

Neat!

Note that you can construct the lookup key from the string array passed to localization without referring to the localisations. Then look up the key in the map. If the key is not found, you can use the key itself as the template. (We might still want to lookup locale first, then English, then fallback to the key. To pickup minor fixes to the English.)

(Does not matter if my explanation is too vague to understand! The code details don't matter yet.)

shadowspawn commented 2 years ago

Is there value in letting people supply a partial or complete localisation map themselves, rather than only pick from supplied locales? This would let people change (say) just the help titles. Perhaps to add colour.

shadowspawn commented 2 years ago

I am excited. 🎉

I am still worried about the code changes and maintenance of the translations, but if there is interest from users then I think it could be worth it.

shadowspawn commented 2 years ago

Sorry for my slow reply.

Two days? No problem! Please do not feel under pressure to reply quickly.

gylove1994 commented 2 years ago

Is there value in letting people supply a partial or complete localisation map themselves, rather than only pick from supplied locales? This would let people change (say) just the help titles. Perhaps to add colour.

Yes, that is the same thing I thought!

With the translationMap( But I think it is time to change its name now because we extended its usage ) and the tagged templates, we can give the authors an easy way to fully control template strings and the error messages directly compare to using configureOutput() to modify it.

gylove1994 commented 2 years ago

Neat!

I will do it right away, and I will try to add some new thoughts to it!

Thanks for your patience to read it and gave me suggestions!

gylove1994 commented 2 years ago

The new demo is ready: https://stackblitz.com/edit/typescript-qmax2v?devToolsHeight=33&file=types.ts

Mainly neat and added a customTransform() method and change some names.

gylove1994 commented 2 years ago

I found angular also use Tagged templates to do localize: https://angular.io/api/localize

shadowspawn commented 2 years ago

Another package with an API for locale support, both locales and custom strings: https://day.js.org/docs/en/customization/customization

shadowspawn commented 2 years ago

I am thinking about what to try first. Being able to customise the strings would let people do things themselves without waiting for a full locale/localisation to be available. And does not require us deciding how to store the localisations and when they are loaded, and fallback et al.

1) I am thinking perhaps a routine called .configureStrings(), following the naming of .configureOutput() and .configureHelp(). This would take an object with replacement templates/strings (the TemplateStrings type in the demo.)

2) Probably start with some error messages, because they are fairly tidy and self-contained. (Or the Help strings would be fun seeing lots at once, but wouldn't use tagged template as much.)

I am not sure if the support should be built straight into Command or into another class. I guess start with adding to Command and see whether it takes more than a single property and routine.

shadowspawn commented 2 years ago

Using several of the references we found above, I came up with this template function:

  transform(strings, ...args) {
    // Make a key by putting in placeholders for the args, like {0}.
    const key = strings.reduce((accumulator, str, index) => {
      return accumulator + `{${index - 1}}` + str;
    });

    // Look up possible replacement.
    const template = this.replacements[key] ?? key;

    // Fill in the placeholders with the actual args.
    return template.replace(/{(\d+)}/g, (match, digits) => {
      return `${args[digits]}`; // paranoia: convert arg to string
    });
  }
shadowspawn commented 2 years ago

I opened a PR with some experimental work: #1807

I have added calls to the tagged template function for most of the error messages. I have not done the one that needs rethinking due to pasting together pieces of text including plurals.

Added en.js but currently just to collect together the key strings and notes on issues. My current focus is just on supporting the strings. I haven't looked deeper into file naming or file types or loading patterns for locales. (Used .js rather than .json so could add comments.)

shadowspawn commented 2 years ago

In a few places we display lists. I came across two in the Help, for the choices, and the list of "extra" info for an option. I've added a "list separator" entry in the strings, which is simple but might be too simple to be useful.

For option like:

program
  .addOption(new Option('--every [value]', 'description').default('my-default').preset('my-preset').choices(['a', 'b']).env('MY_ENV'));

Help shows:

Options:
  --every [value]  description (choices: "a", "b", default: "my-default", preset: "my-preset", env: MY_ENV)

So changing to semicolon:

program
  .configureStrings({ ', ': '; ');
...

Help shows:

Options:
  --every [value]  description (choices: "a"; "b"; default: "my-default"; preset: "my-preset"; env: MY_ENV)

Update: found Intl.ListFormat and planning to look at that instead

shadowspawn commented 2 years ago

For interest. Some of the Help routines need the tagged template localisation function. Everywhere except help.optionDescription() already had a Command available, so I just added a command as an extra parameter to optionDescription().

shadowspawn commented 2 years ago

I just found interesting early work in: https://github.com/gylove1994/commander.js/commit/cdecfe8955aedf7809a18d000d994e2550e0e8cd

I was interested to see .localization(lang) and customLocalization(localizationMap). I have been wondering whether the names should be similar like this.

At the moment in my branch I have .configureStrings() and I think I would call the locale routine .locale(lang).