spf13 / cobra

A Commander for modern Go CLI interactions
https://cobra.dev
Apache License 2.0

Multi-language support for CLIs? #1134

Open thomasgloe opened 4 years ago

thomasgloe commented 4 years ago

I would like to support multiple languages for my CLI using cobra. Implementation for commands is no problem, but is it correct that there is currently no support for the text output generated by cobra itself (e.g., "Usage", "Flags", "Use "mycmd [command] --help" for more information about a command.")?

BunnyBrewery commented 4 years ago

Are you asking whether Cobra has multi-language support for its default help message?

thomasgloe commented 4 years ago

Yes, and I've already checked the source code, where the strings are encapsulated in the UsageTemplate. The way to go seems to be changing the usage template with SetUsageTemplate.

If I have enough time, would it be of interest to include a small example in the docs?
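For illustration, here is a minimal sketch of the per-language template idea, using only the standard library. Cobra's usage template is a text/template string, so the same selection logic could feed SetUsageTemplate. The template wording, the usageData type and the helper names below are assumptions made for this example, not cobra's actual default template or API.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// usageTemplates maps a language code to a translated usage template.
// The wording and the fields (.UseLine, .Flags) are illustrative only,
// not cobra's real default template.
var usageTemplates = map[string]string{
	"en": "Usage:\n  {{.UseLine}}\n\nFlags:\n{{.Flags}}\n",
	"fr": "Utilisation :\n  {{.UseLine}}\n\nOptions :\n{{.Flags}}\n",
}

// usageData is a hypothetical stand-in for the data a usage template receives.
type usageData struct {
	UseLine string
	Flags   string
}

// usageTemplateFor returns the template for lang, falling back to English.
func usageTemplateFor(lang string) string {
	if t, ok := usageTemplates[lang]; ok {
		return t
	}
	return usageTemplates["en"]
}

// renderUsage executes the template selected for lang against data.
func renderUsage(lang string, data usageData) (string, error) {
	tmpl, err := template.New("usage").Parse(usageTemplateFor(lang))
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderUsage("fr", usageData{
		UseLine: "mycmd [command]",
		Flags:   "  -h, --help   aide",
	})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

With cobra itself, the string returned by usageTemplateFor would be passed to the command's SetUsageTemplate method instead of being rendered by hand.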

github-actions[bot] commented 4 years ago

This issue is being marked as stale due to a long period of inactivity

jharshman commented 3 years ago

@thomasgloe I'd be interested to see a PR for this if you wanted to take a shot at it.

github-actions[bot] commented 3 years ago

This issue is being marked as stale due to a long period of inactivity

hitzhangjie commented 3 years ago

SetUsageTemplate only affects the template. If we want to support multiple languages, we also need to consider the descriptions of commands and flags.

I use go-i18n to support multiple languages in my cobra cli.

Goutte commented 1 year ago

I've reviewed cobra these past days (using it for git spend), and I've come to the conclusion that cobra itself should have some form of i18n for default content.

I'm glad you're not against it for arcane reasons :) – it's just work, and this I understand.

There are some decisions that are best discussed beforehand, though.

Choosing a translation file format

I'm partial to toml in our case, since we won't really need the tree structure of yaml. The other formats are just not human-friendly enough, and even though there are really nice GUIs for translation, I prefer keeping the translation files as readable as possible.

Embedding toml translation files

This appears to be the easy way of handling i18n.

Embedding (go:embed) ALL translations may add some kilobytes to cobra. I'm okay with it, personally, but some of y'all may know ways (I don't) to distribute "lightweight" versions of cobra (with only the english file), along with the fully translated one, for people who desperately need lightweight.

Embedding also kind of slightly breaks the philosophy of package managers (.deb), since each cobra-based cli app will end up with its own translation files for the internals of cobra, whereas they could be shared, ideally. This is tricky.

Of note: go:embed requires Go 1.16 or later.

goi18n extract or not?

Usage of goi18n extract requires writing the translation fetching code very verbosely, and adding the english default right there in the code. This would make cobra a bit harder to read and much more verbose, but I see ways to mitigate that (creating a function for each translation string, and "hiding" the verbose fetch in those, keeping the rest of cobra free of the clutter)

The alternative is to do something trivial like locale.T("HelpTemplate") everywhere, which goi18n extract won't understand. ('tis what I've done in git-spend)

I prefer the solution where we'd support goi18n extract, especially because then we could more easily disable the whole embedding of translation files and still have english working as fallback.
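As a rough sketch of the "one function per translation string" option, with English kept in the code as the fallback: the catalog variable and the translate helper are hypothetical names, and the catalog is a plain map here rather than anything loaded by go-i18n.

```go
package main

import "fmt"

// catalog holds the translations for the active language.
// It stays nil when no translation files were loaded (or embedded).
var catalog map[string]string

// translate returns the translated string for id, or the English
// default when the catalog has no entry for it.
func translate(id, english string) string {
	if s, ok := catalog[id]; ok && s != "" {
		return s
	}
	return english
}

// One small function per string keeps the lookup details out of the
// rest of the code base.
func i18nUsage() string { return translate("Usage", "Usage") }
func i18nFlags() string { return translate("Flags", "Flags") }

func main() {
	// No catalog loaded: English defaults are used.
	fmt.Println(i18nUsage(), i18nFlags())

	// Partial French catalog: missing entries still fall back to English.
	catalog = map[string]string{"Usage": "Utilisation"}
	fmt.Println(i18nUsage(), i18nFlags())
}
```

Because the English text lives next to the ID, the program keeps working when no translation file is shipped at all.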


Currently trying to devise a PoC for this so we have a more concrete example to decide upon.

Goutte commented 1 year ago

Here's a draft of what it would look like:

localizer.go

package cobra

import (
    "embed"
    "fmt"
    "github.com/BurntSushi/toml"
    "github.com/nicksnyder/go-i18n/v2/i18n"
    "golang.org/x/text/language"
)

var defaultLanguage = language.English

// localeFS points to an embedded filesystem of TOML translation files
//
//go:embed translations/*.toml
var localeFS embed.FS

// Localizer can be used to fetch localized messages
var localizer *i18n.Localizer

func i18nError() string {
    return localizeMessage(&i18n.Message{
        ID:          "Error",
        Description: "prefix of error messages",
        Other:       "Error",
    })
}

func i18nExclusiveFlagsValidationError() string {
    return localizeMessage(&i18n.Message{
        ID:          "ExclusiveFlagsValidationError",
        Description: "error shown when multiple exclusive flags are provided (group flags, offending flags)",
        Other:       "if any flags in the group [%v] are set none of the others can be; %v were all set",
    })
}

// … lots more translations here

func localizeMessage(message *i18n.Message) string {
    localizedValue, err := localizer.Localize(&i18n.LocalizeConfig{
        DefaultMessage: message,
    })
    if err != nil {
        return message.Other
    }

    return localizedValue
}

func loadTranslationFiles(bundle *i18n.Bundle, langs []string) {
    for _, lang := range langs {
        _, _ = bundle.LoadMessageFileFS(localeFS, fmt.Sprintf("translations/main.%s.toml", lang))
    }
}

func init() {
    bundle := i18n.NewBundle(defaultLanguage)
    bundle.RegisterUnmarshalFunc("toml", toml.Unmarshal)

    // FIXME: detect lang(s) from env (LANGUAGE > LC_ALL > LANG)
    detectedLangs := []string{
        "fr",
        "en",
    }

    loadTranslationFiles(bundle, detectedLangs)
    localizer = i18n.NewLocalizer(bundle, detectedLangs...)
}

It uses init(), as I'm not yet intimate enough with cobra to know where to properly hook initialization.
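The FIXME in the draft could be filled in along these lines. This is only a sketch under assumptions: it follows the LANGUAGE > LC_ALL > LANG order named in the comment, treats LANGUAGE as a colon-separated list, and the detectLangs name and its getenv parameter are made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// detectLangs returns candidate language codes in priority order, reading
// LANGUAGE, then LC_ALL, then LANG through getenv. "en" is always appended
// last as the fallback.
func detectLangs(getenv func(string) string) []string {
	var langs []string
	seen := map[string]bool{}
	add := func(code string) {
		if code != "" && !seen[code] {
			seen[code] = true
			langs = append(langs, code)
		}
	}
	for _, name := range []string{"LANGUAGE", "LC_ALL", "LANG"} {
		// LANGUAGE may hold a colon-separated list; the others hold one locale.
		for _, locale := range strings.Split(getenv(name), ":") {
			// Drop the codeset (".UTF-8") and modifier ("@euro") suffixes.
			if i := strings.IndexAny(locale, ".@"); i >= 0 {
				locale = locale[:i]
			}
			if locale == "" || locale == "C" || locale == "POSIX" {
				continue
			}
			add(strings.ReplaceAll(locale, "_", "-"))  // fr_FR -> fr-FR
			add(strings.SplitN(locale, "_", 2)[0])     // fr_FR -> fr
		}
	}
	add("en")
	return langs
}

func main() {
	fmt.Println(detectLangs(os.Getenv))
}
```

Taking getenv as a parameter keeps the function easy to test; in the draft, the result would replace the hard-coded detectedLangs slice.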

Goutte commented 1 year ago

Draft continues in the feat-i18n branch.

I'm not fond of how I added i18n in the command Usage template, but my goal is to keep backwards compatibility.

Used composition, at the cost of a runtime copy of a Command instance, but we keep the same API in the template and don't have to expose an additional property in Command.

Goutte commented 1 year ago

Well… works for me ! I've registered a MR draft.

There's a bunch of things I'm not comfortable with, let's discuss those in #1944

phw commented 9 months ago

I prefer the solution where we'd support goi18n extract, especially because then we could more easily disable the whole embedding of translation files and still have english working as fallback.

What about using gotext instead? It does not require code as verbose as go-i18n's, and you can use a simple wrapper function like the one you have shown while gotext still manages to extract the texts. The translation files are JSON, not TOML, though, but I think they are still rather easy to handle.

Goutte commented 9 months ago

Thanks for the suggestion, @phw . I remember, at the time of choosing, I saw JSON, facepalmed, sighed, and went on my way.

Here's what the goi18n lib says it provides :

  1. Supports pluralized strings for all 200+ languages in the Unicode Common Locale Data Repository (CLDR).

    We don't use this, I believe.

  2. Supports strings with named variables using text/template syntax.

    This is very handy when injected words ought to be in different order in some translations. But we can perhaps do without, for simplicity's sake.

  3. Supports message files of any format (e.g. JSON, TOML, YAML).

    I profoundly dislike having to edit JSON by hand. TOML feels nice, but it's not even the best option. gettext files (PO, MO) would be my preferred choice.
    There's a (quite new) go-i18n lib that promises to do just this, but it does not look like it is finished yet.
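To make point 2 concrete, here is a small standard-library sketch of why named variables help: unlike positional %v verbs, a translation can place them in any order. The render helper and both message strings are invented for the example, and go-i18n itself is not used.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render fills a message written in text/template syntax with named values.
func render(msg string, vars map[string]interface{}) (string, error) {
	tmpl, err := template.New("msg").Parse(msg)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	vars := map[string]interface{}{"Group": "a b", "Set": "a b"}
	// Hypothetical messages: the second one uses the variables in the
	// opposite order, as a translation might need to.
	messages := []string{
		"flags in group [{{.Group}}] are exclusive; {{.Set}} were all set",
		"{{.Set}} were all set, but flags in group [{{.Group}}] are exclusive",
	}
	for _, msg := range messages {
		out, err := render(msg, vars)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```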


All in all, I don't mind ditching the goi18n lib, but :

phw commented 9 months ago

Just to avoid confusion further down (@Goutte understood me correctly): I was referring to golang.org/x/text/message with the golang.org/x/text/cmd/gotext CLI utility to extract text

golang.org/x/text does support both pluralization and changing variable order. Actually it has one of the nicest implementations for this where the developer basically does not need to think about it. If you have a translatable string like this:

printer.Sprintf("%s copied %d files to %s", user, count, dest)

There will be a translation string like "{User} copied {Count} files to {Dest}". The translator can reorder the placeholders however they see fit and it will be used correctly.

It is limited to the JSON format though, and it also has the somewhat convoluted concept of separate out.gotext.json and messages.gotext.json files. But a tool like Weblate can actually deal with both of those issues.

If you want to go the gettext route have a look at https://github.com/leonelquinteros/gotext . This is also a gettext implementation. Pure go, so no actual dependency on gettext libraries. It also provides an extraction tool github.com/leonelquinteros/gotext/cli/xgotext .

I haven't used it yet, but it looks nice. What originally discouraged me from using it was that it does not directly provide the option to load the translation files from go:embed. But according to the discussion at https://github.com/leonelquinteros/gotext/issues/52 the library's API is flexible enough to allow this.

With gettext you definitely get the best tooling for translators.

What I really dislike about github.com/nicksnyder/go-i18n is the verbosity it requires for each translatable string without the ability to add an abstraction over this that fits your application (at least not without breaking string extraction, which I consider mandatory to have).

phw commented 9 months ago

3. There's a (quite new) go-i18n lib that promises to do just this, but it does not look like it is finished yet.

Just saw that this is actually using github.com/leonelquinteros/gotext, but adds the ability to embed the translation files on top.

Goutte commented 9 months ago

Thanks @phw for the clarifications !

I think you're right, it's worth implementing this your way.

I'll start another branch with the ubuntu lib, unless you want to hack around and kickstart things.


One thing I really don't understand about x/text is that it requires x/tools and in turn the net, crypto and goldmark packages. Insofar as I understand, they are used for the CLI (dev) utilities ; it feels wrong to add those to cobra just for i18n.

If I understand correctly, those are essentially removed at compile-time since nothing will link to them, but still... Does not feel right.

Goutte commented 9 months ago

Ooops.

The ubuntu lib requires Go 1.20 ; embedding requires 1.16 I believe. Cobra is 1.15 right now.

I'll try to shoot straight for https://github.com/leonelquinteros/gotext and some glue for embedded PO files.

Goutte commented 9 months ago

A few notes after hacking around with gotext :

  1. There's no way to describe a translation string to help translators, which is something nice, but not mandatory given the low amount of translations that we have. Furthermore, there are contexts in gotext that aim to solve this. Even though I find descriptors more elegant and humane, we can live with this.
  2. The xgotext CLI only detects gotext.Get(…) and not for example GetLocale.Get(…) which means we cannot lazy-load the locale, we have to initialize it before it is used anywhere, so probably in init(). I'm told usage of init() is frowned upon in Golang.

Goutte commented 9 months ago

Made a draft in #2090 @phw :rocket: