completions: Support adding additional code to complete values

epage commented 2 years ago

Issue by joshtriplett Monday Jul 04, 2016 at 12:48 GMT Originally opened as https://github.com/clap-rs/clap/issues/568

For some argument values, the bash-completions may want to include additional logic for what type of value to complete, rather than allowing arbitrary strings. For instance, an option might accept a path to a file that exists; bash has a mechanism for that. Or, an option might accept the name of a git ref that exists; that's something more easily implemented in the program in Rust. Either way, it makes sense to augment clap's completion logic.

This also argues for implementing the completions by calling the program at runtime, rather than via shell; that way, arbitrary logic in Rust can determine what to complete, rather than providing a template system to feed in shell code (ick).

epage commented 2 years ago

Comment by kbknapp Monday Jul 04, 2016 at 21:35 GMT

Perhaps adding something like what was discussed in #376, where there is a "completer" function? I'm all for this, but figured it's also addable in a backwards-compatible way once I had the base implementation complete.

I'm just not sure which would be the best way to add this so I'm open to all ideas.

epage commented 2 years ago

Comment by joshtriplett Monday Jul 04, 2016 at 22:38 GMT

I'm honestly not sure either. I'm really hesitant to suggest inlining shell script snippets into Rust code as strings; I'd rather see those written in a separate shell file and included from there (not least to get the right filetype and highlighting). bash (via compgen) and bash-completion (via functions in /usr/share/bash-completion/bash_completion) have some built-in helpers, and it'd be nice to support those for common cases like hostnames, users, and files (with glob patterns). Someone might also want to write arbitrary shell code to enumerate argument values. It'd also be nice to support using arbitrary Rust code by invoking the program.

I think what I'd suggest is that the .completer function should take an enum argument, where that enum has values like User, File, FileGlob("*.ext"), BashFunc("__comp_function_name"), and RustFunc(...). Those would then translate into appropriate calls to compgen, calls to the specified function, or invocations of the program to run Rust code. (That last one would also require something like a global_setting to enable a --clap-complete option or similar.)

This is turning out to be a remarkably hairy yak.

epage commented 2 years ago

Comment by mathstuf Sunday Oct 02, 2016 at 12:15 GMT

In zsh at least, clap could generate completion function calls such as:

(( $+functions[_appname_clap_complete_ARG] )) || _appname_clap_complete_ARG () {
}

Which can then be overridden in a supplemental file included before this one (via source if the file exists). Bash probably has some mechanism that works similarly.

epage commented 2 years ago

Comment by emk Monday Oct 10, 2016 at 15:31 GMT

I've just converted cage to use clap, and I'm very happy with the results. Basic completion works under both bash and fish. Great code!

But cage would benefit enormously from being able to dynamically complete the names of docker services for commands like:

cage test $SERVICE_NAME

If the project in the current directory has the services foo and frontend/bar, I would like to be able to do the following:

> cage test f<TAB>
foo
frontend/bar

I would be happy to add an extra argument to the app, something like:

> cage --_complete-service f
foo
frontend/bar

And declare this as:

- SERVICE:
    value_name: "SERVICE"
    required: true
    complete_with: "_complete-service"
    help: "The name of the service in which to run a command"

Obviously the details could vary a bit, but we would ultimately have --_complete-pod, --_complete-service, --_complete-pod-or-service and --_complete-repo-alias, among others. Also note that many different subcommands would share each completion hook, which might mean we want these to be potentially global.
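A minimal sketch of what the program side of such a hook could look like, assuming the shell passes the typed prefix and reads one candidate per line. The function names and the hard-coded service list are hypothetical; a real implementation would discover the services from the project in the current directory.

```rust
/// Keep the candidates that start with `prefix`, preserving their order.
/// `services` stands in for whatever the program discovers at runtime.
fn complete_service<'a>(services: &[&'a str], prefix: &str) -> Vec<&'a str> {
    services
        .iter()
        .copied()
        .filter(|name| name.starts_with(prefix))
        .collect()
}

/// Render candidates one per line, matching the sample output of
/// `cage --_complete-service f` shown above.
fn render(candidates: &[&str]) -> String {
    candidates.join("\n")
}
```

A hidden `--_complete-service` flag would then only need to call `complete_service` with the prefix and print the rendered string.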

epage commented 2 years ago

Comment by kbknapp Monday Oct 10, 2016 at 19:15 GMT

@emk Your post has me thinking about this more and more, I'm thinking some sort of hybrid between what @joshtriplett listed above and what you're proposing.

My schedule is pretty busy this week, but I should be able to at least test some ideas and see the feasibility. Stay tuned to this thread for updates!

epage commented 2 years ago

Comment by emk Monday Oct 10, 2016 at 23:43 GMT

Another approach worth a quick glance might be optcomplete for Python: http://furius.ca/optcomplete/ As far as I can tell, this uses one small, universal shell script for each supported shell, and offloads all the actual completion work to the application's own arg parsing machinery.

I think another Python library just uses a '--_complete' on the program that does all the actual work, but I can't find it right now. I'll keep Googling around when I have a moment and post anything interesting I find.

Thank you so much for a great library and for looking into this!

epage commented 2 years ago

Comment by emk Tuesday Oct 11, 2016 at 10:23 GMT

Ah, here we go. Some docs on how several Python arg parsing libraries handle --_completion.

selfcompletion is a layer on top of argparse to take the fine-grained model argparse builds of the arguments your program accepts and automatically generate an extra '--_completion' argument that generates all possible completions given a partial command line as a string.

The '--_completion' argument in turn is used by a generic bash programmable completion script that tries '--_completion' on any program that doesn't have its own completion already available, renders the output of the program's built-in completion if available, and otherwise silently falls back to the shell default.

Here is the generic completion function for bash:

_foo()
{
    prog="$1"
    while read glob_str; do
        case $prog in
        $glob_str)
            return 1;;
        esac
    done < <( echo "$SELFCOMPLETION_EXCLUSIONS" )
    which "$prog" >/dev/null || return 1
    _COMP_OUTPUTSTR="$( $prog --_completion "${COMP_WORDS[*]}" 2>/dev/null )"
    if test $? -ne 0; then
        return 1
    fi
    readarray -t COMPREPLY < <( echo -n "$_COMP_OUTPUTSTR" )
}

complete -o default -o nospace -D -F _foo

The advantage of this approach is that the per-shell code can be written only once, and all the hard work can be done directly by the application itself. Obviously, there might be disadvantages as well. But I figured it was worth tracking down all the existing attempts to standardize this to see if any of them had helpful ideas. :-)
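On the application side, a handler for this style of flag receives the whole partial command line as one string and first has to work out which word is being completed. A rough sketch in Rust follows, under the assumption that words are separated by plain whitespace; quoting and escaping are ignored, and `split_partial` is a hypothetical helper name.

```rust
/// Split a partial command line into (preceding words, word being completed).
fn split_partial(line: &str) -> (Vec<&str>, &str) {
    let mut words: Vec<&str> = line.split_whitespace().collect();
    // A trailing space means the user is starting a new, still empty word.
    if line.is_empty() || line.ends_with(char::is_whitespace) {
        return (words, "");
    }
    // Otherwise the last word is the one under the cursor.
    let current = words.pop().unwrap_or("");
    (words, current)
}
```

The preceding words can then be fed to the normal argument parser, and the current word used as the prefix to filter candidates.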

epage commented 2 years ago

Comment by joshtriplett Monday Oct 24, 2016 at 21:40 GMT

@kbknapp Any updates on this mechanism? I have someone asking after completions, and I'd love to beta-test this.

epage commented 2 years ago

Comment by jcreekmore Tuesday Oct 25, 2016 at 17:35 GMT

@kbknapp I would be interested in this as well. I am currently post-processing my completions to substitute in _filedir for filename completion, but that is less than ideal.

epage commented 2 years ago

Comment by kbknapp Tuesday Oct 25, 2016 at 17:53 GMT

@joshtriplett @jcreekmore

Now that the ZSH implementation is complete I've got a better handle on this. The biggest issue I see holding this up is that completions are done differently between all three (so far) supported shells.

I'm all for some sort of enum with variants that allow things like, Files, Directories, Globs(&str), Code(&str) or something to that effect. But some shells support those things verbatim, others only in arbitrary ways that clap doesn't use when gen'ing the completion code.

I'm just unsure of the best way to expose this. Perhaps on an Arg::complete_with(enum)?

I guess, first what is the shell you're trying to support, and what particular portions are you wanting to inject into the completion code?

epage commented 2 years ago

Comment by emk Tuesday Oct 25, 2016 at 18:00 GMT

@kbknapp For cage, we'd like to be able to complete custom "types" of values, such as Docker container names, "pod" names, target environments, and so on. The legal values can only be determined by asking our executable at runtime, since they vary from project to project.

This is pretty much how git-completion handles origin names, branch names, etc.

The problem with an enum is that it would limit us to just a few built-in types such as Files, Directories, etc., right?

epage commented 2 years ago

Comment by joshtriplett Tuesday Oct 25, 2016 at 19:23 GMT

@kbknapp I don't think you need to support embedding arbitrary shell code from Rust. My suggestion would be to support the lowest-common-denominator bits (filenames, usernames, filenames matching one of these patterns, etc), and then have a "call this Rust function" variant that invokes the program itself with some internal hidden --clap-complete option that dispatches to that Rust function. That makes it easy to do things like "a git ref matching this pattern", by calling a Rust function implementing that.

For those common categories like filenames or usernames, use the shell built-in mechanisms if available, or provide and use code implementing those simple completions if the shell doesn't have them.

If people want "invoke this shell code", I'd suggest adding a Rust variant to call a named shell function, and then letting people add that shell function to the resulting generated completions for any shell they support. That seems preferable to embedding many variants of shell code directly.

enum Completion<F: Fn(...) -> ...> {
    File,
    User,
    Hostname,
    Strings(&[str]),
    Globs(&[str]),
    ShellFunction(&str), // maybe
    RustFunction(F),
}
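For reference, here is one way the enum above could be written so that it compiles, together with a hedged sketch of how a generator might map the simple variants onto bash's `compgen`. The slices are written as `&[&str]`, the Rust callback is boxed so the enum needs no type parameter, and `$cur` is assumed to be a shell variable holding the word being completed. None of this is clap API.

```rust
/// Compiling variant of the proposed enum.
enum Completion<'a> {
    File,
    User,
    Hostname,
    Strings(&'a [&'a str]),
    Globs(&'a [&'a str]),
    ShellFunction(&'a str),
    RustFunction(Box<dyn Fn(&str) -> Vec<String> + 'a>),
}

/// Map a variant to a bash `compgen` call. `None` means this sketch has no
/// built-in mapping and the generator would need another strategy, such as
/// invoking the program itself.
fn bash_compgen(c: &Completion<'_>) -> Option<String> {
    match c {
        Completion::File => Some("compgen -f -- \"$cur\"".to_string()),
        Completion::User => Some("compgen -u -- \"$cur\"".to_string()),
        Completion::Hostname => Some("compgen -A hostname -- \"$cur\"".to_string()),
        // Assumes the words contain no quotes or whitespace.
        Completion::Strings(words) => {
            Some(format!("compgen -W '{}' -- \"$cur\"", words.join(" ")))
        }
        Completion::Globs(_)
        | Completion::ShellFunction(_)
        | Completion::RustFunction(_) => None,
    }
}
```

Other shells would need their own mapping function, which is where the "use the built-in if available, otherwise fall back to Rust" idea comes in.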

epage commented 2 years ago

Comment by kbknapp Tuesday Oct 25, 2016 at 19:36 GMT

@emk Yes, and no. It would be extensible, so more variants could be added. But some of the variants could also take additional parameters, and ultimately (possibly) inject arbitrary shell script via something like Code(&str), which of course isn't super great, but perhaps a fallback if a particular variant doesn't quite fit the bill. (Or perhaps not...it could end up being massively unsafe :stuck_out_tongue_winking_eye: )

At the same time, I haven't looked into exactly where this code would be injected and ultimately if it's even feasible yet. This is just straight off the top of my head right now.

Also, if the "types" are known prior to runtime, ZSH already supports this just by using Arg::possible_values.

epage commented 2 years ago

Comment by joshtriplett Tuesday Oct 25, 2016 at 19:54 GMT

That's true, "one of these fixed strings" should be an option as well. Updating the type in my previous comment.

epage commented 2 years ago

Comment by kbknapp Saturday Jan 14, 2017 at 02:25 GMT

From @Xion in #816

In Python, there is a package called argcomplete which provides very flexible autocompletion for apps that use the standard argparse module. What it allows is to implement a custom completion provider: essentially a piece of your own code that's executed when the binary is invoked in a Special Way (tm) by the shell-specific completion script. For an example, see here. The code is preparing completions dynamically from the filesystem, or even from a remote API (if a certain flag isn't passed (flags are partially parsed at this point)). Having something like this in clap would be very nice. I know this is a potentially complex subsystem so it'd be unreasonable to expect it implemented anytime, but I wanted to at least put this feature on the radar.

epage commented 2 years ago

Comment by kbknapp Saturday Jan 14, 2017 at 02:48 GMT

After reading through some of the argcomplete python module the hardest part will be figuring out how to call a Rust function from the shell.

epage commented 2 years ago

Comment by kbknapp Saturday Jan 14, 2017 at 02:52 GMT

I'm guessing what'll end up happening is some sort of double run with hidden args.

epage commented 2 years ago

Comment by kbknapp Thursday Jan 19, 2017 at 01:30 GMT

Expanding on the ideas (from #818)

The problem with implementing this is I just haven't had a good time to sit down and think about how (because of work, holidays, family, etc.). I want a way to specify this that abstracts well enough to work for all shells. The easiest way is to say, "Put your arbitrary completion shell script here inside this Arg::complete_with(&str)" but that feels super hacky to me, and potentially unsafe. What I'd like to do is provide a Arg::complete_with(Fn(&str, &str)->String) (and a Arg::complete_with_os(Fn(&str,&OsStr)->OsString) where an arbitrary Rust function is called...but herein lies the problem; shell completions are run before the program executes. This has led to some people using hidden args or something like, $ prog --complete me<tab> calling a shell completer that actually runs $ prog --_complete_arg "complete" --_complete_prefix "me" which generates the possible completions and returns them to the shell. I'm not against doing that, but again feels strange because you're injecting hidden args into a CLI. Although typing this out right now does make me lean towards this solution.
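To make the "double run with hidden args" idea concrete, here is a small sketch of how the second run could recognise a completion request. It assumes the two flag names from the example above, each followed by its value as a separate word; `CompletionRequest` and `completion_request` are hypothetical names, not clap API.

```rust
/// A parsed completion request: which argument is being completed and the
/// text typed so far.
#[derive(Debug, PartialEq)]
struct CompletionRequest {
    arg: String,
    prefix: String,
}

/// Look for `--_complete_arg <name>` and `--_complete_prefix <text>` in `argv`.
/// Returns `None` for a normal (non-completion) invocation.
fn completion_request<I>(argv: I) -> Option<CompletionRequest>
where
    I: IntoIterator<Item = String>,
{
    let mut arg = None;
    let mut prefix = String::new();
    let mut it = argv.into_iter();
    while let Some(word) = it.next() {
        match word.as_str() {
            "--_complete_arg" => arg = it.next(),
            // A missing value is treated as an empty prefix.
            "--_complete_prefix" => prefix = it.next().unwrap_or_default(),
            _ => {}
        }
    }
    arg.map(|arg| CompletionRequest { arg, prefix })
}
```

A program would call this with `std::env::args()` before normal parsing, print the candidates and exit when it returns `Some`.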

epage commented 2 years ago

Comment by joshtriplett Thursday Jan 19, 2017 at 06:21 GMT

@kbknapp I like the idea of using parameters like --_clap-complete and similar, and then calling the program itself to do the completion via Rust code. That would make it possible to (for instance) use git2-rs from Rust to complete names of things in a git repository.

Naming the argument/parameter seems sufficient to dispatch to the right function, though I'd also like to have the other arguments available to handle things like prog --repository /path/to/.git --branch someth[tab] (which needs the repository to complete branches from).

Also, those complete_with functions should return either a Vec<OsString> or in general an implementation of Iterator<OsString>, to return all completions. (The latter would benefit from but not require -> impl Trait support, since complete_with can declare the return value as a generic while the actual function/closure might use a specific iterator type.)
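A sketch of the generic-return idea, assuming a free function standing in for the proposed `Arg::complete_with` (hypothetical, not clap API). The return type of the callback is a type parameter, so one caller can return a `Vec<OsString>` while another returns a specific iterator type, without `impl Trait`.

```rust
use std::ffi::OsString;

/// Stand-in for the proposed `complete_with`: generic over the callback `F`
/// and over whatever iterable type `I` that callback returns.
fn complete_with<F, I>(completer: F, prefix: &str) -> Vec<OsString>
where
    F: Fn(&str) -> I,
    I: IntoIterator<Item = OsString>,
{
    // Collect eagerly here; a real implementation could also stream the
    // items straight to stdout.
    completer(prefix).into_iter().collect()
}
```

One limitation of this exact signature is that the returned iterator cannot borrow from the prefix, so callbacks need to return owned data.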

And what does the first &str parameter of those functions refer to?

epage commented 2 years ago

Comment by mathstuf Thursday Jan 19, 2017 at 14:29 GMT

I'd just like to note that for really expensive queries (imagine tab-completion for cargo install <Tab>), at least zsh supports a cache for completions; it would be nice to have a way to leverage that cache via some "can be cached" flag (cache expiry is controlled via zstyle).

epage commented 2 years ago

Comment by joshtriplett Thursday Jan 19, 2017 at 20:05 GMT

I'd just like to note that for really expensive queries (imagine tab-completion for cargo install <Tab>)

I'd expect tab-completion for cargo install <Tab> to list all the currently-available crates, without doing a network update first. That would use entirely local information, and generate completions quickly.

Rather than attempting to integrate with any particular shell's completion caching, perhaps if an app considers its completions expensive to generate, it could cache the necessary information to make them fast? Caching policies seem much easier to implement in Rust, since they could use arbitrary freshness metrics ("the underlying data hasn't changed so return the cached list").
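As an illustration of an application-side freshness check, here is a small in-memory sketch. It assumes the app can cheaply compute some "stamp" for the underlying data (a file modification time, a HEAD id, and so on). Since every Tab press starts a fresh process, a real completer would have to persist the entry on disk; that part is left out, and `CompletionCache` is a hypothetical type.

```rust
/// Cache a completion list together with a freshness stamp.
struct CompletionCache<S> {
    entry: Option<(S, Vec<String>)>,
}

impl<S: PartialEq> CompletionCache<S> {
    fn new() -> Self {
        CompletionCache { entry: None }
    }

    /// Return the cached list if `stamp` matches the stored one; otherwise
    /// call `compute` and remember its result under the new stamp.
    fn get_or_compute<F>(&mut self, stamp: S, compute: F) -> &[String]
    where
        F: FnOnce() -> Vec<String>,
    {
        let fresh = matches!(&self.entry, Some((s, _)) if *s == stamp);
        if !fresh {
            self.entry = Some((stamp, compute()));
        }
        self.entry
            .as_ref()
            .map(|(_, list)| list.as_slice())
            .unwrap_or_default()
    }
}
```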

epage commented 2 years ago

Comment by kbknapp Thursday Jan 19, 2017 at 21:31 GMT

@mathstuf @joshtriplett yeah, I'd prefer not to incorporate shell specific arguments if at all possible and leave that to the Rust function of the implementer.

And what does the first &str parameter of those functions refer to?

It referred to the current arg being parsed (as determined by clap), the second was the prefix being completed (if any). Whether it's a Vec<String> or String (including \ns) (or OsString equivalents) returned will be determined once I'm able look at all four shells and see what they're expecting.

I'd also like to have the other arguments available

I thought about this as well, and I'm torn between providing a simple list of strings (at which point why include them at all because people could just use std::env::args), or some magic about "allowing failed/incomplete parses" to give off a ArgMatches struct which I'd assume is actually the information you'd want but is more complex to implement.

epage commented 2 years ago

Comment by joshtriplett Thursday Jan 19, 2017 at 22:57 GMT

@kbknapp Parsing partial arguments (without enforcing all of the argument requirements) seems like a pain, but I'd rather not manually implement argument parsing in order to do completion. That said, implementing this without support for parsing other arguments at first would still help in many cases.

Does the "current arg being parsed" exist to allow passing the same completion function for multiple arguments, and distinguishing them via argument? If so, why not just do that using a closure? You could pass |arg| complete("foo", arg) easily enough. Completing multiple arguments with the same function but needing to distinguish between them by name seems sufficiently uncommon to not want to complicate the common case.
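The closure suggestion can be sketched as follows. The registration point only ever accepts a one-argument callback, and a shared routine is adapted by capturing the argument name in a closure. `complete`, `run_completer` and the candidate lists are made up for illustration.

```rust
/// One shared completion routine that needs to know which argument it serves.
fn complete(arg_name: &str, prefix: &str) -> Vec<String> {
    let candidates: &[&str] = match arg_name {
        "branch" => &["main", "maint"],
        "remote" => &["origin", "upstream"],
        _ => &[],
    };
    candidates
        .iter()
        .filter(|c| c.starts_with(prefix))
        .map(|c| c.to_string())
        .collect()
}

/// Hypothetical registration point: it only sees a callback taking the prefix.
fn run_completer<F: Fn(&str) -> Vec<String>>(completer: F, prefix: &str) -> Vec<String> {
    completer(prefix)
}
```

Calling `run_completer(|p| complete("branch", p), "mai")` keeps the common single-argument case simple while still allowing one function to serve several arguments.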

epage commented 2 years ago

Comment by kbknapp Saturday Jan 21, 2017 at 22:18 GMT

Parsing partial arguments (without enforcing all of the argument requirements) seems like a pain

Actually I don't think it will be. clap enforces all the requirements lazily, so a single branch would allow a failing parse to "pass", and only in that very strict circumstance. Of course, it'd have to be well documented that when using the ArgMatches given to the completing function, those requirements haven't been enforced yet and can't be relied on. But it would at least allow one to check for the presence of args, values, etc.
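To illustrate what "checking for presence without enforcing requirements" could mean, here is a toy stand-in for a lenient parse. It is not clap's parser or `ArgMatches`; it only understands `--flag value` pairs and silently skips anything incomplete, so a missing key is expected rather than an error.

```rust
use std::collections::HashMap;

/// Collect the `--flag value` pairs seen so far. Nothing here checks required
/// arguments, and a flag without a value yet is simply left out.
fn lenient_matches(argv: &[&str]) -> HashMap<String, String> {
    let mut seen = HashMap::new();
    let mut it = argv.iter().peekable();
    while let Some(word) = it.next() {
        if let Some(name) = word.strip_prefix("--") {
            // Only consume the next word as a value if it is not another flag.
            if let Some(value) = it.next_if(|next| !next.starts_with("--")) {
                seen.insert(name.to_string(), value.to_string());
            }
        }
    }
    seen
}
```

With `prog --repository /path/to/.git --branch`, a branch completer could look up `repository` in the returned map even though `--branch` has no value yet.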

If so, why not just do that using a closure?

Actually I like that idea, I'll have to try this out when I get some time to sit down and try implementing this!

epage commented 2 years ago

Comment by droundy Friday Aug 11, 2017 at 12:50 GMT

I would suggest rather than an enum that multiple methods setting completion possibilities would be a better and more extensible API. So rather than

Args::complete_with(enum)

you would have

Args::complete_with_files()

and a whole set of other complete_with_XXX methods. This removes the requirement that you foresee every possibility on the first version of the API (since you can't add variants to a public enum without breaking backwards compatibility). I think the most important one is the one that uses an auto-generated flag to call rust code, since this can then implement all the others in a shell-independent way. Then optimizations could be made to do the others in a shell-dependent way if that seems to help. So I would focus on something like:

Args::complete_with_function(|matches_so_far| -> [String] { ... })

where it is important to provide the completed flags, since that can affect what is a valid completion for a given flag.
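A rough sketch of the method-per-strategy shape, assuming a builder type of its own (`ArgSketch` is hypothetical and none of these methods exist in clap). The enum is kept private, so new strategies can be added later as new methods without breaking callers.

```rust
/// Private representation; callers never name these variants directly.
enum Completer {
    Files,
    Function(Box<dyn Fn(&[String]) -> Vec<String>>),
}

struct ArgSketch {
    name: String,
    completer: Option<Completer>,
}

impl ArgSketch {
    fn with_name(name: &str) -> Self {
        ArgSketch { name: name.to_string(), completer: None }
    }

    /// Let the shell complete file names.
    fn complete_with_files(mut self) -> Self {
        self.completer = Some(Completer::Files);
        self
    }

    /// Complete by calling Rust code with the arguments matched so far.
    fn complete_with_function<F>(mut self, f: F) -> Self
    where
        F: Fn(&[String]) -> Vec<String> + 'static,
    {
        self.completer = Some(Completer::Function(Box::new(f)));
        self
    }

    /// Candidates computed in Rust, or `None` when the shell should handle it.
    fn rust_completions(&self, matches_so_far: &[String]) -> Option<Vec<String>> {
        match &self.completer {
            Some(Completer::Function(f)) => Some(f(matches_so_far)),
            _ => None,
        }
    }
}
```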

epage commented 2 years ago

Comment by droundy Friday Aug 11, 2017 at 12:59 GMT

Here is an example of a bash completion that simply calls command-line flags that return the possible completions for darcs. darcs does no shell-specific munging in its Haskell code (since it is usable by multiple shells), just outputs a '\n'-delimited list of completions. Then the bash code handles spaces and colons specially for bash.

epage commented 2 years ago

Comment by kbknapp Monday Nov 06, 2017 at 03:25 GMT

I'm copying some comments from @joshtriplett on gitter so the discussion is all in one place.

I see general consensus that we should have a way to call an arbitrary Rust function through a flag like --_clap-complete. We haven't talked about what that should take, but I would suggest that it needs 1) the partial argument being completed, 2) the rest of the arguments minus --_clap-complete (up to it to pass them to clap if desired). Beyond that, since some shells have built-in support for completing specific things and would likely do so faster than launching a program, we should also support files, files matching a glob, list of fixed strings, and maybe usernames/hostnames. Any of those that a shell we're generating completions for doesn't support would be easy enough to support in Rust.

As far as I can tell, the only items we don't have consensus on are 1) exactly what common-denominator of built-in completions do we support (e.g. usernames/hostnames?), and 2) how exactly do we support shell code. For the latter, personally I favor the "named shell function" approach, which has the advantage of being shell-independent, but I recognize that there are people who seem to want to embed full shell code. (I don't know how that could be made shell-independent, personally.)

Here are my responses:

I see general consensus that we should have a way to call an arbitrary Rust function through a flag like --_clap-complete. We haven't talked about what that should take, but I would suggest that it needs 1) the partial argument being completed, 2) the rest of the arguments minus --_clap-complete (up to it to pass them to clap if desired).

Agreed. However, I'm willing to "settle" for simply passing in the argument to complete sans the argv, since the Rust code could essentially see the same thing by querying std::env::args[_os] just like clap does. Another option is to allow "incomplete parsing" whenever --_clap-complete is used, which would allow sending an ArgMatches struct to the Rust completion code. It would result in a double-parse to be sure, but I can't imagine that'd be a performance issue for anyone except perhaps the most critical "daemon mode" CLIs, which I think is at odds with using completions in the first place 😜

Beyond that, since some shells have built-in support for completing specific things and would likely do so faster than launching a program, we should also support files, files matching a glob, list of fixed strings, and maybe usernames/hostnames. Any of those that a shell we're generating completions for doesn't support would be easy enough to support in Rust.

Also agreed. I'm thinking I'd like to expose this as a sort of enum where the implementation allows checking the shell which we're outputting for and either using the shell builtins or augmenting the missing parts with our own code.

1) exactly what common-denominator of built-in completions do we support (e.g. usernames/hostnames?)

Correct. I don't think this needs to be fully hashed out though, as we can always add. For now, files is a good starting point. List of predetermined values is possible-ish today (in some shells, but we'd need to augment in the lacking shells) via possible_values, however that would probably be another easy win for us to support a list of predefined values determined at completion time. For users/hostnames I'm fine adding it, however I don't have strong feelings about if it should be included as a first implementation or added later.

2) how exactly do we support shell code. For the latter, personally I favor the "named shell function" approach, which has the advantage of being shell-independent, but I recognize that there are people who seem to want to embed full shell code. (I don't know how that could be made shell-independent, personally.)

The ways I see are allowing the user to send shell specific code up front a la BashShellCode("blah blah blah"), FishShellCode("bam bam bam") etc, or perhaps with a Rust function which we send a clap::Shell variant to and they are responsible for sending back valid code such as ShellCode(Fn(clap::Shell)->String).
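The second option could look roughly like the sketch below. `Shell` here is a local stand-in for `clap::Shell` with only three variants, and `ShellCode` is a hypothetical wrapper; the per-shell snippets in `complete_users` use each shell's own user-name helper and are only an example of what a caller might return.

```rust
/// Stand-in for `clap::Shell`; only the variants needed for the sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Shell {
    Bash,
    Fish,
    Zsh,
}

/// Hypothetical `ShellCode` wrapper: the caller supplies one function and is
/// responsible for returning valid code for whichever shell is requested.
struct ShellCode<F: Fn(Shell) -> String>(F);

impl<F: Fn(Shell) -> String> ShellCode<F> {
    /// Ask the callback for the snippet to splice into the generated script.
    fn render(&self, shell: Shell) -> String {
        (self.0)(shell)
    }
}

/// Example callback: user-name completion via each shell's own helper.
fn complete_users(shell: Shell) -> String {
    match shell {
        Shell::Bash => "compgen -u".to_string(),
        Shell::Fish => "__fish_complete_users".to_string(),
        Shell::Zsh => "_users".to_string(),
    }
}
```

Compared with separate `BashShellCode`/`FishShellCode` variants, a single callback keeps the enum small, at the cost of only discovering a missing shell when the generator runs.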

More generally, my goal is to pull the completion script generation out of clap proper and into a clap_completions crate at 3.x. I had made quite a few changes on the 3.x branch which would make implementing all this far easier and more correct, but it looks like I'm going to have to partially scrap the current 3.x branch due to some recent changes, and ideas I'd rather incorporate moving forward.

What that means is I don't want to get too into the weeds implementing ideas we reach here on the current 2.x branch only to have massive changes in 3.x. I do want to reach a consensus though, or agree upon a foundation API which could be implemented on the 3.x branch.

Having said that, if someone puts time into actual implementation on the 2.x branch I'd be more than happy to include it. I just don't personally have the time to pour into separate 2.x and 3.x implementations.

epage commented 2 years ago

Comment by kpcyrd Wednesday Jan 03, 2018 at 04:02 GMT

I think it's important that the function that is called by .get_matches() needs to act like a 2nd main. There are some usecases where initialization is needed even in this case, for example to make sure the sandbox is active even in those cases. I don't think this is going to be an issue, just my 2 cents. :)

I'm about to add completion to two of my programs, mostly because the values that the user is usually trying to complete are hard to type and would require looking them up manually.

I think a minimally invasive, 2.0-compatible solution that would already work for most of us would be something along the lines of:

.arg(Arg::with_name("foo")
    .complete_with_cmd(&["myprog", "internal", "something"])
)

This would instruct the shell to execute myprog internal something abc when the user types abc<TAB> for that argument. The output would be a \n-delimited list as @droundy suggested.
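The contract described here is small enough to sketch from both ends: the command line the shell would run, and how the newline-delimited reply would be read. Both helper names are hypothetical, and actually spawning the process is left out.

```rust
/// Build the argv the shell would run for `complete_with_cmd`: the configured
/// command followed by the word being completed.
fn completion_argv(cmd: &[&str], partial: &str) -> Vec<String> {
    let mut argv: Vec<String> = cmd.iter().map(|s| s.to_string()).collect();
    argv.push(partial.to_string());
    argv
}

/// Parse the program's reply: one candidate per line, blank lines ignored.
fn parse_candidates(stdout: &str) -> Vec<&str> {
    stdout.lines().filter(|line| !line.is_empty()).collect()
}
```

For the example above, `completion_argv(&["myprog", "internal", "something"], "abc")` yields the four words of `myprog internal something abc`.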

For now, this would require writing additional subcommands, but that code can be re-used for a more advanced solution that requires breaking changes to clap.

If this is acceptable I would try to prepare a patch. Completion is crucial for one of those programs, and I would otherwise have to either maintain the tab completion code on my own or fall back to post-processing as well.

epage commented 2 years ago

Comment by softprops Tuesday Jun 05, 2018 at 03:20 GMT

Here's some inspiration from how golang kingpin package handles dynamic tab completion using the bash COMPREPLY protocol.

Kingpin supports dynamic completions via its HintAction interface.

I imagine this should be possible providing a fn interface to clap's Arg type for dynamic completions

Under the covers kingpin forks program control when in completion mode to invoke that completion func then exits the process.

It switches modes of operation based on a flag that the completion script passes.

epage commented 2 years ago

Comment by adamtulinius Thursday Sep 06, 2018 at 09:19 GMT

Here's some inspiration from how golang kingpin package handles [..]

Please note that kingpin currently can't complete file paths, which needs something like https://github.com/alecthomas/kingpin/compare/master...DBCDK:hint-files to fix. I'm just mentioning this because it required work on the compreply stuff, and might be useful here as well.

epage / clapng

completions: Support adding additional code to complete values #65