c-blake / cligen

Nim library to infer/generate command-line-interfaces / option / argument parsing; Docs at
https://c-blake.github.io/cligen/
ISC License

turning off prefix completion? #216

Closed ckp95 closed 2 years ago

ckp95 commented 2 years ago

Hi, I'm trying to evaluate whether this is suitable for some of my projects.

It looks really promising, except for the feature that long options can be specified just with any unambiguous prefix. Is there a way to turn this off? I don't like the sound of it. It introduces the danger that removing an option can cause a script to silently use a different, unintended option, instead of immediately exiting with an error.

I would rather my programs just fail unless you give a properly specified command.

c-blake commented 2 years ago

There is no such tunability by CLauthors at present. I do not think it would be very hard to add, though. (EDIT: this was incorrect. Feature exists. See final comment.) One would probably want separate switches for the long options, enums, and subcommands. This also came up over at https://github.com/c-blake/cligen/issues/158

(The enums part is already overridable without my intervention by bypassing the standard cligen/argcvt).

It does feel more like a user-level decision (say in the main cligen config file) than a command author decision to me, though. Is it the user who wants to force himself to see failures/type whole tokens or the CLauthor who wants to force the user? Only the actual end user can calibrate/scale the effort, knows if they use it in scripts, or knows how often they might upgrade a tool (which is the only place risk arises). So, the value of the abbreviation just feels very personal/environmental. Heck, maybe they have one sticky keyboard and others not so sticky and even the same user could be of two minds in two environments...

Thoughts?

ckp95 commented 2 years ago

When you say "user", do you mean the user of this library (i.e. the author of some CLI program), or the end-users of the CLI programs themselves?

c-blake commented 2 years ago

Were you to do a PR on this, you may personally be all roles: cligen-helper-author, cli utility author, and user - just at different times...Oh, and there you go asking! :-)

c-blake commented 2 years ago

I usually say CLauthor for the one who writes the utility and CLuser for the one who invokes it out of a script/command-line prompt. Then cligen is this kind of meta-level toolkit thing.

c-blake commented 2 years ago

I would guess some new CLuser control in $HOME/.cligen/config would be enough for you, but I did not want to presume. Some people like to force their strictness on others. Anyway, the easiest thing is probably just adding a few bool toggles to the cligen.ClCfg object and then checking those toggles in the right places in long-option/subcommand/enum parsing. But I'm not sure there is a pre-existing pathway for those parsing places to access the clCfg (lowercase) object the CLauthor can override. So, some internal call pathways may need an extra parameter here & there.

ckp95 commented 2 years ago

Well here are my thoughts:

As a user of command-line applications, my default expectation is that they ought to work basically the same, at a syntactic level, as all other command-line applications. That is, if I give an option that doesn't exist, the program should just fail immediately rather than match it to something containing it as a prefix. I know that some CLI programs do this, but they are not the norm.

Also as a CLI user, I don't really care what library the CLI author used to write the parsing logic. It's just an implementation detail. I don't want to have to read the documentation for such a library in order to understand the special ways the program interprets CLI arguments. It wouldn't even occur to me to do so. All that stuff is the responsibility of the CLI author.

Likewise, as a CLI author, I do not expect my users to meticulously check my documentation to see that this special parsing behavior exists. It's hard enough to make users read documentation as it is; I don't want the headache of them complaining to me when they try to use my program and it interprets CLI arguments in a way they weren't expecting.

I really do not expect to be able to, or to have to, configure how my shell parses certain command line options in a special library-specific configuration file, and I don't expect my users to do this either.

I don't think it should even be this way by default. I think it's actually dangerous and has the potential for security vulnerabilities. At the very least, it makes it so I can't safely remove features as a CLI author, nor safely install updates as a CLI user.

Example of the danger:

Let's suppose I have a program my-fancy-program. It has two options, --foo and --foo-dangerous. People are using the --foo option in scripts. Later I deprecate and remove the --foo functionality and tell users to use --bar instead. But of course, in real life, not everyone pays attention to release notes. So when my users' scripts encounter my-fancy-program --foo, they don't just exit immediately with error code 1. Instead, they silently invoke the --foo-dangerous functionality. Cue explosions.
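The scenario above can be reproduced with a minimal sketch of the "exact match, else unique prefix" rule. This is not cligen's code. The `resolve` helper and the two option lists are hypothetical, written only to illustrate how the same script token changes meaning once --foo is removed.

```python
def resolve(token, options):
    """Resolve a long option: exact match first, else a unique prefix match."""
    if token in options:
        return token
    candidates = [opt for opt in options if opt.startswith(token)]
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise ValueError(f"unknown option: {token}")
    raise ValueError(f"ambiguous option {token}: {sorted(candidates)}")


# Hypothetical option sets for two releases of my-fancy-program.
v1 = ["--foo", "--foo-dangerous"]   # release that still has --foo
v2 = ["--bar", "--foo-dangerous"]   # --foo removed, --bar added

before = resolve("--foo", v1)       # exact match on --foo
after = resolve("--foo", v2)        # now the unique prefix of --foo-dangerous
print(before, after)
```

A script that always spelled out --foo in full is unaffected under the first release, and only changes behavior after the removal, which is the case being argued about.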

Now yes, this does seem a little contrived. But the R language has an analogous mis-feature, and it does empirically cause bugs and frustration (see this document, ctrl-f for "partial match").

And yes, as a CLI author it is easy to spot the problem in this particular small-scale case. But if you have a program with a large number of options and subcommands, it may not be obvious that removing something could cause subsequent invocations to be partially-matched to something else. This situation could get overlooked. At any rate it would make me very paranoid about what kind of names I give to the options, to preclude this from happening in the future.

And what is the advantage? We already have tab-completion. All this does is save a single press of the tab key. It's a minuscule benefit, with the potential for disaster. It is an unsafe default.


EDIT: also I didn't see this before, but I don't agree with the characterization of "forcing strictness on others". I see it as respecting the least-surprise principle from the perspective of the user (and also the CLI author who is using the library).

I think a more useful way of accomplishing the same ends would be to exit with an error if the option doesn't exist, but print a helpful "did you mean: ..." message. So if I type my-program --fo, it would say "Did you mean: my-program --foo". That's respectful to the user but refrains from doing anything unless explicitly and exactly told to.
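A minimal sketch of that "fail, but suggest" behavior, using Python's standard difflib for the similarity matching. The `check_option` helper and the `OPTIONS` list (taken from the option names in the docs example) are assumptions for illustration. This is not how cligen produces its suggestions.

```python
import difflib

# Option names borrowed from the docs example; hypothetical for this sketch.
OPTIONS = ["--foo", "--bar", "--baz", "--verb"]


def check_option(token, options=OPTIONS):
    """Accept only exact option names; otherwise fail with suggestions."""
    if token in options:
        return token
    # get_close_matches ranks candidates by similarity ratio.
    close = difflib.get_close_matches(token, options, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(close)}?" if close else ""
    raise ValueError(f"unknown option {token}.{hint}")


try:
    check_option("--fo")
except ValueError as err:
    message = str(err)
    print(message)
```

Nothing runs on a near miss; the caller only gets an error plus a hint.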

ckp95 commented 2 years ago

Another perspective:

When using CLI tools, one is either using them interactively, or in a script. In an interactive setting, tab completion already exists. In a script setting, it's good practice to be as verbose as possible: use long parameter names instead of short, to aid readability, greppability, maintenance, onboarding, etc. So in the first case this feature is unneeded, and in the second case it's a hindrance.

SolitudeSF commented 2 years ago

i like prefix completion, but there should be compile switches for all these features, just for the quantifiable stuff, like runtime impact and binary size.

c-blake commented 2 years ago

The default prefix matching is not going to change because other users besides you have had this feature for years. While we can maybe add a way for you to block the behavior in all your own CLauthored cligen programs, it will probably never be the out-of-the-box default. I say this only because your very strident tone suggests anything short of perfection might be a deal breaker for you. However, I am not going to maybe inconvenience dozens of other users because you don't like a default. (And that may apply to perhaps many things). But I tend to be accommodating with non-disruptive new features -- when asked nicely.

I (at least) do not have tab completion for subcommands, or long options within subcommands. Further, it is fundamentally easier to provide this at a matching level than in completion systems, which have to be either specifically programmed per command or made to parse help outputs (which people also want to vary), and the completion systems themselves vary with shells (Bash, Zsh, Fish, etc.). The value may not be enough compared to how you personally evaluate the risks, but it is also not minuscule. I assume you are not volunteering to update everyone's completion systems for every shell anytime anyone adds an option, right?

Risk-wise there is an "ability to anticipate dynamics" aspect. Personally, I don't change sets of options very often at all. Heck, personally I don't write many commands with "dangerous" options at all. So, risk tracks CLauthor context - a lot - as well as CLusage. I don't think of R as even a platform people write CLI tools in. Python is more so a CLauthor platform, and I believe it also has this feature.
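For reference, Python's argparse does accept unambiguous long-option prefixes by default, and a CLauthor can turn that off per parser with `allow_abbrev=False` (available since Python 3.5). The sketch below reuses the hypothetical my-fancy-program options from earlier in the thread; the `build` helper is made up for the illustration.

```python
import argparse


def build(allow_abbrev):
    """Build the same hypothetical parser with abbreviations on or off."""
    parser = argparse.ArgumentParser(prog="my-fancy-program",
                                     allow_abbrev=allow_abbrev)
    parser.add_argument("--bar", action="store_true")
    parser.add_argument("--foo-dangerous", action="store_true")
    return parser


# Default behavior: --foo is taken as an abbreviation of --foo-dangerous.
lenient = build(True).parse_args(["--foo"])

# Strict behavior: the same token is rejected and argparse exits.
try:
    build(False).parse_args(["--foo"])
    strict_rejected = False
except SystemExit:
    strict_rejected = True

print(lenient.foo_dangerous, strict_rejected)
```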

The "did you mean" feature also exists already, and yeah, my view for how to add this should retain that. Unlike many toolkits, cligen commands come with (by default) a --help-syntax. Don't know what to say about user impatience with docs or general impatience about everything.

I agree interactivity is different from scripting and have probably described this feature in those terms before. I know multiple cligen users that like the colorized help and edit their configs. In my view, giving CLusers power respects them rather than disrespecting them.

FWIW, I doubt object size will be impacted much, but it's possible (a when switch as @SolitudeSF suggests would have more of a chance of that).

Anyway, I had just started to work on adding this feature for you until stopping to write this message, but now I don't have any more time today.

ckp95 commented 2 years ago

I understand the desire not to break people's workflows. Believe me, I wouldn't suggest such a thing lightly, especially since I'm just a drive-by commenter who hasn't even used this yet.

But yes, I would appreciate if there was some kind of way to turn it off. Failing that, a warning / reminder to end-users in the help text, that this behavior exists. So that there is at least some kind of heads-up. e.g. using the example from the docs, something like this:

Usage:
  fun [optional-params] [args: string...]
An API call doc comment
Options (note: partially-typed option names will be expanded if the meaning is unambiguous):
  -h, --help                    print this cligen-erated help
  --help-syntax                 advanced: prepend,plurals,..
  -f=, --foo=    int     1      set foo
  -b=, --bar=    float   2.0    set bar
  --baz=         string  "x"    set baz
  -v, --verb     bool    false  set verb

I still think you're underestimating the risks, but hey it's your library so it's your call.

> Heck, personally I don't write many commands with "dangerous" options at all.

It doesn't necessarily have to be explicitly "dangerous", that was just for expository purposes. It suffices that it does something unwanted and unanticipated wrt the unintentional prefix form.

> I don't think of R as even a platform people write CLI tools in.

That example isn't about CLI tools, it's about key lookup in their list datatype. But it's the same idea.

> Python is more so a CLauthor platform, and I believe it also has this feature.

Well damn, now I'm not going to be able to sleep tonight.

c-blake commented 2 years ago

Ah. I hadn't clicked through. Thanks for the clarification.

The feature is described in the default --help-syntax output I mentioned earlier. I could maybe ALL CAPS part of that if you want.

I honestly don't think you/I are in a position to estimate the obviously context-dependent risk. Only CLauthors/CLusers are. (EDIT: IIRC) The first thing SolitudeSF used this for was creating PNGs which, you know, generally have a human in the loop interpretation-wise.

cligen gives a lot more Power To the CLusers than almost any similar package I know, and so, yes that might surprise, but to me it respects them more. Put in some learning effort - get out power. They can actually change what --help-syntax prints out if they want to "remind themselves" differently than the default, such as, say making the feature you worry about all caps inverse blinking text.

ckp95 commented 2 years ago

Okay, I think we're coming at this from different mindsets and this wouldn't be suitable for me. I tend to write automation scripts that don't have a human in the loop and need strict, predictable interfaces. So I don't think this will be the library for me. But thank you for the thoughtful replies in any case.

c-blake commented 2 years ago

I guess I would also point out that, even by default as-is (and I am not saying "no" to this one feature and will likely write it soon), a paranoid CLuser can always use the full option name. Some automation via git/VC history could then block a new option from ever being introduced by accident with a previously removed option as its prefix, possibly aided by a strict rule for how you format procs wrapped by cligen, or by something that dumps generated help upon every commit, etc. While the problem is theoretically possible, I doubt it is super frequent and you can get ahead of it.
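One way such a history check might look, as a rough sketch: assume some earlier step has already collected every long option name that ever appeared in the tool's generated help (for example, from help text dumped on each commit) and the names in the current help. The `prefix_hazards` function and both sets are hypothetical, and how the names get extracted from version control is left out.

```python
def prefix_hazards(history, current):
    """List (removed, current) pairs where a removed option name is a
    prefix of an option that exists now, so an old script spelling the
    removed name in full would be prefix-matched to the current one."""
    removed = set(history) - set(current)
    return sorted((old, new)
                  for old in removed
                  for new in current
                  if new.startswith(old))


# Hypothetical data: every option ever shipped vs. options shipped today.
history = {"--foo", "--foo-dangerous"}
current = {"--bar", "--foo-dangerous"}

hazards = prefix_hazards(history, current)
for old, new in hazards:
    print(f"removed option {old} is a prefix of current option {new}")
```

Run as a commit or CI check, a non-empty result would be the signal to rename the surviving option or keep the old name around as an explicit error.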

But as I'm sure you're aware, semantics can also always change on updates to a CLI tool and there can be trouble if a user is used to the old semantics or does not know which version they are running. And not everyone uses "--" before every glob. And few quote every variable the right way or always use find -print0|xargs -0 or whatever for truly arbitrary filename processing. Etc., etc. The entire subculture is unsafe. Once you start being very careful, the allure of shell syntax fades a lot, IMO.

But! This is another bonus to the cligen approach - you can prototype something as a command in a shell, but when you want to do safer automation programs, as you say you do a lot, you can instead import the same functionality and use a more careful-by-default Nim program/script (.nim or .nims), sharing the code thoroughly between the testing prototype CLI and production automation. This may be more practical than you might guess. Nim is pretty succinct.

In Nim, a default flipped from what you want doesn't mean you have to repeat yourself. You can come up with all your own defaults and automate that at the compile-time level with wrapper macros or include files just once for all your tools. I have noticed really paranoid people sometimes prefer hyper-explicit repetition rather than centralization of settings. As I said initially - if many behaviors are going to be a battle, it may not be for you.

c-blake commented 2 years ago

Oops - I forgot that I already added this feature 2 years ago. It's called longPfxOk for long options and stopPfxOk for subcommands. The easiest way to activate it is to just say clCfg.longPfxOk = false after import cligen and before dispatch. (In my defense, it was mentioned at the bottom of the issue I linked to immediately. I didn't click through my own link. You kind of caught me at a distracted time.)