Request: Show flags with dashs, not underscore, by default

pb-cdunn commented 5 years ago

fc_consensus -h
Usage:
  main [optional-params]
  Options(opt-arg sep :|=|spc):
  -h, --help                                    print this cligen-erated help
  --help-syntax                                 advanced: prepend,plurals,..
  -m=, --min_cov=               int     6       minimum coverage to break the consensus
  --min_cov_aln=                int     10      minimum co...

See? The cligen-provided --help-syntax does it right. But my own arguments all have _ by default. Python argparse accepts both but displays - by default.

Could you make - the default for display? It's more POSIXy and easier to type.

c-blake commented 5 years ago

Yeah. We can have some kind of transform to user-presentation thing we do. snakeCase -> kebabCase (_ -> -) is the easiest case. Should probably do the same thing for enum values, too.

I actually thought of this already, but held off because I was trying to figure out more general (camel,snake,kebab) -> (camel,snake,kebab) transforms. There are (at least) camel->kebab edge cases like where the capitals refer to an acronym/abbreviation you want to leave capitalized, e.g., you don't want --useDNA transformed to --use-D-N-A, however you would want it for --useABalance -> --use-a-balance. It seemed some things might get too deep into natural language perfection which seemed..out of scope for the time I can put into this.

Anyway, no reason for perfect to be the enemy of the good. In the Nim identifier space a mix of camel and snake parameters to the same proc seems not super likely. So, confusing the end user seems unlikely. I only very recently put a line in --help-syntax for users to even know about identifier spelling choices. Happy to accept a PR if you're impatient, but I should get to this in the next couple of days.

pb-cdunn commented 5 years ago

To me, _ -> - is most important. I don't know what to do about camelCase. Personally, I hate camelCase anyway. In LISP, you can actually use hyphens in the language, so you don't have to use _ either.

But the Nim std library uses camelCase, right? So I guess you've got to do something with it. I think you've got the right idea:

camelCase should accept --camelCase and --camel-case
- --camel_case is fine too, or not. Whatever is easier to code.
useDNA should accept --useDNA and --use-dna
- --use_dna too? Maybe not.

And in my opinion, --camel-case and --use-dna should appear in the help docs, not the others. Encourage POSIX. Maybe the "advanced" docs can mention the other possibilities.

Your Nim-macro-fu is much stronger than mine. I'll have to wait for your implementation.

c-blake commented 5 years ago

I agree. The starting point is (ever and always) the literal, non-Nim normalized spelling of the proc signature I get from getImpl. Right now every spelling is accepted..the only question is how to render in help or error messages.

I will probably just do a userCase proc (docCase? helpCase?) that CLI authors can override if they want (like mergeParams between import cligen and dispatch) and have the default impl only do snake -> kebab.

Then in some extreme case, they can maybe special case things on a per string basis, like case inp of "useABalance": "use-a-balance". It might make sense to provide a scope of how the string is used (e.g., longOpt, enumVal, subCmd). That context might disambiguate things for maybe the handful of people who ever wrote their own could use to behave more specifically. So, maybe proc helpCase(context: NewEnum, ident: string): string. If we come up with a good rule for camel -> kebab, it's just a simple patch to that default proc and we can get Nim ident caMel -> help text ca-mel behavior.

I don't really care much about the name of the proc if you have some suggestion. It sounds like you'll mostly like the default anyway. So, you may not ever even want to define your own. Anyway, such a definition will just be a string to string transform with no macro fu required. (Working in the calls to helpCase is probably best left to me, though.)

c-blake commented 5 years ago

(And the other possibilities are already the 3rd --help-syntax bullet point. So, you can enter your kebab case right now with no changes if that makes your fingers happy. The help text is just pinned to the casing in the Nim code.)

pb-cdunn commented 5 years ago

I thought I'd try something fancy. I cannot name my proc foo-bar. I mean I can, but cligen will not accept that.

  cligen.dispatchMulti(
    [`filter-me`,

        ... pbdataset.nim(19, 23) template/generic instantiation of `dispatchMulti` from here
        ... pbdataset.nim(20, 6) template/generic instantiation of `dispatchMultiGen` from here
        ... ../../nim-falcon/nimbleDir/pkgs/cligen-0.9.19/cligen.nim(818, 59) template/generic instantiation of `cligenQuitAux` from here
        ... ../../nim-falcon/nimbleDir/pkgs/cligen-0.9.19/cligen.nim(657, 3) Error: undeclared identifier: 'dispatchfilter'
        ...     quote do: cligenQuit(`disNm`(mergeParams(`mergeNms`, `cmdLine`)),
        ...     ^

And if I use a dash in a parameter name, it gets truncated:

proc filter(positional: seq[string], `foo-bar`: int): int =
...

% prog filter --help
  -f=, --foo=    int  REQUIRED  set foo

Note that if you display --foo-bar in the docs, people will know that works. They might still try --foo_bar, and it will still work. But if you don't display --foo-bar, then nobody will guess it. I want to encourage people to stop pressing the SHIFT key all the time.

I really want kebab-style in the docs. Sure, cligen.style="kebab" would be fine, if it's not the default.

Here is the GNU convention:

https://www.gnu.org/software/libc/manual/html_node/Argument-Syntax.html

Option names are typically one to three words long, with hyphens to separate words.

c-blake commented 5 years ago

As I said, I absolutely agree the default for helpCase should be kebab output. No need to convince me on that! I just have to sit down and do it. Users can put the dashes in wherever they want, of course, as they can already. And helpCase will allow help/error messages show whatever format the CLI author wants, as discussed, but (kebab-casey by default).

I honestly never really considered Nim procs with backquote-based arbitrary names with dashes when I created the dash-collapse normalizer. That's a good catch!

Honestly, that will probably wind up being a documented limitation upon what proc can be wrapped (with hopefully a better error message). I mean, `--help` itself is also a valid Nim proc ident, but how would that work in multicommand that wasn't very confusing? Probably best to just disallow dashes in proc names from being supported as auto-wrap targets. This also impacts proc parameter names and probably enum value names, too, all of which would create a lot of confusion between command-option-dash syntax and identifier syntax. It might be resolvable in some theoretical sense, but it would probably be a lot of potential to confuse CLI users.

I realize inside vs leading dashes are pretty different in this confusion regard. I'm not against the arbitrary support if someone wants to add it..It just kind of feels like a ton of effort with little payoff since in the end I want users to be able to put in new dashes wherever they prefer anyway (at least as long as Nim stays Nim-style-insensitive). A CLI author can always wrap a dashed-name with an undashed one in their Nim code and call dispatch on the undashed version, after all. I think we agree that CLI users doing less SHIFT pressing many times matters more than the Nim CLI author doing a paper thin wrapper once.

It does work to put a dash in the cmdName for the help message and name/rename your output binary appropriately. Whatever command shell runs the program will not allow dashes to be anywhere in the command name the way most of the cligen parsing will, but that's just PATH search outside my control. So, once I do helpCase you should be able to generate commands where everything is kebab.

pb-cdunn commented 5 years ago

Really, if you just show "kebab"-style in the help-message, that's enough for me. Down the road, more powerful features (like arbitrary characters in proc-names) could be cool, but cligen already works great for me.

I'll experiment with multi-level dispatch someday. Not a big deal if it doesn't work though.

c-blake commented 5 years ago

Ok. I just pushed something that may satisfy your immediate need. Converts snake case params to kebab for options help table and for misspelling suggestions.

Rendering of strings for the help table is done in argcvt.nim while the error message generation is in cligen.nim. So, I made helpCase.nim an include in argcvt imported by the generated parser. I think that allows someone to drop a helpCase.nim in their project directory/ahead of cligen in their path to override the definition (e.g. if they want a more ambitious camel -> kebab conversion), but I haven't tested that yet. You seemed eager. So, I thought I'd do something minimal ASAP.

It seems that optionNormalize is not called already for subcommands in a dispatchMulti setting. So, that's a separate bug I didn't know about. I think you could do a bunch of cmdName= settings for each subcommand to get a less flexible effect like you desire, but hopefully I will fix this bug soon, too. I'm leaving this issue open to remind me to finish these bits.

c-blake commented 5 years ago

So, yeah, the multi-command/subcommand stuff is pretty far from being able to be CLI-style-insensitive to support kebab-case, as I had originally planned. parsopt3 even needs fixing to apply optionNormalize to stopWords. I still plan to fix this, but it will take longer than my initial estimate to get all the fiddly bits right on that. (Should probably be fixed before you get around to multi-level dispatch, though.)

pb-cdunn commented 5 years ago

% ./pb kmers -h
kmers [optional-params]
  Options(opt-arg sep :|=|spc):
  -h, --help                             print this cligen-erated help
  --help-syntax                          advanced: prepend,plurals,..
  -d=, --int-dummy=     int     42       set int_dummy
  -s=, --string-dummy=  string  "hello"  set string_dummy

Perfect! You can make a separate ticket for more complicated improvements. Closing

c-blake commented 5 years ago

Cool.

c-blake commented 5 years ago

Ok. kebab-case (or really any CLI-style-insensitive matching) now works with subcommand names.

You have to manually change cmdName to have the dash if you want that in the help output, but that does seem to work. See test/FullyAutoMulti.nim.

pb-cdunn commented 5 years ago

You have to manually change cmdName to have the dash if you want that in the help output, but that does seem to work. See test/FullyAutoMulti.nim.

Could you give me an example? It's very important for me that the defaults Do The Right Thing, which means dashes.

c-blake commented 5 years ago

Um..I did give you an example. test/FullyAutoMulti.nim.

pb-cdunn commented 5 years ago

Cool. I just experimented with various command-names. The default looks good, and it's amazingly fault-tolerant to typos (not a requirement, but very cool).

c-blake commented 5 years ago

I was also impressed with the fault tolerance. It really creates almost a continuum from "short to long" {option keys, enum values, subcommand names} very much in the spirit of short vs. long options but more general. I think this unambiguous prefix matching + spelling suggestions + Zsh compdef _gnu_generic creates a really first class command-line user experience with great ergonomics to both CLI authors & CLI users. After just a very short while using it, I find myself wanting to re-write various C commands in Nim just to get that! Anyway, I'm glad we both like it and your discussion with me prompted me to sit down and do that long-thought-of feature.

I expect someone to complain that it is too fault tolerant and ask for a switch to turn it off, the way short={"":""} turns off short options. You were the one who asked for that short-suppressing feature, but maybe that was more about the help message? Given that right now you are allowed but not required to spell it all out (with kebabs/camel/snake mixtures even) and that any ambiguity errors out, this strictness is a very low priority to me. It's more like a debugging mode. As far as I know, gdb and gnuplot both have no way to deactivate sloppy mode after more than 30 years of public hacking on them. There may be internet archives to assess likely popularity of such a request.

If/when an off-switch does happen, an environment variable may be the best way to decide "dynamically interactive sloppy token" mode vs "spelled out script" mode. Then in a script context, where you want to force yourself to spell everything out, you can export CLIGEN_STRICT or something at the top, but share the same executable as sloppy mode. Many script authors might want it off initially anyway, so they can cut&paste terminal commands into a script before trying to make the script more explicit. So keying sloppy mode off isatty() (the way many colorizing modes do) seems wrong. Arguably, the short={"":*} you asked for is similar and could/should perhaps share the run-time switch of any eventual feature along those lines instead of its current "static" specification. Not sure how you feel about that. Users who get confused by both short&long keys in the help message can just export CLIGEN_STRICT=short,long,enum,subcmd (or whatever) in their shell RC set ups and share binary executables with other users who love short options for e.g., their combining into one argv[] slot powers. Changing allowed behavior a lot based on environment and/or config can also confuse, though. There may be no great answer here.

c-blake / cligen

Request: Show flags with dashs, not underscore, by default #97