pacak / bpaf

Command line parser with applicative interface
Apache License 2.0
352 stars 17 forks source link

Naming is confusing for a new user (at least this one) #50

Open epage opened 2 years ago

epage commented 2 years ago

Browsing the docs, the API feels very abstract and I have a hard figuring out what the different API items are and what they are for

This is a follow up to my reddit comment.

I tend to look first at the types of a crate. In doing so, I see a "Named" and "Positional". I think just adding an Arg suffix would be a big help here (NamedArg, PositionalArg).

Then if I look at an example, I might see something like

    let speed = long("speed")
        .help("Speed in KPG")
        .argument("SPEED")
        .from_str::<f64>();

long, without an antecedent feels incomplete. I normally think of "argument" as an element in std::env::args_os, a positional argument, or --speed SPEED, not the value being passed in. Personally, I would recommend the example look like

    let speed = long_flag("speed")
        .value("SPEED")
        .help("Speed in KPG")
        .from_str::<f64>();

Both the naming change and the order clarifies what this is and how its used. EDIT: Oh but the order is required for converting it into a parser. That is unfortunate.

I see in the description for Named, a -speed SPEED is being referred to as an argument. If there is a distinguishing name, normally I would expect it to be "option".

I also found the distinction between switch and flag confusing. It sounds like flag allows storing any type and switch is a specialization of a flag specifically for bools? One way of clarifying that relationship is by taking the text "Simple boolean flag" and adding a link so its Simple boolean [flag][Named::flag] as this makes it clear that "flag" is specifically referring to the same concept as the flag function.. Speaking of, it seems more intra-doc links could generally exist (I noticed long doesn't have them).

Some other quick naming thoughts

At this point I'm stopping my read through, so can't say whether there are more or if that is the extent of it.

pacak commented 2 years ago

I tend to look first at the types of a crate. In doing so, I see a "Named" and "Positional". I think just adding an Arg suffix would be a big help here (NamedArg, PositionalArg).

I didn't pay much attention to those names because users don't need to intact with them directly, but I see your point - documentation aspect is going to be better. Will rename on the next version bump.

long, without an antecedent feels incomplete.

Possibly, I tend to think of them as assembling them in terms of natural language: "i want extra spicy, with extra rice to the side, Kou Shui Ji" or whatever. Fits well with some languages, still valid in English. Putting everything into prefix leads to combinatorial explosion: we have switches, flags, required flags, positonals, etc. And some of them can be _os variants.

I'm open to suggestions to how to convey this idea better. Probably better blurb in a tutorial?

I see in the description for Named, a -speed SPEED is being referred to as an argument. If there is a distinguishing name, normally I would expect it to be "option".

https://docs.rs/bpaf/latest/bpaf/struct.Named.html

I tried to explain terminology here. Things with a name could be flags, switches or arguments. --speed SPEED is an argument because it takes a value. I think I tried to stick to terminology used by gnu getopts but I'm open to any suggestions.

I also found the distinction between switch and flag confusing. It sounds like flag allows storing any type and switch is a specialization of a flag specifically for bools?

Right, this is what it is. I'll update the docs, thank you for the suggestions.

From its name, I still don't feel like I know what req_flag does

Documentation tries to explain the behavior: https://docs.rs/bpaf/latest/bpaf/struct.Named.html#method.req_flag

It's a flag that only parses when it's actually present which allows composing them with construct([a, b, c]): for that to succeed any one of those flags needs to be present. You can build on top of that with .optional() or .many().

Since there's only a few names in the API I'm not trying to put full meaning into the name itself, rather making so they are easy to type. The alternative name I was thinking of is a required_flag but this one is more annoying to type and pronunciation rules to how to type it right might not be obvious for a non-native speaker. Will see if I can come up with anything better.

to_options always confuses me, in part because I miss the "s", in part because I think of Option since this is Rust, and in part because its a descriptor without the rest of the context.

Probably not the best possible name, I agree. We'll see if I can come up with something better.

Personally, I would rename Parser to SubParser and OptionsParser to Parser (I also always saw Parser and thought that meant top-level).

Hmm... Maybe, but this naming seems more natural in Haskell. Parser is a proper Applicative you can compose, while OptionParser is just an object you can run or transform a bit, Parser is a very common name and most of them are monads or applicatives. Need to check how other libs are handling this in Rust.

At this point I'm stopping my read through, so can't say whether there are more or if that is the extent of it.

Thank you for your feedback, much appreciated.

epage commented 2 years ago

Possibly, I tend to think of them as assembling them in terms of natural language:

The problem is that there is nothing prefacing long to give the reader context, so its backwards.

Putting everything into prefix leads to combinatorial explosion: we have switches, flags, required flags, positonals, etc. And some of them can be _os variants.

I wasn't thinking you needed factories for every combination. One route is to just use "flag" to refer to the named portion of the argument so you have short_flag and long_flag. I would probably do pos_arg for positionals. You could move the os to be a function on the Positional type, though you lose symmetry with argument / argument_os.

I tried to explain terminology here. Things with a name could be flags, switches or arguments. --speed SPEED is an argument because it takes a value. I think I tried to stick to terminology used by gnu getopts but I'm open to any suggestions.

Users easily get confused by terminology like this, even if it comes from common precedent. We found this frequently with clap and we've been iterating on it to have less need for there to be something that needs defining.

Also "argument" is a very overloaded word. I'm not sure which part of getopts documentation you were looking at but my quick glance at some it sounded like it talking in terms of being an argument muncher. When you separate from that context, "argument" is being used in multiple distinct ways and can be confusing.

Since there's only a few names in the API I'm not trying to put full meaning into the name itself, rather making so they are easy to type. The alternative name I was thinking of is a required_flag but this one is more annoying to type and pronunciation rules to how to type it right might not be obvious for a non-native speaker. Will see if I can come up with anything better.

Sounds like req_flag is the equivalent of clap's required. Can "arguments" and "positionals" not also be required? I think the fact that it had a qualifier on it and the qualifier was specifically about flags is what confuses me.

Hmm... Maybe, but this naming seems more natural in Haskell. Parser is a proper Applicative you can compose, while OptionParser is just an object you can run or transform a bit, Parser is a very common name and most of them are monads or applicatives. Need to check how other libs are handling this in Rust.

"natural in Haskell" should be a red flag :). In my working with CLIs across ecosystems, when I see a "parser" without any qualifiers, my immediate thought is "argument parser" or "the thing I use". This is why I made sure that clap's ValueParser (the equivalent of from_str, guard, map) has a qualifier on it.

pacak commented 2 years ago

The problem is that there is nothing prefacing long to give the reader context, so its backwards.

True, but usually context is within 2-3 words - you can have short name, long name, env and help. More if you start playing around with aliases. Derive macro uses the same approach - short, long, etc. I think even clap does that in derive: https://github.com/rosetta-rs/argparse-rosetta-rs/blob/main/examples/clap_derive-app/app.rs#L6

I'm simply trying to keep combinatoric and derive APIs close to each other when possible. Anyway, something to keep in mind.

One route is to just use "flag" to refer to the named portion of the argument so you have short_flag and long_flag

For the top level functions to create Named? Maybe. Adds more names to the API, breaks the symmetry between short('f').long("flag") and long("flag").short('f') and also with the derive API. But I see what you mean. Probably.

You could move the os to be a function on the Positional

Hmm.... :heart: I think I can totally make it work and do the same for positional too. Wasn't possible in pre 0.5.x which explain why there are _os and non os versions. Great idea :heart: loving it :heart:

I'm not sure which part of getopts documentation you were looking at but my quick glance at some it sounded like it talking in terms of being an argument muncher.

To be honest it comes from me reading it a while ago and mostly relying on my memory :)

Okay, doing one more pass over terminology is important then. Any reading you can recommend?

Can "arguments" and "positionals" not also be required?

They are required, but they also require a value. Flags/switches are optional by default. Present - parses as one value, absent - parses as a different value. req_flag gives you tri-state switches and similar scenarios when you want to get one out of several flags (value-less arguments?) from the user.

"natural in Haskell" should be a red flag :)

Hmm... I think you convinced me. Do you mind if I ping you to review the terminology/naming once I have it?

epage commented 2 years ago

They are required, but they also require a value. Flags/switches are optional by default. Present - parses as one value, absent - parses as a different value. req_flag gives you tri-state switches and similar scenarios when you want to get one out of several flags (value-less arguments?) from the user.

Ok, so req_flag is an alternative to flag / switch used when composing larger relationships. It could probably have a more explicit link (like literally a link) with flag with some further clarification (or wording tweaks).

Do you mind if I ping you to review the terminology/naming once I have it?

Go for it

epage commented 2 years ago

A little off topic but I've taken a crack at a new clap/bpaf comparison and would appreciate feedback to improve accuracy and reduce inherent bias.

Rather than a blog post or people only finding it when browsing one of our repos, I decided to explore putting it in the argparse-rosetta-rs repo as a more neutral and evergreen place.

maurges commented 2 years ago

Re method names: I personally found the short names fine, because it's the same way they are in haskell's optparse-applicative. In fact, I still haven't read the docs for bpaf and am still relying on my haskell knowledge. But your point that seeing just let opt = long("foo") is confusing does hold, since in haskell they are prefixed by option declaration.

epage commented 2 years ago

Re method names: I personally found the short names fine, because it's the same way they are in haskell's optparse-applicative.

I'd caution against the bias of "it worked this way in haskell" if the hope is to have a wider reach than Haskell developers as Haskell approaches to problems aren't always intuitive to the wider programming audience.

pacak commented 2 years ago

approaches to problems aren't always intuitive to the wider programming audience.

Right, I'm keeping this ticket open to collect thoughts from non Haskell users.

Also a question - is short('foo').help("bar").argument("name") that much different from items.iter().filter(|i| i % 2 == 0).sum()? In both cases you know you are starting a chain of some sort but you don't know what you'll get at at the end. That's where rust looking inspiration for parsing comes from at least.

I have a parser for interactive command line that uses more Haskell-like approach https://github.com/pacak/omnomnomicon/blob/master/src/tutorial.rs - I'll probably end up rewriting it before releasing.

scottlamb commented 2 years ago

I'm just learning of bpaf and haven't properly tried it out yet, but since you're soliciting fresh eyes:

Also a question - is short('foo').help("bar").argument("name") that much different from items.iter().filter(|i| i % 2 == 0).sum()?

IMHO, yes, the "rules" are looser for the latter in a couple ways:

My first impression is this isn't a deal breaker for using the crate or anything, but long isn't as clear as it could be.

ematipico commented 1 year ago

Here's some thoughts about names and APIs. No particular order:

It would be nice to have some APIs to stop the parsing of child types. Like serde with its skip macro. I've had a case where I have a big type, I would have preferred to stop the parsing/ignore subsequent data structures.

I hope it helps!

pacak commented 1 year ago

initially I didn't understand what was supposed to be the string for argument, it took me a bit to figure it out. I think it's worth some explanation a la ELI5

Will add something, good point.

flag doesn't work very well with Option<bool> and maybe it's meant to be that way

Problem with Option<bool> is that it can take 3 possible values - None, Some(true) and Some(false) and unless you really want to allow user to express all those values and deal with them yourself - you don't need it. If you do - you need to expose at least two different command line flags to user, let's say --optimize and --no-optimize with None being absence of those flags. For something as fancy as that you probably want a custom datatype anyway, something like this

#[derive(Bpaf, Debug, Clone)]
enum Optimization {
    #[bpaf(long("optimize"))]
    /// optimize
    Yes,
    #[bpaf(long("no-optimize"))]
    /// no optimize
    No,
    #[bpaf(skip)]
    Unspecified
}

And even that I'm not sure if it's needed, I'd use dual state + fallback so not to deal with Unspecified branch in the code. flag works with two possible states - value is present and value is absent.

I suspect the idea of making everything Option comes from other parsers that give you a huge product type of some sort and Option breaks it down into virtual sum types. With bpaf you just get what you are going to use...

It would be nice to have some APIs to stop the parsing of child types. Like serde with its skip macro.

Problem here is that bpaf needs to create a structure along with all the children out of command line arguments. I guess skip can mean that this field must implement Default.... Will try to add this.

I hope it helps!

It helps a lot, will try to update the documentation to address those points.

ematipico commented 1 year ago

Yeah I thought as much for Option<bool>, at the end the data structure I have needs to fit various cases/crates, so I had to be creative. I like the result though!

I guess skip can mean that this field must implement Default.... Will try to add this.

I believe that would work too! If the default value happens to be None and bpaf manages to stop the parsing of subsequent types, that would yield the result I desire. Although, I want to highlight that my case might be an exception, so I don't want to force my situation on the library!

The library is amazing: simple and efficient, good documentation and gives very useful features!

pacak commented 1 year ago

If the default value happens to be None and bpaf manages to stop the parsing of subsequent types, that would yield the result I desire.

FWIW you can already achieve this result by using pure or pure_with to define a parser that simply returns a value you want and external to plug this parser into the tree.