pacak / bpaf

Command line parser with applicative interface
337 stars 17 forks source link

Command with `many + req_flag` and `adjacent` still seems to conflict with top-level flags? #325

Open polarathene opened 8 months ago

polarathene commented 8 months ago

I've tried to navigate the docs and look for other examples of how to approach this, or if it's valid to do at all.

adjacent annotation is meant to prevent the conflict by establishing a parsing scope where the unrelated flags cannot overlap? (_docs seem to suggest this is implicit for commands to consume everything afterwards too_)


Reproduction

My approach with an enum AFAIK implicitly uses the NamedArg::req_flag "Required Flag" which seemed appropriate for this functionality. I have a group of enum variants to collect from that are treated as switches.

use bpaf::Bpaf;
use strum::{EnumString};
std::collections::HashSet

#[derive(Bpaf)]
#[bpaf(options, version)]
struct Args {
  /// Port to bind
  #[bpaf(short, long, argument("PORT"), fallback(80), display_fallback)]
  port: u16,

  /// If present within the permitted set, add the required capability into the effective set
  aware: bool,

  #[bpaf(external, optional)]
  inspect: Option<Inspect>,
}

// adjacent here seems to have no effect?
#[derive(Bpaf, Debug)]
#[bpaf(command("inspect"), adjacent)]
/// Show the capabilities available to the process
struct Inspect {
  #[bpaf(external(capset), collect, group_help("Capability sets to display:"))]
  sets: HashSet<CapSet>,
}

#[derive(Debug, EnumString, Hash, Eq, PartialEq, Bpaf, Clone)]
enum CapSet {
  #[bpaf(short, long)]
  Ambient,
  #[bpaf(short, long)]
  Bounding
  #[bpaf(short, long)]
  Effective,
  #[bpaf(short, long)]
  Inheritable,
  #[bpaf(short, long)]
  Permitted,
}

fn main() {
    let args = args().run();
    if let Some(caps) = args.inspect {
      println!("{:?}", caps.sets);
    }
}

Initial failure

$ cargo run -- inspect -ebp

Error: capability-awaresupports -e as both an option and an option-argument, try to split -ebp into individual
options (-e -b ..) or use -e=bp syntax to disambiguate

image

Seems to be some formatting issues with the error:

If I follow the error advice, it is parsing the -p for inspect as the global/shared flag -p / --port, instead of --permitted:

$ cargo run -- inspect -e -b -p

Error: -p requires an argument PORT

As a workaround, I can avoid the short for --port, or introduce a 2nd subcommand (as --port + --aware are for the default command and have nothing to do with my inspect subcommand).

I get the impression that these command docs (OptionParser) might be trying to indicate the issue I'm running into and how to avoid it, similar to the common Output example I've seen elsewhere in the docs that suggests using adjacent.


New observations

# Not how I would want to use `--aware` / `--port`, but interesting error:
$ cargo run -- inspect -e -b -p --aware -p 42

Error: -p requires an argument PORT, got a flag --aware, try -p=--aware to use it as an argument

Using adjacent makes no difference to that, and I don't appear to be able to annotate adjacent anywhere else (fails to compile).

adjacent with subcommand inspect at the top of Args

# `adjacent` is useful here if `inspect` is positioned at the top of Args (otherwise no difference)
# #[bpaf(command("inspect"), adjacent)]
# With inspect at top of Args

# Regardless of `adjacent` annotation:

# subcommand `inspect` must come first like in Args struct:
$ cargo run -- --aware inspect -e -b -p
Error: -p requires an argument PORT
$ cargo run -- --aware inspect -e -b
Error: inspect is not expected in this context
$ cargo run -- --port 42 inspect -e -b
Error: inspect is not expected in this context

# Ambiguous:
$ cargo run -- inspect -e -b -p -p 42
Error: 42 is not expected in this context

# Out of curiousity, an unsupported arg:
$ cargo run -- inspect -e -b -p -k 42
Error: -k is not expected in this context

# Without `adjacent` annotation:

$ cargo run -- inspect -e -b -p --aware
Error: --aware is not expected in this context
$ cargo run -- inspect -e -b -p --port 42
Error: --port is not expected in this context

# With `adjacent` annotation to `Inspect` struct/command:

# Non-ambiguous switch signals context has changed:
$ cargo run -- inspect -e -b -p --aware -p 42
# success

# Non-ambiguous arg via long name:
$ cargo run -- inspect -e -b -p --port 42
# success

adjacent with subcommand inspect at the bottom of Args

Interestingly, port: u16 may have something to do with the behaviour/errors due to the fallback instead of an Option type?

I noticed that if I give --aware a short of -a, this is always being treated as --aware, even for inspect -a (unless -a was set prior already):

# These have `inspect` at the bottom of Args, with `adjacent` annotation set on Inspect struct

# aware is set as expected:
$ cargo run -- -a inspect -e -b
{Effective, Bounding}

# No conflict with aware, and working:
$ cargo run -- -a inspect -eba
{Ambient, Effective, Bounding}

# aware is true, while ambient is not set (bad):
$ cargo run -- inspect -eba
{Effective, Bounding}
$ cargo run -- inspect -e -a
{Effective}

# Fails to parse when repeating the short:
$ cargo run -- inspect -ebaa
Error: argument -a cannot be used multiple times in this context

# Is it kind of detecting the issue here when flags are split apart?
$ cargo run -- inspect -e -a -a
Error: flag -a is not valid in this context, did you mean to pass it to command inspect?

# Non-conflicting can be used multiple times though?:
$ cargo run -- inspect -ebb
{Effective, Bounding}
$ cargo run -- inspect -e -b -b
{Effective, Bounding}

# Without `adjacent` annotation set:

# aware is also true in both cases
$ cargo run -- inspect -e -a -a
{Effective, Ambient}
$ cargo run -- inspect -eaa
{Effective, Ambient}

That sort of behaviour with adjacent seems worse? port / aware are still grabbed eagerly despite being defined in Args earlier?

Due to aware being a switch, it is more subtle than the failure I had with port (which while optional, assumes a value when specified)


Help output

If helpful for context :)

$ cargo run -- -h

Usage: capability-aware [-p=PORT] [--aware] [COMMAND ...]

Available options:
    -p, --port=PORT  Port to bind
                     [default: 80]
        --aware      If present within the permitted set, add the required capability into the effective set
    -h, --help       Prints help information
    -V, --version    Prints version information

Available commands:
    inspect          Show the capabilities available to the process

$ cargo run -- inspect -h

Show the capabilities available to the process

Usage: capability-aware inspect (-a | -b | -e | -i | -p)...

Capability sets to display:
    -a, --ambient
    -b, --bounding
    -e, --effective
    -i, --inheritable
    -p, --permitted

Available options:
    -h, --help         Prints help information

It seems the annotation is working like this adjacent (ParseArgument) instead of this adjacent (ParseCon)?

The intent was to allow the subcommand to map a set of switches to unique enum variants that will be iterated through.

The solution I came up with above is a bit awkward looking, so it might be the wrong way about it? I could try it with bool switches and add some logic to create the HashSet with enum variants manually, but I'm not sure if that'd look or work any better 😅

pacak commented 8 months ago

There's a lot to unpack here, I'll go over each point at a slower pace and make corresponding tickets. adjacent in the current state is not perfect but your post gave me some ideas to how to improve it.

What's your final goal you are trying to achieve? Add an optional command inspect (-a | -b | -e | -i | -p)... that takes some extra optional flags and collects them into a set?

polarathene commented 8 months ago

Add an optional command inspect (-a | -b | -e | -i | -p)... that takes some extra optional flags and collects them into a set?

Yes. I'll have a github repo with the small project available soon, if you'd like I can link it here when ready.

What's your final goal you are trying to achieve?

For the enum related feature, each variant represents a different Linux capability set.

I wanted the inspect subcommand to default as a fallback to all the variants (presently it's an empty collection and I handle that myself afterwards), but allow for the user to use flags for the variants to only list the capabilities of requested sets.

fn main() {
    let args = args().run();

    if let Some(Inspect { mut sets }) = args.inspect {
      // Default fallback, strum feature used to get a collection of all variants
      if sets.len() == 0 {
        sets = CapSet::iter().collect();
      }

      // Call enum method to display the capabilities of each set (variant):
      for set in sets {
        set.list();
      }
    }
}
$ cargo run -- inspect -bep

Bounding: ["CAP_AUDIT_WRITE", "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_FOWNER", "CAP_FSETID", "CAP_KILL", "CAP_MKNOD", "CAP_NET_BIND_SERVICE", "CAP_NET_RAW", "CAP_SETFCAP", "CAP_SETGID", "CAP_SETPCAP", "CAP_SETUID", "CAP_SYS_CHROOT"]
Effective: []
Permitted: []

The features of bpaf work really well for that, reducing extra effort to map a struct of bool switches to each variant and produce a collection manually 👍


I have chosen to avoid short usage with --port + --aware which avoids the buggy behaviour examples I provided.

If I want the shorts to work, I'll refactor to a command enum for Args and move those two flags to their own subcommand which will also avoid the parser getting confused (especially with that inspect -a example, that instead set --aware instead of --ambient).

polarathene commented 8 months ago

Off-topic bpaf feedback

It would be neat if the enum itself could be annotated with defaults, since you can see that I repeated the short/long annotation for each variant. Just a minor cosmetic improvement, not too important :)

#[bpaf(short, long)]
enum CapSet {
  Ambient,
  Bounding
  Effective,
  Inheritable,
  Permitted,
}

Likewise, while trying to understand how to implement this with bpaf through the docs, I had trouble with defining the inspect command and getting that collection of req_flag (implicit) setup.

I get the impression that the intermediary struct Inspect may not be necessary, but recall I could not set command("inspect") within Args and refer to the enum for req_flag functionality? Might just be a documentation issue with Derive (I haven't tried combinatoric yet).

struct Args {
  // ...

  #[bpaf(external, optional)]
  inspect: Option<Inspect>,
}

#[bpaf(command("inspect"), adjacent)]
/// Show the capabilities available to the process
struct Inspect {
  #[bpaf(external(capset), collect, group_help("Capability sets to display:"))]
  sets: HashSet<CapSet>,
}

Then again, the inspect command is marked optional, while the HashSet can be an empty collection. If it were possible to flatten out the Inspect struct, the optional annotation may not communicate clearly if it's associated to the command or the collection? (EDIT: It doesn't seem like it's valid to use command annotation on a struct field, even without external)

Likewise for the command help vs group_help for the enum? (_EDIT: I can place a doc comment on the enum as equivalent to group_help annotation_)

I tried:

struct Args {
  // ...

  #[bpaf(command("inspect"), optional, collect)]
  sets: HashSet<Set>,
}

but it fails with:

error: Unexpected attribute in field annotation

23 |   #[bpaf(command("inspect"), optional, collect)]
   |          ^^^^^^^

Some doc examples show the annotation on structs, while others have them on individual enum variants.

pacak commented 8 months ago
23 |   #[bpaf(command("inspect"), optional, collect)]

The way I think about those attributes is that they are chained sequentially: command("inspect").optional().collect() with some extra smarts added on top of that, command cannot be an attribute on a field since, at least at the moment.

Here's my go at implementing inspect command

//! Add an optional command inspect (-a | -b | -e | -i | -p)... that takes some extra optional flags and collects them into a set

use bpaf::*;
use std::collections::BTreeSet;

// start by making a set of arguments parser will be used for capability set
#[derive(Debug, Bpaf, Clone, Eq, PartialEq, PartialOrd, Ord)]
enum CapSet {
    #[bpaf(short, long)]
    Ambient,
    #[bpaf(short, long)]
    Bounding,
    #[bpaf(short, long)]
    Effective,
    #[bpaf(short, long)]
    Inheritable,
    #[bpaf(short, long)]
    Permitted,
}

// Inspect command refers to capabilities using `external` - parses a single flag, then runs it as
// for as long as inner parser (here - cap_set) keeps producing values, collects results into a set
//
// Parser contains command annotation on top to make it into a subcommand parser
#[derive(Bpaf, Debug, Clone)]
#[bpaf(command)]
struct Inspect {
    #[bpaf(external, collect)]
    cap_set: BTreeSet<CapSet>,
}

#[derive(Debug, Clone, Bpaf)]
#[bpaf(options)]
struct Options {
    // regular flag :)
    #[bpaf(fallback(8080))]
    port: usize,

    // inspect gets parsed from an external command, here - inspect()
    // extracts cap_set fields with `map`
    // and makes the whole thing optional if it fails
    #[bpaf(external, map(|f| f.cap_set), optional)]
    inspect: Option<BTreeSet<CapSet>>,
}

fn main() {
    println!("{:?}", options().run());
}
pacak commented 8 months ago

The only thing it doesn't do is populating a set with all the capabilities when none passed. Ideally it should be done with a polymorphic variation of some which isn't yet implemented - mostly due to historical reasons, should be possible to add it.

But you can perform this operation either when extracting cap_set field - map can contain arbitrary expressions, or when parsing Inspect structure - by adding a map after collect

pacak commented 8 months ago

Should include a space between the binary name and supports

This was a bug, fixed.

options (-e -b ..) line break seems unintentional? Doesn't seem to matter what terminal width is (WSL2 Terminal)

Getting terminal width, especially in a cross-platform way requires a bunch of external dependencies and doesn't add a lot of values so I'm assuming terminal width to be 100 and use that. Linebreak you see comes from the error message being adjusted to fit into that width.

pacak commented 8 months ago

I get the impression that these command docs (OptionParser) might be trying to indicate the issue I'm running into and how to avoid it, similar to the common Output example I've seen elsewhere in the docs that suggests using adjacent.

adjacent for command exists mostly to allow you to chain multiple commands sequentially and parsers from outside of subcommands are still going to be able to consume flags that technically belong to a subcommand - like -p in your case. This is partially intentional so you can add things like --verbose and the end without having to scroll before the subcommand and partially because I don't see an efficient way of implementing it the other way around. Not sharing names is a way to go.

pacak commented 8 months ago

It would be neat if the enum itself could be annotated with defaults, since you can see that I repeated the short/long annotation for each variant. Just a minor cosmetic improvement, not too important :)

This is similar to https://github.com/pacak/bpaf/issues/178 and suffers from the same problem, though not immediately. I'm already using short and long on the top level to add short/long aliases to command and behavior and what's accepted changes when you make outer enum to a command, like so:

#[bpaf(command, short, long)]
enum CapSet {
  Ambient,
  ...
}

I might get away with adding something like #[bpaf(each_child(short, long))], but that's still adding a bunch more complexity to rules without adding a whole lot of value.

polarathene commented 8 months ago

Getting terminal width, especially in a cross-platform way requires a bunch of external dependencies and doesn't add a lot of values so I'm assuming terminal width to be 100 and use that

You could use a feature gate if that was a concern? Then just document what the rough overhead in release binary weight is or build time if that is worth drawing attention to? (like done here for CLI arg parsing crates)

bpaf is a bit weighty already in comparison to the lightweight ones, so that may not be a concern for users of bpaf who may prefer it for the DX / features or improved UX for their audience (which this would benefit).


adjacent for command exists mostly to allow you to chain multiple commands sequentially and parsers from outside of subcommands are still going to be able to consume flags that technically belong to a subcommand - like -p in your case.

Oh, that was kinda weird then. I guess it could be better communicated within the context of command + adjacent annotations?

As the command docs I referenced describe consuming everything to the right, that was my expectation. I added adjacent based on it's description to better enforce scope boundaries, since depending on position of my command in the struct, it affected the scope, yet as shown did not really help prevent inspect -a being treated as -a inspect (when I made --aware a short) - although adjacent did affect it inconsistently (compared to other shorts that did not overlap/conflict).


This is partially intentional so you can add things like --verbose and the end without having to scroll before the subcommand and partially because I don't see an efficient way of implementing it the other way around. Not sharing names is a way to go.

Fair enough, I don't have a --verbose / -v in my case and noticed that version provided -V as the short, presumably because -v is commonly used enough for --verbose short and keeping -V for version regardless maintains consistency as a default 😅

Is it possible to emit a warning if bpaf can detect this sort of conflict is possible? I'm not familiar if clippy is pluggable, but might be a good DX there with a link to documentation about the scenario advising caution with potential workarounds / gotchas.

polarathene commented 8 months ago

I might get away with adding something like #[bpaf(each_child(short, long))], but that's still adding a bunch more complexity to rules without adding a whole lot of value.

In my experience, when a command name is inferred and gets annotated only to lowercase it, that's likely a preference throughout the config. (EDIT: For some reason I thought for sure this was weirdly not lower-case by default 😝 )

Similar with usage here, I'm overriding the implicit default. So another way to look at it might be a way to configure the defaults? For the derive API, perhaps that could even be a struct that the top-level derived struct could refer to by annotation? 🤷‍♂️ I'm not sure how the internals work, so if that's too much trouble no worries 👍

When I explored a refactor by using an enum to represent two separate commands, it looked a bit more "noisy", but I'm not sure if you could improve on that much either?:

#[derive(Bpaf)]
pub enum SubCommands {
  /// Bind to a port
  #[bpaf(command)]
  Bind(
    #[bpaf(external(bind_args))]
    BindArgs
  ),

  /// Show the capabilities available to the process
  #[bpaf(command)]
  Inspect(
    #[bpaf(external(cap_set), collect, parse(with_fallback))]
    HashSet<CapSet>
  ),
}

I know I can collapse the inner variant value with annotation before it into a single line. I don't think that helps with readability though 😅

When it's only needing annotation for the direct inner value perhaps something like this could be viable? (for BindArgs, I have a separate struct annotating fields separately, but external is necessary?)

#[derive(Bpaf)]
pub enum SubCommands {
  #[bpaf(
    command,
    inner(external(bind_args))
  )]
  /// Bind to a port
  Bind(BindArgs),

  #[bpaf(
    command,
    inner(external(cap_set), collect, parse(with_fallback))
  )]
  /// Show the capabilities available to the process
  Inspect(HashSet<CapSet>),
}

and with the command annotation hoisting:

#[derive(Bpaf)]
#[bpaf(all(command))]
pub enum SubCommands {
  #[bpaf( inner(external(bind_args)) )]
  /// Bind to a port
  Bind(BindArgs),

  #[bpaf( inner(external(cap_set), collect, parse(with_fallback)) )]
  /// Show the capabilities available to the process
  Inspect(HashSet<CapSet>),
}

all is probably a good choice to express apply that annotation to all direct children?

I think that looks nicer. inner would still be necessary to allow overrides to all (_such as in the req_flag derive example to skip annotating an enum variant_).

For the enum group of switches (req_flags?):

enum CapSet {
    #[bpaf(short, long)]
    Ambient,
    #[bpaf(short, long)]
    Bounding,
    #[bpaf(short, long)]
    Effective,
    #[bpaf(short, long)]
    Inheritable,
    #[bpaf(short, long)]
    Permitted,
}

it'd also look much nicer:

#[bpaf(all(short, long))]
enum CapSet {
    Ambient,
    Bounding,
    Effective,
    Inheritable,
    Permitted,
}

If that's adding too much complexity for bpaf don't worry about it, this is just cosmetic to the DX only :)

pacak commented 8 months ago

You could use a feature gate if that was a concern?

It's about compilation time. You need to pull at least libc and more. I don't think there's any other cli parser crates out there that do it, at least cargo happily wraps the lines if your terminal is slightly narrower than its output.

Is it possible to emit a warning if bpaf can detect this sort of conflict is possible?

It should be possible, but having different flags to use the same name is a feature. I started working on bpaf when I needed to modify existing program to accept an optional value to an existing flag: before program was accepting -o which was parsed as bool, the idea was to make it to accept -o as well as -o foo.txt and parse that as Option<String>. To avoid combinatoric explosions API lets you make separate parsers for -o and -o foo.txt and combine them. Warning defeats the purpose.

If that's adding too much complexity for bpaf don't worry about it, this is just cosmetic to the DX only :)

This doesn't add too much complexity to the derive macro itself, but makes API I must teach users to use more complex and introduces a bunch of corner cases I'm trying to avoid...

polarathene commented 8 months ago

It's about compilation time. You need to pull at least libc and more.

I thought that was the purpose of cargo feature flags? To only install / compile deps when the related opt-in features are enabled?

I don't mind it, I just found it a bit odd when my terminal had plenty of width at runtime, or too short of width that the text was always split to a new line at a fixed length (with the terminal wrapping to multi-line if it got too short during a resize afterwards, yet the fixed length split making that awkward regardless of terminal width).

If it wasn't clear in my example, there is a enforced linebreak after "individual", even when I run the command with enough width to fit all on a single line.


having different flags to use the same name is a feature.

Except when you get the experience like my example of a subcommand using the flag and it triggering the flag intended only before the subcommand? 🤔

Warning defeats the purpose.

In that case they're still using -o in the same scope? So I'd agree 🤷‍♂️

By conflicting I meant like the --aware vs --ambient short -a bug example, which if that sort of thing can be detected a warning might be helpful?

Perhaps it's less of an issue with combinatoric API, but seems easy enough to run into with Derive API?


makes API I must teach users to use more complex

I would disagree, it's already complex enough that it feels like guess work for where to place the annotation at times?

I don't think the docs call out:

Given that, I think the all annotation is fairly clear to communicate as a convenience annotation and the benefit it offers with the req_flag enum example? Less so with the subcommands, but still somewhat useful and DRY.

inner annotation suggestion, similarly is a convenience too. I'd argue it is similar to external in complexity, perhaps less so. The usage and benefit is easy to communicate with a before/after example like I've done. Reduces that nested annotation.

Here's a Clap example:

See how the enum of subcommands is much nicer?


introduces a bunch of corner cases I'm trying to avoid...

That's completely fair 👍

This sort of project is not simple, I really appreciate your efforts! ❤️

If bpaf doesn't suit a project for me, that's fine. I haven't tried the combinatoric API, but imagine that is a strength of bpaf. You don't need to cater to all audiences (I tried argh earlier, but it was a bit limited for what I wanted so arrived at bpaf and was happy).