lebrice / SimpleParsing

Simple, Elegant, Typed Argument Parsing with argparse
MIT License
401 stars 50 forks source link

Confusing command line options for bool's with `default=True` #68

Closed rggjan closed 1 year ago

rggjan commented 3 years ago

Having a boolean option with a default value of True:

from dataclasses import dataclass

from simple_parsing import ArgumentParser

@dataclass
class Args:
    enable_water: bool = True  # Do you want water?

parser = ArgumentParser()
parser.add_arguments(Args, dest="args")

print(parser.parse_args(["--enable_water"]))
# => Namespace(args=Args(enable_water=False))

Creates this very confusing behaviour that passing the option (without a boolean value) disables it, so passing enable_water actually causes enable_water to be set to False, which is not very intuitive.

Even though this behaviour corresponds to store_false in the classic python ArgumentParser, it would be nicer to make this more clear. For example by changing the command line option automatically to --no_enable_water (by prepending a no_) when the default is False, and then setting enable_water to False when the option --no_enable_water is passed.

lebrice commented 3 years ago

Hey @rggjan, thanks for posting.

At the moment, if the default is True, then the action is "store_false" and vice-versa. It's kind-of up to the user to use a good name for the fields depending on the default value, which I'll admit might not be ideal.

If you want something that's 100% unambiguous, you can use something like this to force passing a value:

from simple_parsing.helpers import field

def flag(default: bool, *args, **kwargs):
    return field(default=default, nargs=1, *args, **kwargs)

@dataclass
class Args:
    water: bool = flag(True)  # Do you want water?

I'm not sure what would be the best default behaviour (in terms of nargs) when adding simple boolean fields, but I think the current behaviour is alright for now.

Let me know what you think.

rggjan commented 3 years ago

Thanks for the idea. I'm still not convinced this can be fixed with a better naming... can you give me a single example where the current behavior makes sense for a default which is True?

Because also calling:

@dataclass
class Args:
  no_water: bool = True

...

if args.no_water:
  print("no water!")
else:
  print("water!")

with script.py --no-water

Will print: water!

Which is equally unintuitive 😬

rggjan commented 3 years ago

Also, I don't want to force passing the value, I'm perfectly fine with having a default.

This would be my proposal for more intuitive behavior:

water: bool = False

would result in

While

water: bool = True

would result in

lebrice commented 3 years ago

can you give me a single example where the current behavior makes sense for a default which is True?

Good point!

  1. For the default value, the main "culprit" is this this block here

  2. For the generated --no-xyz option: Currently, fields all create only one argument here, in the DataclassWrapper

    You'll notice that there was a mention of a other_default (commented out above the line of the first link right above), which I was testing out a while back, for adding specifically what you were describing (--no-water field).

    However, I think it might be best to just add something like an argument with a store_false action and a dest referencing the first field in the DataclassWrapper (right below line 103 I linked to above). Such an argument would have to only be added whenever we detect that the field has a boolean type and default value of True.

    Let me know what you think!

rggjan commented 3 years ago

Thanks for the comments.

  1. That makes sense. I wonder what the best way of changing this would be, since that would be a breaking change and people might rely on the current semantic (although I can't imagine it, really).

  2. Yes, that would make sense. Although I wonder if it might even be useful for boolean fields with a default of False for consistency, so changing a default value wouldn't change the allowed command line flags...

rggjan commented 3 years ago

(In other words, always add a --no-water flag, which would be equivalent to --water False, no matter what default was set)

haydenflinner commented 2 years ago

I actually came to create an issue and pull request for this as a bug, until I also saw that it was intended behavior. Given that boolean variables should almost always be named in the affirmative (1 2), it looked like bug behavior to me that "--my-flag" on a field that defaulted to True actually resulted in a negative value. I've attached a PR that tries to cover all cases. For example, perhaps you have a boolean flag that defaults to False now, so the cmdline flag is named --enable-x. Later on you change the default to True; it would be nice to not have to remove that flag from any scripts calling your script. Similarly, we'd like to retain support for short flags like verbose=False (turned on with simply -v).

Conchylicultor commented 2 years ago

+1 for auto-adding a --noflag for each boolean --flag.

This is how Google's absl FLAGS works: https://abseil.io/docs/python/guides/flags

flags.DEFINE_boolean('debug', False, 'Produces debugging output.')

Accept:

Irrespectively of the default value

rggjan commented 2 years ago

Exactly! Although I think I'd prefer the flag with a -, like --no-debug, instead of --nodebug...

lebrice commented 2 years ago

Sorry this has taken a while. The PR for this (#94) was getting quite stale, and I didn't like having a custom BoolAction class for it.

Ok I'll push this back on top of my stack of todos. In the meantime, if someone would like to chip in, I'd be happy to give them pointers on how this could be most easily implemented.

Conchylicultor commented 1 year ago

I might look into this as some Google internal tools automatically pass --nosome_flag when setting some_flag=false in configuration. So this is breaking us when using simple_parsing.

In absl, they have some wrapper around argparse which add the --noflag symbol: https://github.com/abseil/abseil-py/blob/a0ae31683e6cf3667886c500327f292c893a1740/absl/flags/argparse_flags.py#L222

In the meantime, if someone would like to chip in, I'd be happy to give them pointers on how this could be most easily implemented.

I'm interested. Do you have some pointers on this ?

Note that the absl.flags argparse wrapper also use a custom _BooleanFlagAction. What would be the alternative ?

lebrice commented 1 year ago

Hey @Conchylicultor I'm sorry I didn't get back to you sooner.

Yeah I've got pointers to give, if you're still interested, and I'm also hoping to work on this soon.

The solution involves a few steps. First would be to change how the default is generated in the FieldWrapper.default property. Be advised, the code for that class is pretty ugly and in dire need of a refactor.

Then, the second thing to fix is that when in the DataclassWrapper.add_arguments method, the arg_options of the FieldWrapperare fed toargument_group.add_arguments, we'd insert a check, whereby if the fieldwrapper is for a boolean field, we create a mutually exclusive group of positive and negative arguments. Thearg_options` we have could be for the positive field. We'd have to add a new argument with the "--no-" prefix for the negative option, with the same destination.

LMK if that makes sense.