Closed rggjan closed 1 year ago
Hey @rggjan, thanks for posting.
At the moment, if the default is True, then the action is "store_false" and vice-versa. It's kind-of up to the user to use a good name for the fields depending on the default value, which I'll admit might not be ideal.
If you want something that's 100% unambiguous, you can use something like this to force passing a value:
from simple_parsing.helpers import field
def flag(default: bool, *args, **kwargs):
return field(default=default, nargs=1, *args, **kwargs)
@dataclass
class Args:
water: bool = flag(True) # Do you want water?
I'm not sure what would be the best default behaviour (in terms of nargs) when adding simple boolean fields, but I think the current behaviour is alright for now.
Let me know what you think.
Thanks for the idea. I'm still not convinced this can be fixed with a better naming... can you give me a single example where the current behavior makes sense for a default which is True?
Because also calling:
@dataclass
class Args:
no_water: bool = True
...
if args.no_water:
print("no water!")
else:
print("water!")
with script.py --no-water
Will print:
water!
Which is equally unintuitive 😬
Also, I don't want to force passing the value, I'm perfectly fine with having a default.
This would be my proposal for more intuitive behavior:
water: bool = False
would result in
<empty command line>
water=False--water
water=True--no-water
water=False (not available now)--water True
water=True--water False
water=FalseWhile
water: bool = True
would result in
<empty command line>
water=True--water
water=True (currently results in water=False)--no-water
water=False (not available now)--water True
water=True--water False
water=Falsecan you give me a single example where the current behavior makes sense for a default which is True?
Good point!
For the default value, the main "culprit" is this this block here
For the generated --no-xyz
option:
Currently, fields all create only one argument here, in the DataclassWrapper
You'll notice that there was a mention of a other_default
(commented out above the line of the first link right above), which I was testing out a while back, for adding specifically what you were describing (--no-water
field).
However, I think it might be best to just add something like an argument with a store_false
action and a dest
referencing the first field in the DataclassWrapper
(right below line 103 I linked to above). Such an argument would have to only be added whenever we detect that the field has a boolean type and default value of True
.
Let me know what you think!
Thanks for the comments.
That makes sense. I wonder what the best way of changing this would be, since that would be a breaking change and people might rely on the current semantic (although I can't imagine it, really).
Yes, that would make sense. Although I wonder if it might even be useful for boolean fields with a default of False
for consistency, so changing a default value wouldn't change the allowed command line flags...
(In other words, always add a --no-water
flag, which would be equivalent to --water False
, no matter what default was set)
I actually came to create an issue and pull request for this as a bug, until I also saw that it was intended behavior. Given that boolean variables should almost always be named in the affirmative (1 2), it looked like bug behavior to me that "--my-flag" on a field that defaulted to True actually resulted in a negative value. I've attached a PR that tries to cover all cases. For example, perhaps you have a boolean flag that defaults to False now, so the cmdline flag is named --enable-x. Later on you change the default to True; it would be nice to not have to remove that flag from any scripts calling your script. Similarly, we'd like to retain support for short flags like verbose=False (turned on with simply -v).
+1 for auto-adding a --noflag
for each boolean --flag
.
This is how Google's absl FLAGS
works: https://abseil.io/docs/python/guides/flags
flags.DEFINE_boolean('debug', False, 'Produces debugging output.')
Accept:
--debug
(FLAGS.debug=True
)--nodebug
(FLAGS.debug=False
)Irrespectively of the default value
Exactly! Although I think I'd prefer the flag with a -
, like --no-debug
, instead of --nodebug
...
Sorry this has taken a while. The PR for this (#94) was getting quite stale, and I didn't like having a custom BoolAction class for it.
Ok I'll push this back on top of my stack of todos. In the meantime, if someone would like to chip in, I'd be happy to give them pointers on how this could be most easily implemented.
I might look into this as some Google internal tools automatically pass --nosome_flag
when setting some_flag=false
in configuration. So this is breaking us when using simple_parsing
.
In absl
, they have some wrapper around argparse
which add the --noflag
symbol: https://github.com/abseil/abseil-py/blob/a0ae31683e6cf3667886c500327f292c893a1740/absl/flags/argparse_flags.py#L222
In the meantime, if someone would like to chip in, I'd be happy to give them pointers on how this could be most easily implemented.
I'm interested. Do you have some pointers on this ?
Note that the absl.flags
argparse wrapper also use a custom _BooleanFlagAction
. What would be the alternative ?
Hey @Conchylicultor I'm sorry I didn't get back to you sooner.
Yeah I've got pointers to give, if you're still interested, and I'm also hoping to work on this soon.
The solution involves a few steps. First would be to change how the default
is generated in the FieldWrapper.default
property.
Be advised, the code for that class is pretty ugly and in dire need of a refactor.
Then, the second thing to fix is that when in the DataclassWrapper.add_arguments
method, the arg_options
of the FieldWrapperare fed to
argument_group.add_arguments, we'd insert a check, whereby if the fieldwrapper is for a boolean field, we create a mutually exclusive group of positive and negative arguments. The
arg_options` we have could be for the positive field. We'd have to add a new argument with the "--no-" prefix for the negative option, with the same destination.
LMK if that makes sense.
Having a boolean option with a default value of
True
:Creates this very confusing behaviour that passing the option (without a boolean value) disables it, so passing
enable_water
actually causesenable_water
to be set to False, which is not very intuitive.Even though this behaviour corresponds to
store_false
in the classic pythonArgumentParser
, it would be nicer to make this more clear. For example by changing the command line option automatically to--no_enable_water
(by prepending ano_
) when the default isFalse
, and then settingenable_water
to False when the option--no_enable_water
is passed.