nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0
268 stars 128 forks source link

CLI docs have incorrect default values for options using `store_false` action #1585

Open victorlin opened 2 months ago

victorlin commented 2 months ago

Scope

Any CLI option that uses the store_false action, which are these options on version 25.2.0:

Issue

The CLI docs show a default value of True for these options. I noticed this because it was confusing on the CLI docs for augur filter, where --probabilistic-sampling and --no-probabilistic-sampling are supposed to be mutually exclusive:

image

augur filter --help text does not contradict, but there is no text provided for --no-probabilistic-sampling which still isn't great:

  --probabilistic-sampling
                        Allow probabilistic sampling during subsampling. This is useful when there are more groups than requested sequences. This option only applies when `--subsample-max-sequences` is
                        provided. (default: True)
  --no-probabilistic-sampling

and then I realized that the lack of help text may be intentional to avoid the same issue:

https://github.com/nextstrain/augur/blob/988380c0c65efcabcc6c87a7967b0e2bcc41a0fc/augur/refine.py#L125

Possible solutions

  1. Describe defaults in help text and avoid printing any defaults according to argparse. I think this is what Nextstrain CLI does for its arguments that use store_false such as nextstrain build --no-download.
  2. Use store_true and flip the boolean internally. Something similar was done in #613
corneliusroemer commented 2 months ago

I'm pretty sure I remember noticing this in the past - but I must have forgotten to make an issue. Thanks!

joverlee521 commented 2 months ago

Ah, I thought I fixed this issue for augur curate format-dates --no-mask-failure with https://github.com/nextstrain/augur/commit/4d02220719b66b5f775dfe23d71e81a44dd69699.

Looks okay in the CLI

augur curate format-dates -h
...
  --no-mask-failure     Do not mask dates with 'XXXX-XX-XX' and return original date string if date formatting failed. (default:
                        False)

But I'm seeing that this can be confusing in the docs Screenshot 2024-08-20 at 10 43 10 AM