nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0
268 stars 128 forks source link

Improve help text of CLI options that use `extend` and `ExtendOverwriteDefault` actions #1654

Open joverlee521 opened 1 month ago

joverlee521 commented 1 month ago

Initially discussed in https://github.com/nextstrain/augur/pull/1653#discussion_r1805272627.

The default behavior for the extend action in argparse is to extend the defaults. We have a custom ExtendOverwriteDefault action that overwrites the defaults instead of extending them, but this custom action does not change the output of the CLI help docs.

Comparing --metadata-delimiters which uses ExtendOverwriteDefault and --expected-date-formats which uses the default extend action from augur curate format-dates. The help text does not differentiate the extend vs overwrite behavior.

--metadata-delimiters METADATA_DELIMITERS [METADATA_DELIMITERS ...]
                        Delimiters to accept when reading a plain text metadata file. Only one delimiter will be inferred. (default: (',', '\t'))
--expected-date-formats EXPECTED_DATE_FORMATS [EXPECTED_DATE_FORMATS ...]
                        Expected date formats that are currently in the provided date fields, defined by standard format codes as listed at
                        https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes. If a date string matches multiple formats, it will be parsed as the
                        first matched format in the provided order. (default: ['%Y-%m-%d', '%Y-%m-XX', '%Y-XX-XX', 'XXXX-XX-XX'])
victorlin commented 1 month ago

I think the current text is ambiguous for both ExtendOverwriteDefault and extend. How about something like this? (Note: properly rendering newline requires #1630)

 --metadata-delimiters METADATA_DELIMITERS [METADATA_DELIMITERS ...]
                         Delimiters to accept when reading a plain text metadata file. Only one delimiter will be inferred.
+                        Specified values will override the default list.
                         (default: (',', '\t'))
 --expected-date-formats EXPECTED_DATE_FORMATS [EXPECTED_DATE_FORMATS ...]
                         Expected date formats that are currently in the provided date fields, defined by standard format codes as listed at
                         https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes. If a date string matches multiple formats, it will be parsed as the
                         first matched format in the provided order.
+                        Specified values will extend the default list.
                         (default: ['%Y-%m-%d', '%Y-%m-XX', '%Y-XX-XX', 'XXXX-XX-XX'])
victorlin commented 1 month ago

This makes me wonder why action="extend" is used over ExtendOverwriteDefault.

In all other usages besides --expected-date-formats, there is no default so it doesn't matter whether ExtendOverwriteDefault or extend is used and I think extend is used out of convenience.

For --expected-date-formats, the list ['%Y-%m-%d', '%Y-%m-XX', '%Y-XX-XX', 'XXXX-XX-XX'] might as well be hardcoded elsewhere in the code and the option be called something like --additional-expected-date-formats.

tsibley commented 1 month ago

Not being able to override the defaults for --expected-date-formats seems like a bug, or at least an oversight.