Open corneliusroemer opened 5 months ago
Short-term: I would call this `--metadata-date-columns`, accepting multiple values and using the first that's available. That would maintain consistency with `--metadata-id-columns`/`--metadata-delimiters`. The new option should be available wherever those are available.
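To make the "first that's available" behavior concrete, here is a minimal sketch of how a list of candidate date columns could be resolved against a metadata header. The function name `resolve_date_column` and the error message are hypothetical, chosen for illustration; this is not Augur's actual implementation.

```python
from typing import Iterable, Sequence


def resolve_date_column(available: Iterable[str], candidates: Sequence[str]) -> str:
    """Return the first candidate column that exists in the metadata header.

    Hypothetical helper: candidates are checked in the order given,
    so earlier names take priority over later ones.
    """
    available_set = set(available)
    for name in candidates:
        if name in available_set:
            return name
    raise ValueError(
        f"None of the possible date columns {list(candidates)!r} "
        "were found in the metadata."
    )


# Example: the header has no "date" column, so the second candidate is used.
header = ["accession", "collection_date", "release_date"]
chosen = resolve_date_column(header, ["date", "collection_date"])
print(chosen)
```

Raising an error when no candidate matches, rather than silently continuing, is one possible design choice; a real implementation might prefer a different failure mode.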
Long-term: The need to specify metadata parameters for every augur subcommand is a bit tedious (example: https://github.com/nextstrain/mpox/commit/927ad6cdf0f7e96384ab8a53f87aee7b5c4e658b) and prone to human error when updating. Under expected usage of Augur, it's likely the case that the same metadata parameters will be used across all commands in a given project/workflow. Configuration through environment variables can reduce duplication. Something like:
export AUGUR_METADATA_DELIMITER=';'
export AUGUR_METADATA_ID_COLUMN=accession
export AUGUR_METADATA_DATE_COLUMN=collection_date
# Now these don't need to specify --metadata-delimiters, --metadata-id-columns, --metadata-date-columns
augur filter …
augur traits …
augur refine …
augur export v2 …
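One way this could work internally is to use the environment variable as the argument's default, so an explicit command-line flag still takes precedence. The sketch below assumes the variable names proposed above; the `build_parser` function, the `example` program name, and the fallback values `date` and `strain` are placeholders for illustration, not Augur's actual code or defaults.

```python
import argparse
import os
from typing import Mapping, Optional


def build_parser(env: Optional[Mapping[str, str]] = None) -> argparse.ArgumentParser:
    """Build a parser whose metadata defaults come from environment variables.

    Precedence: command-line flag, then environment variable, then a
    placeholder built-in default.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--metadata-date-columns",
        nargs="+",
        # Hypothetical variable name taken from the proposal above.
        default=[env.get("AUGUR_METADATA_DATE_COLUMN", "date")],
    )
    parser.add_argument(
        "--metadata-id-columns",
        nargs="+",
        default=[env.get("AUGUR_METADATA_ID_COLUMN", "strain")],
    )
    return parser


# With the variable set and no flag given, the environment value is used.
env = {"AUGUR_METADATA_DATE_COLUMN": "collection_date"}
args = build_parser(env).parse_args([])
print(args.metadata_date_columns)

# An explicit flag overrides the environment.
args = build_parser(env).parse_args(["--metadata-date-columns", "date"])
print(args.metadata_date_columns)
```

Passing the environment in as a mapping keeps the sketch easy to test without modifying `os.environ`.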
Context
When working with data straight from NCBI, the collection date column is called `Isolate Collection date` rather than `date`. We also often call it `collection_date` to distinguish it from `release_date` or `update_date`. It would be nice if one could configure the date column via an argument, similar to `--metadata-id-columns`, e.g. `--collection-date-column="Isolate Collection date"`.