Open joverlee521 opened 9 months ago
I briefly explored whether I could recreate Cornelius' script with csvtk mutate2, but ran into an error:
$ csvtk -t mutate2 -e ' $country + "/" + $accession + "/" + $date ' -n strain_display -s monkeypox-metadata.tsv
[ERRO] Cannot transition token types from MODIFIER [+] to TIME [2007-10-30 00:00:00 -0700 PDT]
Edit: csvtk also converts dates to floats. This behavior will not change until the underlying evaluation package is updated.
Context
Following the naming pattern set in SARS-CoV-2 sequences, strain names are usually `<country>/<sample_id>/<year>`. All three fields are typically available in the metadata, so we can concatenate them to "build" a reasonable strain name.
Description
We could extend the existing `augur curate transform-strain-name` to accept input columns that are concatenated with a provided separator.
Examples
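As a rough illustration of the proposed behavior, here is a minimal Python sketch of joining metadata columns with a separator. It is not the augur implementation: the function names `build_strain_name` and `add_strain_names`, their parameters, and the sample record are all hypothetical. Only the `strain_display` column name and the `country`, `accession`, and `date` fields come from the csvtk command above. Because `csv.DictReader` returns every value as a string, the date column is joined as written rather than being parsed.

```python
import csv
import io


def build_strain_name(record, fields, separator="/"):
    """Join the values of `fields` from one record with `separator`.

    Hypothetical helper for illustration; raises if any field is empty or absent.
    """
    missing = [field for field in fields if not record.get(field)]
    if missing:
        raise ValueError(f"Missing value(s) for: {', '.join(missing)}")
    return separator.join(record[field] for field in fields)


def add_strain_names(tsv_text, fields, new_column="strain_display", separator="/"):
    """Read TSV text and return its rows with an added concatenated column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        # All values are plain strings here, so dates are never converted.
        record[new_column] = build_strain_name(record, fields, separator)
        rows.append(record)
    return rows


# Made-up sample metadata, for illustration only.
example_tsv = "accession\tcountry\tdate\nACC0001\tCountryA\t2007-10-30\n"
rows = add_strain_names(example_tsv, ["country", "accession", "date"])
print(rows[0]["strain_display"])
```

Raising on a missing field is one possible design choice; a real command might instead skip the record or fall back to an existing strain name.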