nextstrain / pathogen-repo-guide


ingest/nextclade: Replace `csvtk` #55

Closed joverlee521 closed 1 month ago

joverlee521 commented 3 months ago

Prompted by https://github.com/nextstrain/pathogen-repo-guide/pull/54#issuecomment-2229193810

Taking a closer look at the use of csvtk in nextclade.smk:

https://github.com/nextstrain/pathogen-repo-guide/blob/06e6f275daca928ba0d39f49b621a8aadfc0cff2/ingest/rules/nextclade.smk#L78-L85

Seems like we can replace the two csvtk commands in the pipeline with tsv-select and augur curate rename so that we are not blocked by https://github.com/shenwei356/csvtk/issues/283.

joverlee521 commented 3 months ago

Replacing csvtk rename2 with augur curate rename will take a little extra munging. Two options:

  1. Move the field map from nextclade_field_map.tsv into nextclade_config.yaml.
  2. Add a new option to augur curate rename that accepts a TSV file for the field map.
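
To illustrate the munging involved, here is a minimal Python sketch, not code from this repo. It assumes the field map is a two-column TSV (old name, new name) that may contain blank lines and `#` comment lines, and that augur curate rename takes its mapping as `old=new` strings via `--field-map`; worth checking both against the actual file and the augur docs. The function names and the `seqName` to `accession` mapping are hypothetical.

```python
import csv
import io


def read_field_map(tsv_text: str) -> dict[str, str]:
    """Parse two-column TSV text (old name, new name) into a dict.

    Assumption: blank lines and lines starting with '#' carry no mapping.
    """
    field_map = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        # csv.reader yields [] for blank lines; comments are assumed to start with '#'.
        if not row or row[0].startswith("#"):
            continue
        # Raises ValueError if a row has fewer than two columns.
        old_name, new_name = row[:2]
        field_map[old_name] = new_name
    return field_map


def to_rename_args(field_map: dict[str, str]) -> list[str]:
    """Format a mapping as 'old=new' strings, preserving insertion order."""
    return [f"{old}={new}" for old, new in field_map.items()]


# Hypothetical example content, not the real nextclade_field_map.tsv.
example_tsv = "# nextclade column\tmetadata column\nseqName\taccession\nclade\tclade\n"
print(" ".join(to_rename_args(read_field_map(example_tsv))))
# prints: seqName=accession clade=clade
```

With option 1 the mapping would already be a dict after loading the YAML config, so only the `to_rename_args` step would be needed; with option 2 the TSV parsing would live inside augur instead.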

joverlee521 commented 1 month ago

Option 1 looks cleaner in the context of the measles repo: https://github.com/nextstrain/measles/pull/52/commits/faebd646c3e2d19b0492523c760340df99493a22