nextstrain / lassa

Nextstrain build for Lassa virus
https://nextstrain.org/staging/lassa

Add consistency in pathogen input files #4

Closed j23414 closed 11 months ago

j23414 commented 1 year ago

Description of proposed changes

This commit addresses an inconsistency and moves toward standardizing the input file structure for various pathogen pipelines.

Previously, the lassa pipeline parsed a lassa_{l/s}.fasta file to generate sequences_{l/s}.fasta and metadata_{l/s}.tsv files. However, other pathogen pipelines (such as monkeypox, ncov, dengue, zika) directly accept a pair of sequences_{x}.fasta and metadata_{x}.tsv as input files. To align with this convention, this commit changes the lassa pipeline to take the same pair of files as input.

If lassa_{l/s}.fasta is being used in a live build, please post in this thread with the reasons it needs to remain backward compatible.

Related issue(s)

Testing

Commands for a manual check:

git clone https://github.com/nextstrain/lassa.git original
git clone https://github.com/nextstrain/lassa.git modified

cd original
cp -r example_data data
nextstrain build .
nextstrain view auspice

cd ../modified
git checkout consistent_inputs
cp -r example_data data
nextstrain build .
nextstrain view auspice
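
Beyond inspecting both builds by eye, the generated output of the two clones could also be diffed. This is a minimal sketch, assuming both builds above have finished and each wrote its results to an auspice/ directory; the compare_builds helper is hypothetical and not part of the repository.

```shell
# Hypothetical helper: report whether two build output directories differ.
# ASSUMPTION: each clone wrote its results to an auspice/ directory.
compare_builds() {
    original_dir="$1"
    modified_dir="$2"
    # diff -r walks both trees; -q prints only the names of files that differ.
    if diff -rq "$original_dir" "$modified_dir"; then
        echo "outputs match"
    else
        echo "outputs differ"
        return 1
    fi
}

# Usage, run from the directory that contains both clones:
# compare_builds original/auspice modified/auspice
```

Generated files may embed build dates or other run-specific values, so a reported difference is a prompt to inspect the files rather than proof of a regression.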

victorlin commented 11 months ago

@j23414 I'm assuming you used augur parse to make this change?

j23414 commented 11 months ago

Correct! I ran augur parse on the example_data/*.fasta to generate the new example data
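
For readers unfamiliar with the step, an invocation along these lines would split one combined FASTA into a sequences/metadata pair. It is only a sketch: the parse_segment wrapper is hypothetical, the output paths are assumed from the naming convention in the description, and the --fields list is illustrative and must match the field order actually encoded in the FASTA headers.

```shell
# Sketch: split one combined FASTA into a sequences/metadata pair.
# ASSUMPTION: the paths follow the naming used in this PR's description,
# and the header field list below is illustrative only.
parse_segment() {
    segment="$1"   # segment suffix, e.g. l or s
    augur parse \
        --sequences "example_data/lassa_${segment}.fasta" \
        --output-sequences "example_data/sequences_${segment}.fasta" \
        --output-metadata "example_data/metadata_${segment}.tsv" \
        --fields strain accession date country
}

# Usage:
# parse_segment l
# parse_segment s
```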