nextstrain / dengue

Nextstrain build for dengue virus
https://nextstrain.org/dengue

Use "accession" column as ID column #12

Closed j23414 closed 11 months ago

j23414 commented 11 months ago

Description of proposed changes

The main purpose of this commit is to identify records by the "accession" column, directly matching the changes in https://github.com/nextstrain/monkeypox/commit/927ad6cdf0f7e96384ab8a53f87aee7b5c4e658b
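
To illustrate why an accession tends to be a safer record ID than a strain name, here is a minimal sketch that counts how many values in a metadata column occur more than once. The temporary metadata file, its rows, and the `count_duplicate_ids` helper are hypothetical and are not taken from this repository or from the linked commit.

```shell
# Hypothetical metadata: accessions are unique, strain names are not.
metadata="$(mktemp)"
printf 'accession\tstrain\n'        >  "$metadata"
printf 'ACC0001\tDENV1/example\n'   >> "$metadata"
printf 'ACC0002\tDENV1/example\n'   >> "$metadata"
printf 'ACC0003\tDENV2/other\n'     >> "$metadata"

# Print the number of distinct values in the named column that appear
# on more than one row (0 means the column can serve as a unique ID).
count_duplicate_ids() {
  awk -F '\t' -v col="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
    idx     { seen[$idx]++ }
    END     { n = 0; for (v in seen) if (seen[v] > 1) n++; print n }
  ' "$2"
}

echo "duplicated accession values: $(count_duplicate_ids accession "$metadata")"
echo "duplicated strain values:    $(count_duplicate_ids strain "$metadata")"
```

In Augur-based workflows the ID column is usually selected with the `--metadata-id-columns` option. Whether this PR sets that exact option is an assumption, not something stated in the description.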

Additionally, the uncompressed sequence and metadata files are moved to the data folder instead of the results folder to align with the monkeypox and zika pipelines.

Checklist

git clone https://github.com/nextstrain/dengue.git
cd dengue
git checkout id_by_accession

nextstrain build \
  --aws-batch \
  --aws-batch-cpus 4 \
  --aws-batch-memory 7200 . --jobs 4

j23414 commented 11 months ago

Thanks @jameshadfield for the review!

> the "strain" column in the Dengue metadata needs improvement

Thanks for the context on how strain names affect Auspice, along with the console messages! Completely agree, yes, I'm hoping to discuss and address the "strain" column transformations in future PRs.

j23414 commented 11 months ago

After a quick check-in with @joverlee521, I'm going to merge this for the time being. I'll plan for subsequent issues and PRs to better address outstanding discussion points.