nextstrain / dengue

Nextstrain build for dengue virus
https://nextstrain.org/dengue

Establish some deduplication guidelines within the phylogenetic workflow #30

Open j23414 opened 7 months ago

j23414 commented 7 months ago

Context

Flagged by https://github.com/nextstrain/dengue/issues/28#issuecomment-1951297740 as well as prior historical discussions.

Design and implement some deduplication paths in the phylogenetic workflow.

Description

Examples

Possible solution

Preferably, use the tools already included in the Nextstrain Dockerfile; seqkit is a likely choice.
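As a starting point for discussion, here is a minimal Python sketch of one possible deduplication rule: keep the first record for each exact, case-insensitive sequence and report which later records were dropped. The rule is an assumption, since this issue has not yet settled on criteria, and the names `parse_fasta` and `deduplicate` are hypothetical, not part of the workflow. In practice `seqkit rmdup --by-seq` could probably fill this role without custom code.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the record ID.
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def deduplicate(
    records: Iterable[Tuple[str, str]],
) -> Tuple[List[Tuple[str, str]], Dict[str, List[str]]]:
    """Keep the first record per exact sequence (assumed rule, case-insensitive).

    Returns the kept records plus a mapping from each kept ID to the IDs
    of the later records that were dropped as its duplicates.
    """
    kept: List[Tuple[str, str]] = []
    dropped: Dict[str, List[str]] = {}
    first_seen: Dict[str, str] = {}  # sequence digest -> ID of first record
    for name, seq in records:
        digest = hashlib.md5(seq.upper().encode()).hexdigest()
        if digest in first_seen:
            dropped.setdefault(first_seen[digest], []).append(name)
        else:
            first_seen[digest] = name
            kept.append((name, seq))
    return kept, dropped
```

Returning the dropped IDs alongside the kept records makes it possible to audit which strains were collapsed, which matters if the retained record is not the one with the best metadata.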

j23414 commented 4 months ago

Flagging some duplicates from a Slack message here