nextstrain / avian-flu

Nextstrain build for avian influenza viruses
http://nextstrain.org/avian-flu

ingest: Join NCBI and Andersen lab data #42

Closed joverlee521 closed 3 months ago

joverlee521 commented 4 months ago

Follow up to https://github.com/nextstrain/avian-flu/pull/28 + https://github.com/nextstrain/avian-flu/pull/40

The Andersen lab will continue to update consensus sequences from SRA runs. The original submitters eventually upload these sequences to GenBank, at which point they become available through the NCBI data. Because the NCBI route lags behind, we can merge the Andersen lab data with the NCBI data to get the latest available sequences.

Both datasets include the SRA accession, so we can use it to deduplicate records.
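
A minimal sketch of the deduplication idea, not the actual ingest implementation. It assumes each record is a dict and that the SRA accession lives under a hypothetical `sra_accession` key. It also assumes the NCBI record should win when both sources share an accession, which this issue does not specify.

```python
def merge_by_sra_accession(ncbi_records, andersen_records, key="sra_accession"):
    """Merge two lists of record dicts, dropping Andersen lab records
    whose SRA accession has already been seen.

    Assumptions (hypothetical, not from the repo):
      - `key` names the SRA accession field in both datasets
      - NCBI records take priority over Andersen lab records
      - records without an accession cannot be matched, so they are kept
    """
    merged = list(ncbi_records)
    # Accessions already covered by NCBI; empty or missing values are ignored.
    seen = {record[key] for record in merged if record.get(key)}

    for record in andersen_records:
        accession = record.get(key)
        if accession and accession in seen:
            # Same SRA run is already present, so skip the duplicate.
            continue
        merged.append(record)
        if accession:
            seen.add(accession)

    return merged
```

Keeping records that lack an accession is a conservative choice: it avoids silently dropping data, at the cost of possible duplicates that would need another matching field to resolve.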