A pipeline that ingests SARS-CoV-2 (i.e. nCoV) data from GISAID and Genbank, transforms it, stores it on S3, and triggers Nextstrain nCoV rebuilds.
Make use of genbank's `purpose_of_sampling` field #415
Open
corneliusroemer opened 11 months ago
Context
It's possible to get the purpose of sampling from GenBank via `datasets summary virus genome taxon sars-cov-2`, see https://github.com/GenSpectrum/LAPIS/issues/328#issuecomment-1673639689

It would be nice if we parsed that field into our metadata.tsv so one can filter for baseline vs airport surveillance, for example.
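A rough sketch of what the parsing step could look like, assuming the summary output is available as one JSON record per line. The key names in `CANDIDATE_KEYS`, the helper names, and the output column names are assumptions for illustration, not taken from the existing pipeline; the real report schema would need to be checked first.

```python
import csv
import io
import json

# Assumed key names for the field in each record; verify against the
# actual report schema before relying on either of them.
CANDIDATE_KEYS = ("purpose_of_sampling", "purposeOfSampling")


def extract_purpose_of_sampling(jsonl_lines):
    """Yield (accession, purpose) pairs from JSON Lines records.

    Records without the field yield an empty string, so downstream
    filtering can treat "unknown" uniformly.
    """
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        purpose = next(
            (record[key] for key in CANDIDATE_KEYS if record.get(key)), ""
        )
        yield record.get("accession", ""), purpose


def to_tsv(pairs):
    """Render the pairs as a two-column TSV string (hypothetical columns)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["genbank_accession", "purpose_of_sampling"])
    writer.writerows(pairs)
    return out.getvalue()
```

The resulting column could then be joined onto metadata.tsv by accession, which would allow filtering such as baseline vs airport surveillance.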