artic-network / artic-ncov2019

ARTIC nanopore protocol for nCoV2019 novel coronavirus
Creative Commons Attribution 4.0 International

Test Datasets #69

Open kek12e opened 3 years ago

kek12e commented 3 years ago

I just posted this same question to the fieldbioinformatics issues board as well; I'm not sure which place is better to ask.

Hello,

I was wondering if you knew of any good publicly available datasets for ARTIC V3 tiling amplicon sequencing of hCoV-19. Ideally I would love to have a test dataset showing each of the variants of concern (UK, South Africa, and Brazil) along with the Wuhan strain.

I've tried looking through GISAID and SRA, but as far as I can tell GISAID only supplies preassembled genomes in a strange format, and I need the raw fast5 or fastq files. SRA, meanwhile, is very challenging to search for exactly the type of library/sequence set you need with enough metadata to inform the analysis.

I apologize if there are datasets somewhere already, but I'm somewhat frantically trying to figure out how to do this ARTIC analysis before teaching a course on it that begins on Monday. We sequenced synthetic saliva that contained RNA for the Wuhan strain and three variants of concern, but for some reason the variant calling is coming up with nothing, and I'm struggling to figure out if it's our data or if something is going wrong in the pipeline.

I would sincerely appreciate any help or a pointer to appropriate public datasets that used the ARTIC V3 tiling amplicon approach and have worked with the standard SOP bioinformatics pipeline.

Sincerely, Katie