LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Sample data #161

Closed by woodoo46 3 years ago

woodoo46 commented 3 years ago

Hi there,

It might be mentioned somewhere, but I am just curious what kind of data is inside the example file "http://s3.amazonaws.com/nanopore-human-wgs/bulkfile/PLSP57501_20170308_FNFAF14035_MN16458_sequencing_run_NOTT_Hum_wh1rs2_60428.fast5". For example, how many reads should we expect to see, how much coverage of the human genome does it give, and is it whole genome sequencing or targeted sequencing?

Thanks.

George

alexomics commented 3 years ago

This bulk FAST5 file is from a run for the "Nanopore sequencing and assembly of a human genome with ultra-long reads" project; it was un-targeted whole genome sequencing. From the data page for that project (flow cell FAF14035) I can see that the run generated 101,222 reads, but I think this bulk FAST5 file captured roughly 13,000 reads. I'm not sure what duration it captures, but MinKNOW will repeat the playback/simulation in a loop until the set run duration has elapsed. Because of that looping, coverage only builds up repeatedly over the regions seen in the file.
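
To make the looping point concrete, here is a rough back-of-the-envelope sketch. It is not part of readfish or MinKNOW; the function name `playback_estimate` and every input other than the roughly 13,000 reads mentioned above (file duration, run duration, mean read length, genome size) are assumptions you would need to fill in for your own setup.

```python
def playback_estimate(reads_in_file, file_duration_h, run_duration_h,
                      mean_read_len_bp, genome_size_bp=3.1e9):
    """Rough estimate of what a looped bulk-file playback yields.

    All arguments are user-supplied assumptions; none are read from
    the bulk FAST5 file itself.
    """
    if file_duration_h <= 0:
        raise ValueError("file_duration_h must be positive")

    # Playback restarts whenever the file ends, so the run covers
    # this many passes over the same recorded signal.
    loops = run_duration_h / file_duration_h

    # Reads emitted over the whole run, counting repeats.
    total_reads = reads_in_file * loops

    # Depth contributed by one pass; later passes replay the same
    # reads, so no new regions of the genome are added.
    unique_coverage = reads_in_file * mean_read_len_bp / genome_size_bp

    return {
        "loops": loops,
        "total_reads": total_reads,
        "unique_coverage": unique_coverage,
    }


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(playback_estimate(13_000, file_duration_h=6, run_duration_h=24,
                            mean_read_len_bp=10_000))
```

Whatever values you plug in, the total read count scales with the number of loops while the set of regions covered stays fixed by a single pass through the file.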