CDCgov / datasets-sars-cov-2

Benchmark datasets for WGS analysis of SARS-CoV-2. (https://peerj.com/articles/13821/)
Apache License 2.0

Made fastq-dump work properly for paired reads #14

Closed. BioWilko closed this 2 years ago

BioWilko commented 2 years ago

Changed fastq-dump --split-files to --split-3 and added logic to handle the files that --split-3 produces. Also removed the use of --gzip, since it is deprecated and can lead to corrupted gz files. Unpaired reads are currently thrown away, so it may be worth adding some handling for them.

Much slower unfortunately but that's the price for it actually working...
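
For anyone wanting to pick up the unpaired reads, here is a rough sketch in Python (not the code in this PR) of how the --split-3 output could be sorted out and compressed as a separate step. It assumes the usual naming, where paired reads land in <accession>_1.fastq and <accession>_2.fastq and reads without a mate land in <accession>.fastq. The function names classify_split3_outputs and gzip_file are made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def classify_split3_outputs(outdir, accession):
    """Report which files a `fastq-dump --split-3` run left in outdir.

    Assumes paired reads are in <accession>_1.fastq / <accession>_2.fastq
    and reads without a mate are in <accession>.fastq.
    """
    outdir = Path(outdir)
    r1 = outdir / f"{accession}_1.fastq"
    r2 = outdir / f"{accession}_2.fastq"
    single = outdir / f"{accession}.fastq"

    result = {"paired": None, "unpaired": None}
    # Only report a pair when both mates are present.
    if r1.exists() and r2.exists():
        result["paired"] = (r1, r2)
    # The unsuffixed file holds unpaired reads (or a single-end run).
    if single.exists():
        result["unpaired"] = single
    return result


def gzip_file(path):
    """Compress path to path + '.gz' as a separate step, then remove the original."""
    path = Path(path)
    gz_path = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return gz_path
```

A caller could run fastq-dump --split-3 first, then pass the output directory and accession to classify_split3_outputs and decide whether to keep, compress, or discard the unpaired file instead of dropping it silently.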

lskatz commented 2 years ago

Thank you for your input. I have incorporated this pull request into #17.