nh13 / DWGSIM

Whole Genome Simulator for Next-Generation Sequencing
GNU General Public License v2.0

Feature to create gzipped output files instead of plaintext #40

Closed (niemasd closed 7 years ago)

niemasd commented 7 years ago

I am trying to use DWGSIM to simulate multiple replicates of WGS Illumina reads at 30x coverage on hg19, which creates very large files. Because the output files are not gzipped, I waste disk space and cannot run as many simulations in parallel: I have to wait for the running ones to finish, delete everything I don't need, gzip .bwa.read1.fastq and .bwa.read2.fastq, and only then kick off more.
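For illustration, the manual cleanup step described above (gzip the two read files, then drop the plaintext copies) could be scripted roughly as follows. This is a minimal sketch, not part of DWGSIM: the function names and the `prefix` argument are hypothetical, and it assumes the read files are named `<prefix>.bwa.read1.fastq` and `<prefix>.bwa.read2.fastq`.

```python
import gzip
import os
import shutil


def gzip_and_remove(path, compresslevel=6):
    """Compress `path` to `path + '.gz'`, then delete the plaintext original."""
    gz_path = path + ".gz"
    # Stream in chunks so a multi-gigabyte FASTQ is never held in memory.
    with open(path, "rb") as src, gzip.open(gz_path, "wb", compresslevel=compresslevel) as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def compress_dwgsim_reads(prefix):
    """Gzip the two read files named in the issue for one output prefix.

    `prefix` is a hypothetical output prefix; files that do not exist are skipped.
    """
    compressed = []
    for suffix in (".bwa.read1.fastq", ".bwa.read2.fastq"):
        path = prefix + suffix
        if os.path.exists(path):
            compressed.append(gzip_and_remove(path))
    return compressed
```

Calling `compress_dwgsim_reads("rep1")` after a replicate finishes would leave only `rep1.bwa.read1.fastq.gz` and `rep1.bwa.read2.fastq.gz` for those two files. The plaintext files still have to exist on disk first, which is the limitation this request is about.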

It would be a nice feature to be able to create gzipped output files instead of plaintext. That would save a lot of space, especially given that most people work with gzipped FASTQ files anyway.

nh13 commented 7 years ago

I am making DWGSIM write the FASTQ output in gzip format by default, as I don't see any harm in that.
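Scripts that consumed the old plaintext output may need to accept both forms after this change. The sketch below is one way to do that, under the assumptions that records are standard four-line FASTQ and that gzip output can be recognized by its two magic bytes. The helper names are hypothetical and nothing here is taken from the DWGSIM code.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file for text reading, whether or not it is gzipped."""
    # Gzip streams start with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "r")


def count_reads(path):
    """Count records, assuming four lines per FASTQ record."""
    with open_fastq(path) as handle:
        lines = sum(1 for _ in handle)
    return lines // 4
```

Checking the file content rather than the file name means the same code works regardless of how the output file is named.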

niemasd commented 7 years ago

Wow, you're awesome! Thanks so much for getting to these so quickly! And great work with the tool!