mitoNGS / MToolBox

A bioinformatics pipeline to analyze mtDNA from NGS data
http://sourceforge.net/projects/mtoolbox/?source=navbar
GNU General Public License v3.0

.fastq files stay uncompressed for a long time if input is bam #78

Open dwuab opened 5 years ago

dwuab commented 5 years ago

I don't think this is a bug, but the default behavior could be problematic. If the input files are in .bam format, they are converted into uncompressed .fastq files (see bam_input() in MToolBox.sh). Those .fastq files are compressed only after all .bam files have been converted (see fastq_input() in MToolBox.sh). So during conversion, the output directory fills up with uncompressed .fastq files, which can take up a ridiculous amount of space if you have lots of input samples. I think each .fastq file should be compressed right after it is created.
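To illustrate the idea, here is a minimal bash sketch of compressing each .fastq as soon as it is written. It is not the actual code of bam_input() or fastq_input(). The function names convert_and_compress and process_all_bams are hypothetical, and the converter is passed in as a placeholder command assumed to be callable as `converter in.bam out.fastq` and to write a single .fastq per .bam.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of MToolBox.sh.

# Convert one .bam file to .fastq, then gzip the result immediately.
# $1: converter command (assumed interface: converter in.bam out.fastq)
# $2: path to the .bam file
convert_and_compress() {
    local convert_cmd="$1"
    local bam="$2"
    local fastq="${bam%.bam}.fastq"

    # Stop here if the conversion fails, so a partial file is never gzipped.
    "$convert_cmd" "$bam" "$fastq" || return 1

    # Compress right away; gzip replaces sample.fastq with sample.fastq.gz.
    gzip -f "$fastq"
}

# Process every .bam in a directory, one at a time, so that at most one
# uncompressed .fastq exists on disk at any moment.
# $1: converter command, $2: directory containing the .bam files
process_all_bams() {
    local convert_cmd="$1"
    local dir="$2"
    local bam

    for bam in "$dir"/*.bam; do
        [ -e "$bam" ] || continue   # the glob matched nothing
        convert_and_compress "$convert_cmd" "$bam" || return 1
    done
}
```

With this layout, peak disk usage is roughly one uncompressed .fastq plus the already compressed ones, instead of all uncompressed files at once. A paired-end conversion that writes several .fastq files per .bam would need the gzip step extended to each output.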

domenico-simone commented 5 years ago

Hi,

Thanks for this remark; this is a possible improvement for the next update.

Domenico

dwuab commented 5 years ago

Thanks for your reply! I also noticed that the final results contain uncompressed .sam and .fastq files. It would be great if those were compressed as soon as they are generated.
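Until the pipeline does this itself, a stopgap is to compress the leftover files after a run. The sketch below assumes the results live under a single output directory. The function name compress_outputs is hypothetical, and plain gzip is used for .sam files only for simplicity.

```shell
#!/usr/bin/env bash
# Sketch only: a post-run cleanup, not part of MToolBox.

# Gzip every uncompressed .sam and .fastq file found under the given directory.
# $1: output directory to scan (recursively)
compress_outputs() {
    local outdir="$1"
    find "$outdir" -type f \( -name '*.sam' -o -name '*.fastq' \) -exec gzip -f {} +
}
```

For example, `compress_outputs /path/to/output` would turn sample.sam into sample.sam.gz and leave other file types untouched.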