VDBWRAIR / ngs_mapper

Genome Mapping Pipeline
GNU General Public License v2.0

what is ngs_filter doing? #204

Closed: necrolyte2 closed this issue 8 years ago

necrolyte2 commented 8 years ago

I know what it can do, but the stage takes a long time to run even when nothing is selected for it to do.

Not a huge deal, but from an outsider's perspective it is confusing.

averagehat commented 8 years ago

I changed ngs_filter to simply copy the file over when the settings are such that no filtering would be done. Is this what you're looking for?
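
For anyone reading along, the short-circuit described above can be sketched roughly as follows. This is only an illustration, not the code from the repository: `filter_or_copy`, `drop_ns`, `min_avg_qual`, and `filter_fn` are hypothetical names chosen for the example.

```python
import shutil


def filter_or_copy(src, dst, drop_ns=False, min_avg_qual=0, filter_fn=None):
    """Copy src to dst untouched when no filter option is active.

    All parameter names here are hypothetical. filter_fn stands in for
    whatever routine does the real filtering and is only called when at
    least one option would actually remove reads.
    """
    nothing_to_do = not drop_ns and min_avg_qual <= 0
    if nothing_to_do:
        # Plain byte copy: no parsing, so CPU and memory cost stay minimal.
        shutil.copyfile(src, dst)
        return "copied"
    filter_fn(src, dst, drop_ns, min_avg_qual)
    return "filtered"
```

With the defaults the call returns `"copied"` and the output is byte-identical to the input; the filtering routine is never invoked.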

necrolyte2 commented 8 years ago

I think we should investigate that stage some more. I saw very high CPU and memory usage while that stage was running, which is what concerned me. It also added about 5-10 minutes to the analysis.

I'm wondering if we should consider somehow combining the filter and trim output; otherwise you end up with input fastq + filter fastq + trim fastq + bam.

Essentially 4x the data size.

averagehat commented 8 years ago

The code uses lists rather than generators for simplicity. https://github.com/VDBWRAIR/ngs_mapper/blob/410455741d5849fdad517c41174320adb8b38fc7/ngs_mapper/nfilter.py#L141-L158
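
For comparison, a generator-based version would stream one record at a time instead of holding the whole file in a list. The sketch below is not the code at the link above; it assumes plain 4-line FASTQ records with Phred+33 qualities, and `read_fastq`, `mean_quality`, `keep`, `filter_fastq`, `drop_ns`, and `min_avg_qual` are all hypothetical names for illustration.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, seq, qual) lazily; assumes 4-line FASTQ records."""
    while True:
        lines = list(islice(handle, 4))
        if len(lines) < 4:
            return  # end of file (or a truncated trailing record)
        header, seq, _plus, qual = (line.rstrip("\n") for line in lines)
        yield header, seq, qual


def mean_quality(qual, offset=33):
    """Average Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual) if qual else 0.0


def keep(seq, qual, drop_ns, min_avg_qual):
    """Hypothetical filter rule: optionally drop reads with N, then check quality."""
    if drop_ns and "N" in seq.upper():
        return False
    return mean_quality(qual) >= min_avg_qual


def filter_fastq(in_handle, out_handle, drop_ns=False, min_avg_qual=0):
    """Stream records from in_handle to out_handle; return the number kept."""
    kept = 0
    for header, seq, qual in read_fastq(in_handle):
        if keep(seq, qual, drop_ns, min_avg_qual):
            out_handle.write(f"{header}\n{seq}\n+\n{qual}\n")
            kept += 1
    return kept
```

Only one record is alive at any moment, so memory use stays flat regardless of file size. The trade-off is that anything needing a second pass over the reads (counts before writing, for example) has to be tracked while streaming.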