itmat / Normalization

RNA-Seq normalization and quantification pipeline
https://github.com/itmat/normalization/wiki
GNU General Public License v3.0

runall_sam2mappingstats.pl runs out of memory #19

Closed: loganjeverett closed this issue 10 years ago

loganjeverett commented 10 years ago

On larger SAM files (>50GB), the sam2mappingstats.pl script runs out of memory on the PMACS cluster (-bsub option). This happens because runall_sam2mappingstats.pl is hardcoded to submit jobs to max_mem30, but larger SAM files need more memory. runall_sam2mappingstats.pl needs either to always submit to max_mem64 or to provide an option for selecting the queue that jobs are submitted to. Thanks, -Logan
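The queue-selection idea in this report can be sketched in shell. This is only an illustration, not the pipeline's code: the function name pick_queue and the variable THRESHOLD_BYTES are made up here, and the 50 GB cutoff is simply the file size mentioned in the report, not a measured limit.

```shell
#!/bin/sh
# Hypothetical helper: choose an LSF queue from the SAM file size.
# ASSUMPTION: 50 GB is taken from the report above, not from profiling.
THRESHOLD_BYTES=$((50 * 1024 * 1024 * 1024))

pick_queue() {
    # $1 = SAM file size in bytes
    if [ "$1" -gt "$THRESHOLD_BYTES" ]; then
        echo "max_mem64"
    else
        echo "max_mem30"
    fi
}
```

A wrapper could then pass the result to bsub with -q "$(pick_queue "$size")" instead of a fixed queue name, where size holds the file size in bytes.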

eunjijunekim commented 10 years ago

Thank you for reporting this problem. We'll fix it and let you know when the script is updated. Eun Ji

eunjijunekim commented 10 years ago

Hi Logan, I just updated the sam2mappingstats.pl script. The new script shouldn't need as much memory.

Can you please git pull the changes (sam2mappingstats.pl), test it on your large SAM file using max_mem30, and let me know whether it still runs out of memory?

You can run something like this:

bsub -J -o test.out -e test.err -q max_mem30 perl /path/to/Normalization/norm_scripts/sam2mappingstats.pl
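One way to tell whether the test job died at the memory limit is to look in the file passed to -o. The following is a minimal sketch, assuming the cluster's LSF installation writes the termination reason TERM_MEMLIMIT into that report when it kills a job for exceeding its memory limit. The function name killed_for_memory is made up for illustration.

```shell
#!/bin/sh
# Hypothetical check: succeeds (exit status 0) if the LSF report file
# given as $1 mentions the TERM_MEMLIMIT termination reason.
# ASSUMPTION: the cluster is configured to record that reason in the -o file.
killed_for_memory() {
    grep -q "TERM_MEMLIMIT" "$1"
}
```

For the command above, this would be called as killed_for_memory test.out once the job has finished.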

Eun Ji

loganjeverett commented 10 years ago

Hi Eun Ji,

Thanks for looking into this. The script still runs out of memory on the max_mem30 queue. Based on the log file, it is still crashing early in the processing of U IDs. The last line of output in the log file was:

processed 7000000 U IDs Thu Apr 24 10:31:37 EDT 2014

When I ran the same script on the max_mem64 queue, however, it completed successfully, and the log file shows that it got through at least 165000000 U IDs. -Logan
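To compare how far two runs got, the last progress count can be pulled out of each log. This is a rough sketch that assumes the log contains lines of the form "processed N U IDs" as quoted above; the function name last_u_id_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print N from the last "processed N U IDs" line
# in the log file given as $1. Prints nothing if no such line exists.
last_u_id_count() {
    sed -n 's/.*processed \([0-9][0-9]*\) U IDs.*/\1/p' "$1" | tail -n 1
}
```

Running it on the max_mem30 log and on the max_mem64 log gives the two numbers to compare directly.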