tjparnell / biotoolbox

Tools for querying and analysis of genomic data
http://tjparnell.github.io/biotoolbox/
Artistic License 2.0

Memory Usage with bam2wig.pl #4

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
Hi Tim,

What steps will reproduce the problem?

I'm running bam2wig.pl to convert .bam files to .wig/.bw files in one step.
I've had success using the quick --coverage option, but I run into problems
when I create bins rather than stepping at single-bp resolution. Enabling the
--bin option seems to balloon memory usage until it consumes all or nearly all
of the memory available on my lab's server (~24GB), and the run typically
fails. I've noticed that the process_coverage subroutines appear to have a
"dump" function that doesn't seem to be present in the process_alignment
subroutine used when the --bin option is activated. Could this have something
to do with it? Here is an example of a call that failed:

perl ~/source/biotoolbox/scripts/bam2wig.pl --in some.bam --position start --bin 50 --bw

 This program will convert bam alignments to enumerated wig data
 recording start positions
 Forking into 2 children for parallel conversion
Out of memory!

What version of the product are you using? On what operating system?

I'm working on a linux system but have had similar problems using MacOSX.
------
jeff@argus:~$ perl ~/source/biotoolbox/scripts/bam2wig.pl --version

 This program will convert bam alignments to enumerated wig data
 Biotoolbox script bam2wig.pl, version 1.12.1

Thanks for your help!

Jeff

Original issue reported on code.google.com by jeff.m.a...@gmail.com on 21 Jul 2013 at 10:39

GoogleCodeExporter commented 9 years ago
Thank you for bringing this to my attention. Positions were not being deleted
from memory after they were written, which led to the excessive memory usage.
I have submitted a new version of bam2wig.pl (version 1.12.3) that fixes this
issue. You can check out SVN release 645 to obtain it.

Original comment by parnell...@gmail.com on 24 Jul 2013 at 4:07
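For readers following along, here is a minimal sketch of the general idea behind the fix described above: once a bin has been written, delete it from the in-memory buffer so the buffer stays small. This is not the actual bam2wig.pl code (which is Perl); the function name `binned_start_counts`, the `flush_every` parameter, and the assumption of coordinate-sorted, 0-based start positions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def binned_start_counts(starts, bin_size, flush_every=1000):
    """Yield (bin_start, count) pairs from coordinate-sorted start positions.

    Assumes `starts` is sorted ascending and 0-based. Completed bins are
    emitted and removed from the buffer, so memory use stays bounded.
    """
    counts = defaultdict(int)
    seen = 0
    for pos in starts:
        current_bin = (pos // bin_size) * bin_size
        counts[current_bin] += 1
        seen += 1
        if seen % flush_every == 0:
            # With sorted input, bins before the current one can no longer
            # change: emit them and delete them from the buffer.
            for b in sorted(k for k in counts if k < current_bin):
                yield b, counts.pop(b)
    # Emit whatever is left once the input is exhausted.
    for b in sorted(counts):
        yield b, counts[b]
```

Without the `counts.pop(b)` step, every bin ever seen would stay in the dictionary for the whole run, which is the kind of growth reported in this issue.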

GoogleCodeExporter commented 9 years ago
Hi Tim,

This fixed the problem, but now I get errors when trying to convert the wiggle 
to a bigWig.  It looks like the script isn't recognizing the end of the 
chromosome and is counting many additional bins beyond the chromosome.
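One way to picture the symptom described above: if bins are generated without checking the chromosome length, the last intervals run past the chromosome end. The sketch below shows how bins could be clipped to a known length. It is illustrative only and does not reflect how bam2wig.pl handles this; `clip_bins`, its parameters, and the 0-based half-open coordinates are assumptions made for the example.

```python
def clip_bins(bins, bin_size, chrom_length):
    """Yield (start, end, value) intervals clipped to the chromosome length.

    `bins` is an iterable of (bin_start, value) pairs. Coordinates are
    0-based and half-open (an assumption made for this sketch).
    """
    for start, value in bins:
        if start >= chrom_length:
            # The whole bin lies past the chromosome end: drop it.
            continue
        # Truncate the final bin so it stops at the chromosome end.
        end = min(start + bin_size, chrom_length)
        yield start, end, value
```

For a chromosome of length 110 and 25 bp bins, a bin starting at 100 would be shortened to end at 110, and a bin starting at 125 would be dropped.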

What steps will reproduce the problem?
perl ~/source/biotoolbox/scripts/bam2wig.pl --in input.bam --shift --rpm --out outputbam_25bin --bw --bin 25

Original comment by jeff.m.a...@gmail.com on 31 Jul 2013 at 6:27