Closed by johanneskoester 6 years ago
I'm quite sure it is a memory leak. Memory usage is growing monotonically.
I take that back. I have tested qtip-parse with valgrind. There is no memory leak:
==11741==
==11741== HEAP SUMMARY:
==11741== in use at exit: 0 bytes in 0 blocks
==11741== total heap usage: 252 allocs, 252 frees, 185,753 bytes allocated
==11741==
==11741== All heap blocks were freed -- no leaks are possible
==11741==
==11741== For counts of detected and suppressed errors, rerun with: -v
==11741== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Maybe it has just been something temporary on my server. I will close this for now.
Ok, I had a closer look. According to valgrind, there are no memory leaks. However, qtip-parse is clearly killed because it uses the entire system memory (250GB!!):
[62697.863753] Out of memory: Kill process 27262 (qtip-parse) score 981 or sacrifice child
[62697.863798] Killed process 27262 (qtip-parse) total-vm:267307312kB, anon-rss:258564188kB, file-rss:0kB, shmem-rss:0kB
This is quite urgent for me, and I am happy to fix the code if you could give me a pointer to where that much memory is used in qtip-parse.
I think the template vectors are the problem. With huge BAM files, there are a lot of them. @BenLangmead I see two solutions:
It seems like this can already be controlled with --input-model-size? I will try this out.
This is very odd. qtip-parse was designed to avoid having any large in-memory data structures. While it's true that the number of tandem reads has to grow with the input size, that should only result in an increased disk footprint, not memory footprint (and even that increase is sublinear in the number of input reads). All the containers in qtip-parse, the input-model reservoir samplers being the most significant, have hard ceilings set to low values by default. I've done many, many tests and not seen this behavior. I'll let you know if I think of anything, but any details about which data structures are responsible would be helpful. E.g., using a debugger to catch memory allocation failures is one quick thing to try.
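For context, a reservoir sampler with a hard capacity ceiling keeps memory flat no matter how many reads are offered to it. This is a minimal sketch of that idea (Algorithm R with a cap), written for illustration only; it is not qtip's actual implementation:

```python
import random

class ReservoirSampler:
    """Keep a uniform random sample of at most `capacity` items.

    Memory use is bounded by `capacity` regardless of how many items
    are offered, which is why a hard ceiling keeps the footprint flat.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.seen = 0          # total items offered so far
        self.sample = []       # the bounded reservoir
        self.rng = random.Random(seed)

    def offer(self, item):
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(item)
        else:
            # Replace an existing entry with probability capacity/seen,
            # so every item offered so far is equally likely to survive.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = item

# Offer a million items; memory stays bounded by the capacity.
sampler = ReservoirSampler(capacity=100, seed=1)
for read in range(1_000_000):
    sampler.offer(read)
print(len(sampler.sample))  # 100, never more
```

If the capacity defaulted to a huge value instead of a low one, the "sample" would simply accumulate most of the input, which matches the growth described in this thread.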
I think I have discovered the problem. The problem is the parameter input_model_size. input_model_size remains at the maximum (far too large) size if the argument is not set at the command line. qtip should always set input-model-size when invoking qtip-parse. If you're running qtip-parse directly, then that's the problem. It's only designed to be run directly by qtip.
No, I did not run it directly. As I have outlined above, it seems like the argument is not passed through to qtip-parse.
Could you also send the full output using --verbose when calling qtip?
It looks like qtip is failing to pass the "passthrough args" to the utility programs, which might speak to a failure in _get_passthrough_args.
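For illustration, here is a hypothetical sketch of how a driver can assemble pass-through options for a child tool from its own parsed arguments. The function name, signature, and skip-if-unset behavior are assumptions for the sketch, not qtip's actual _get_passthrough_args:

```python
def get_passthrough_args(args, passthrough_keys):
    """Turn selected (key, value) pairs from a parsed-args dict into a
    flat ['--key', 'value', ...] list, skipping unset entries."""
    out = []
    for key in passthrough_keys:
        val = args.get(key)
        if val is None:
            # An unset option silently drops out here -- one way a child
            # tool could end up running with its own (huge) default.
            continue
        out.append('--' + key.replace('_', '-'))
        out.append(str(val))
    return out

print(get_passthrough_args(
    {'input_model_size': 30000, 'wiggle': 30, 'seed': None},
    ['wiggle', 'input_model_size', 'seed']))
# ['--wiggle', '30', '--input-model-size', '30000']
```

The sketch shows why a bug in this step is dangerous: if the list comes back empty, the child falls back to its own defaults without any error.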
Here is the entire log. No --verbose though, because I cannot computationally afford to rerun this again. simulated.tumor.hg18.log
Ah. The message(s) I was hoping to see (or not see) are printed only when DEBUG-level logging is enabled via --verbose: https://github.com/BenLangmead/qtip/blob/master/qtip#L363. I will make a note to try to reproduce this locally, but it might not be until semester/teaching finishes in a couple of weeks. If you ever have a chance to do a less expensive run (maybe on a subset of the data), then please try with --verbose and send it.
As an example, here's what the INFO message about running qtip-parse looks like when I run it locally on the test data:
11/28/17-08:07:00:INFO: running "/Users/langmead/git/qtip/qtip-parse ifs -- wiggle 30 input-model-size 30000 max-allowed-fraglen 100000 sim-factor 45.0 sim-function sqrt sim-unp-min 30000 sim-conc-min 30000 sim-disc-min 10000 sim-bad-end-min 10000 seed 1712975108 -- full_e2e/input.sam -- lambda_virus.fa -- full_e2e/input_intermediates -- full_e2e/tandem_intermediates"
Everything between the first and second -- is missing from your run. Those are the "pass-through" arguments, where qtip passes along some of its own args.
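A small sketch of how a --‑delimited argv like the one in the log above can be split into sections, so the pass-through block can be inspected on its own. This is a hypothetical helper for illustration, not qtip-parse's actual parser:

```python
def split_on_separators(argv, sep='--'):
    """Split an argv list into sections delimited by `sep` tokens."""
    sections, cur = [], []
    for tok in argv:
        if tok == sep:
            sections.append(cur)
            cur = []
        else:
            cur.append(tok)
    sections.append(cur)  # trailing section after the last separator
    return sections

# Shortened stand-in for the logged command line.
argv = ['ifs', '--', 'wiggle', '30', '--', 'input.sam']
print(split_on_separators(argv))
# [['ifs'], ['wiggle', '30'], ['input.sam']]
```

With this framing, the bug report reads as: the second section (between the first and second --) came back empty.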
Ok, you have convinced me :-). Here is the output for a very small BAM: test.normal.hg18.log
The corresponding command was (with --verbose):
qtip --bwa-exe 'resources/bwa mem -Y -R "@RG\\tID:normal\\tSM:normal" -t 1' --output-directory mapped-qtip/test.normal.hg18 --temp-directory mapped-qtip --verbose --aligner bwa-mem --m1 reads/test.normal.1.fastq --m2 reads/test.normal.2.fastq --index index/hg18/genome --ref /export/scratch2/koster/data/ref/hg18.fasta
Thanks for the fix!
I managed to run qtip with BWA (see my PR). But now I get this error:
The BAMs look ok (in terms of ZT:Z). However, exit code 9 usually means the process was killed due to an out-of-memory condition. Do you have numbers on how much memory qtip-parse would need? Or might it have a memory leak?
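One way to tell an OOM kill apart from an ordinary failure: a process killed by the kernel's OOM killer dies from SIGKILL (signal 9), which Python's subprocess module reports as a negative return code. A minimal sketch, using a stand-in child that kills itself with SIGKILL rather than actually exhausting memory:

```python
import signal
import subprocess

# Stand-in child: kills itself with SIGKILL, mimicking what the
# kernel OOM killer does to a memory-hungry process.
proc = subprocess.run(['sh', '-c', 'kill -KILL $$'])

if proc.returncode < 0 and -proc.returncode == signal.SIGKILL:
    print('child died from SIGKILL (signal 9) -- consistent with an OOM kill')
```

On the server side, `dmesg` output like the "Out of memory: Kill process" lines quoted earlier in this thread is the other telltale sign.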