Open jonn-smith opened 1 year ago
Well, it looks like the offending line is this:

```java
mLogLikelihoodArray = new double[readListSize * numHaplotypes]; //to store results
```

I don't like that those two numbers are overflowing... `numHaplotypes` should really be getting bounded to ~256, and `readListSize` should similarly be capped by the downsampling. This warrants investigation.
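To see why the allocation fails, here is a minimal standalone sketch (not GATK's actual code; the values and the `safeProduct` helper are hypothetical): the product of two `int`s is computed in 32-bit arithmetic before any widening, so a large `readListSize * numHaplotypes` silently wraps negative and the `new double[...]` allocation blows up.

```java
public class LikelihoodArraySizeDemo {
    // Hypothetical guard: compute the array length, failing fast on overflow.
    static int safeProduct(int readListSize, int numHaplotypes) {
        // Math.multiplyExact throws ArithmeticException instead of wrapping.
        return Math.multiplyExact(readListSize, numHaplotypes);
    }

    public static void main(String[] args) {
        int readListSize = 25_000;
        int numHaplotypes = 86_000;  // enough to exceed Integer.MAX_VALUE

        // Plain int multiplication wraps around silently to a negative value.
        int wrapped = readListSize * numHaplotypes;
        assert wrapped < 0 : "expected 32-bit wraparound";

        // Widening before multiplying shows the true size.
        long actual = (long) readListSize * numHaplotypes;
        assert actual > Integer.MAX_VALUE;

        // The guarded version surfaces the problem as an exception
        // instead of a NegativeArraySizeException at allocation time.
        boolean threw = false;
        try {
            safeProduct(readListSize, numHaplotypes);
        } catch (ArithmeticException e) {
            threw = true;
        }
        assert threw;
        System.out.println("ok");
    }
}
```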
@jamesemery @jonn-smith So the product of the number of reads and the number of haplotypes is exceeding `Integer.MAX_VALUE` (2,147,483,647). If we assume that the number of haplotypes is bounded at 256, this implies at least ~8,388,607 reads in the region. But `--max-assembly-region-size` defaults to 300, with 100 bases of padding on each side, so 500 bases total per region, and `--max-reads-per-alignment-start` defaults to 50, giving a theoretical maximum of 25,000 reads per region. If we instead assume that the reads are indeed capped at 25k but the haplotypes are not, it would take about ~86k haplotypes to produce the overflow.
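The bounds above can be checked with a few lines of arithmetic (the cap values come from the discussion in this thread, not from the GATK source):

```java
public class OverflowBounds {
    public static void main(String[] args) {
        long intMax = Integer.MAX_VALUE;               // 2_147_483_647

        // With haplotypes capped at 256: smallest read count forcing overflow.
        long readsNeeded = intMax / 256 + 1;           // ~8.39 million
        assert readsNeeded == 8_388_608;

        // Theoretical read cap: (300 + 2 * 100) bases of region,
        // times 50 reads per alignment start.
        long maxReads = (300 + 2 * 100) * 50L;
        assert maxReads == 25_000;

        // With reads capped at 25k: smallest haplotype count forcing overflow.
        long haplotypesNeeded = intMax / maxReads + 1; // ~86k
        assert haplotypesNeeded == 85_900;

        System.out.println(readsNeeded + " reads or " + haplotypesNeeded + " haplotypes");
    }
}
```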
Not sure what might be different about the `--linked-de-bruijn-graph` codepath that could produce such an explosion of either reads or haplotypes...
Once again the limits of using natural numbers for array bounds crop up. When will Java expand to cover even integer-valued arrays? I can't wait for Project Euler, which introduces complex array bounds as part of the expanded type system.
I'm working on some Plasmodium falciparum callsets in GATK and I have come across a curious error:
This run did not complete successfully - the Exception caused it to fail prematurely.
Previously I had seen HaplotypeCaller run out of memory and fail in almost as much time, so I think this and the OOM error are related. The only difference in invocation was that with the OOM failure, I was running with the default for `--max-reads-per-alignment-start` (50). This also works just fine with that setting at 15. The failure seems to occur around the same place in the data each time (the end of chr13). At that point in the data there is a very large pileup, which is probably instigating this. Additionally, if I remove the `--linked-de-bruijn-graph` argument, this runs just fine with the default setting of `--max-reads-per-alignment-start`.

I have a minimal dataset that I can share which reproduces the OOM error for sure (I'm 99% sure it reproduces this one as well). For the OOM failures, the final logs from HaplotypeCaller look like this:
Here is my command-line invocation: