Seg fault in initPriors during genotyping

chapmanb commented 9 years ago

Zev; I'm running wham on some whole genome data and getting consistent segmentation faults on the genotyping step. The backtrace from the core dumps looks like:

Core was generated by `WHAM-GRAPHENING -b
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000422a7c in alignHMM::initPriors(std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ()
#1  0x0000000000418341 in genotype(std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, breakpoints*) ()
#2  0x0000000000419157 in main._omp_fn.5 ()
#3  0x000000000040539d in main ()

I'm a bit stuck about how to debug. Do you have any suggestions about what might be causing this? Thanks much.

zeeev commented 9 years ago

Yikes! That's no good. Would it be hard to create a small bam file that generates this error?

chapmanb commented 9 years ago

Zev -- I can definitely work on that. I'll try to compile a version of wham with debug information and see if I can identify where it's happening. Reading the code there, it looks like it's looping through the BAM file updating priors for the HMMs, is that right? Are there assumptions of consistent read length for the prior matrix? At this point I'm just trying to think of practical ways to reduce the problem for a proper report. Thanks again.

zeeev commented 9 years ago

The first call to the class that is causing the segfault is here: https://github.com/jewmanchue/wham/blob/master/src/bin/graph-er.cpp#L4021

Its calling this function: https://github.com/jewmanchue/wham/blob/master/src/lib/alignHMM.h#L90

This loops over the read and fills the probability matrix.

When you are in gdb if you type "frame 0" it should take you to the exact line.

Without knowing the exact line, I would guess that there is a variable read length? I only initialize the class once and it is assuming that the read lengths are the same.

chapmanb commented 8 years ago

Zev -- that's right on, there are variable length reads in these datasets. I couldn't get gdb to give me the exact line with these core dumps but it sounds like we've identified the underlying issue. Is it possible to support variable read lengths here or should I be skipping genotyping in these cases?

zeeev commented 8 years ago

I don't have a test set for trimmed reads, but the changes i pushed should work. They don't break anything for NA12878.

Let me know if this opens up any other problems.

--Zev

zeeev commented 8 years ago

Was the issue resolved for you?

P.S. - paper is out.

chapmanb commented 8 years ago

Zev -- congrats on the paper. It looks great, and thank you for citing bcbio.

Sorry I haven't followed up on this. Currently your trimming fix works great but I'm stuck with a sampling issue on hg19 and hadn't yet worked around this in bcbio because I wasn't sure of the right way. The issue with hg19 is that chrM is first (relative to GRCh37, where it comes after X and Y) so using the -u 20 workaround also sampled it, leading to a biased distribution of insert sizes and depth.

Is the best approach to use -e all,of,the,non,standard,chromosomes? This is do-able but does get a little unwieldy with hg38 where you have a ton of alts. Do you think it would make sense to have a -i argument that specifies the chromosomes to sample for the size distribution since that should be smaller than all the excluded chromosomes.

Congrats again on the paper and thanks for all the help with getting WHAM going.

zeeev commented 8 years ago

I will implement -i.

I wouldn't worry too much about a biased insert size distribution. For humans -u 20 is very safe. I've tested this for several human populations.

--Zev

zeeev commented 8 years ago

Brad,

I've added the include flag (-c). You can use -e and -c together. If something is on the -e list it will not be included in the -c list (although it prints to STDERR).

Let me know if you run into any other problems. Thanks, as always, for your helpful suggestions.

zeeev / wham

Seg fault in initPriors during genotyping #17