Griffan / VerifyBamID

VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.
http://griffan.github.io/VerifyBamID/

add segfault handler for debugging #65

Closed Griffan closed 3 months ago

Griffan commented 4 months ago

This PR aims to resolve the segfault reported by multiple users in issue https://github.com/Griffan/VerifyBamID/issues/45. To trace the error without exchanging test datasets, I added a simple debugging mode that prints a backtrace at the crash site.

Debugging Mode

If you encounter abnormal errors, e.g. "Segmentation fault", you can recompile in debugging mode:

cmake .. -DCMAKE_BUILD_TYPE=Debug
make

Then rerun your command to produce the backtrace message and post it to the issues page. For example:

Stack trace (most recent call last):
#6    Object "VerifyBamID", at 0x10ad9d822, in main + 466
#5    Object "VerifyBamID", at 0x10ad9b53d, in execute(int, char**) + 8509
#4    Object "VerifyBamID", at 0x10adc70c7, in ContaminationEstimator::OptimizeLLK(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 903
#3    Object "VerifyBamID", at 0x10adc9408, in ContaminationEstimator::OptimizeHeter(AmoebaMinimizer&) + 1560
#2    Object "libsystem_platform.dylib", at 0x7ff8182f0dfc, in _sigtramp + 28
#1    Object "VerifyBamID", at 0x10ad9e3ed, in backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 13
#0    Object "VerifyBamID", at 0x10ad9e456, in backward::SignalHandling::handleSignal(int, __siginfo*, void*) + 70
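
For readers unfamiliar with how such a handler produces a trace like the one above, here is a minimal sketch of the general technique. It is not the PR's code: the frames above show that the PR relies on the backward-cpp library (`backward::SignalHandling`), whereas this sketch swaps in the plain `backtrace` / `backtrace_symbols_fd` calls from `<execinfo.h>` (available on glibc and macOS). The helper names `print_backtrace`, `on_fatal_signal`, and `install_crash_handler` are hypothetical.

```cpp
// Sketch only: a crash handler built on <execinfo.h>, NOT the backward-cpp
// handler used by the PR. Assumes glibc or macOS.
#include <csignal>
#include <execinfo.h>
#include <unistd.h>

// Hypothetical helper: write the current call stack to stderr and
// return the number of frames captured.
int print_backtrace() {
    void* frames[64];
    int n = backtrace(frames, 64);
    // backtrace_symbols_fd writes straight to the file descriptor instead of
    // returning malloc'ed strings, which is preferable inside a signal handler.
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}

// Signal handler: print the stack, then restore the default action and
// re-raise so the process still terminates with the original signal.
extern "C" void on_fatal_signal(int sig) {
    const char msg[] = "Stack trace:\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    print_backtrace();
    std::signal(sig, SIG_DFL);
    std::raise(sig);
}

// Hypothetical helper: install the handler for SIGSEGV and SIGABRT.
// Returns false if either registration fails.
bool install_crash_handler() {
    return std::signal(SIGSEGV, on_fatal_signal) != SIG_ERR &&
           std::signal(SIGABRT, on_fatal_signal) != SIG_ERR;
}
```

Calling `install_crash_handler()` early in `main` would be enough to get a raw trace on a crash. Function names only appear when the binary keeps its symbols, which is presumably why the instructions above ask for a Debug build. Note that `backtrace` is not guaranteed to be async-signal-safe, so a handler like this is suited to debugging rather than production use.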

yfarjoun commented 4 months ago

Thanks @Griffan. Trying this out. Will report back.

Griffan commented 4 months ago

Hi @yfarjoun, I have pushed a fix that bypasses samples with uninformative GT or PL fields. Let me know if it finishes the job successfully on your side. Thanks!

yfarjoun commented 3 months ago

It worked (in that it didn't explode...). I am missing many sites because they are multi-allelic. How did you preprocess the 1KG VCFs, as they have lots of multi-allelic sites as well?

hyunminkang commented 3 months ago

I would recommend excluding any multi-allelic sites, as VerifyBamID uses biallelic variants. If the reference data is too big, you may decompose the VCF into biallelic records first and filter variants by missingness.
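
To make "decompose into biallelic records" concrete, here is a minimal sketch of what it does to a single site: one output record per ALT allele, with all other columns copied unchanged. It is an illustration under stated assumptions, not the procedure used to prepare the 1KG reference panels. The function names `split` and `decompose_site` are hypothetical, and the sketch deliberately ignores per-allele INFO values and genotype columns, which a real decomposition has to rewrite. In practice a dedicated tool (for example `bcftools norm -m -any`) is the safer choice.

```cpp
// Sketch only: split the ALT column of one VCF data line into one record
// per allele. Per-allele INFO fields and genotype columns are NOT adjusted.
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a string on a single-character delimiter.
std::vector<std::string> split(const std::string& s, char delim) {
    std::vector<std::string> out;
    std::string item;
    std::istringstream in(s);
    while (std::getline(in, item, delim)) out.push_back(item);
    return out;
}

// Hypothetical helper: return one tab-separated record per ALT allele.
// Lines with fewer than five columns (e.g. "##" header lines) pass through.
std::vector<std::string> decompose_site(const std::string& line) {
    std::vector<std::string> cols = split(line, '\t');
    if (cols.size() < 5) return {line};
    const std::vector<std::string> alts = split(cols[4], ',');  // ALT column
    std::vector<std::string> out;
    for (const std::string& alt : alts) {
        cols[4] = alt;  // keep every other column as-is
        std::string rec = cols[0];
        for (std::size_t i = 1; i < cols.size(); ++i) rec += '\t' + cols[i];
        out.push_back(rec);
    }
    return out;
}
```

For a site with `ALT` equal to `C,G`, `decompose_site` returns two records, one with `ALT=C` and one with `ALT=G`. Already-biallelic sites come back as a single unchanged record.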