fritzsedlazeck / SURVIVOR

Toolset for SV simulation, comparison and filtering

SIGSEGV with manta data #72

Closed lindenb closed 5 years ago

lindenb commented 5 years ago

Hi, FYI I got this segfault with a VCF from MANTA and SURVIVOR 1.0.6:

(gdb) run  merge jeter.list 100 1 1 1 1 1000 out.vcf
Program received signal SIGSEGV, Segmentation fault.
parse_strands_lumpy (buffer=<optimized out>) at ../src/vcfs/Merge_VCF.cpp:287
287     while (buffer[i] != '\t') {
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64 libgcc-4.8.5-16.el7_4.3.x86_64 libstdc++-4.8.5-16.el7_4.3.x86_64 lustre-client-2.7.21.2-3.10.0_693.1.1.el7.x86_64_g069c467.x86_64 shook-client-1.6.0-1.el7.centos.x86_64
(gdb) bt
#0  parse_strands_lumpy (buffer=<optimized out>) at ../src/vcfs/Merge_VCF.cpp:287
#1  parse_vcf (filename="/ccc/scratch/cont007/fg0073/lindenbp/B00I4CA/results/variants/diploidSV.vcf.gz", min_svs=min_svs@entry=1000)
    at ../src/vcfs/Merge_VCF.cpp:628
#2  0x000000000047af06 in combine_calls_svs (files="jeter.list", max_dist=max_dist@entry=100, min_support=min_support@entry=1, type_save=type_save@entry=1, 
    strand_save=strand_save@entry=1, dynamic_size=dynamic_size@entry=1, min_svs=min_svs@entry=1000, output="out.vcf") at ../src/merge_vcf/combine_svs.cpp:600
#3  0x00000000004c714c in official_interface (argc=<optimized out>, argv=0x7fffffff55e8) at ../src/SURVIVOR.cpp:122
#4  0x00000000004037e9 in main (argc=<optimized out>, argv=<optimized out>) at ../src/SURVIVOR.cpp:319
lindenb commented 5 years ago

I checked https://github.com/fritzsedlazeck/SURVIVOR/blob/master/src/vcfs/Merge_VCF.cpp#L287 : the loop never finds a '\t', so it runs past the end of the buffer.

while (buffer[i] != '\t' && buffer[i] !=0) 

fixes the segfault but I don't know how it affects the final output :-)

edit: output is empty (just the vcf header)
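
For illustration, the guard proposed above can be written as a small standalone function. This is only a sketch: `find_next_tab` is a hypothetical helper, not SURVIVOR's actual code, and it assumes `buffer` is a NUL-terminated line with `start` inside it. It stops the scan at the end of the line, which avoids the out-of-bounds read, but it does not make the input meaningful, which is consistent with the empty output noted above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from SURVIVOR): returns the index of the next
// tab at or after `start`, or std::string::npos if the line ends first.
// Assumes `buffer` is NUL-terminated and `start` lies within it.
std::size_t find_next_tab(const char *buffer, std::size_t start) {
    std::size_t i = start;
    // Stop at the terminating NUL as well as at a tab, so a line without
    // tabs cannot push the scan past the end of the buffer.
    while (buffer[i] != '\t' && buffer[i] != '\0') {
        ++i;
    }
    return buffer[i] == '\t' ? i : std::string::npos;
}
```

A caller can treat `std::string::npos` as "malformed line" and skip or report it instead of continuing to parse.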

edit2:

I changed the "Minimum size of SVs to be taken into account" parameter to a lower value, but now the output VCF is full of binary characters and NaN values...

(...)     6z}ƫ�x�1�3�!����o�]�@H�$�'zʟV�u2�F��"�鋲��M���25슽i���e���!���
G6e�FYw��9ǚ�mߒӐ�*���P�>�e�{����"L���>|*}1�UK�
                                                      ����e�^��y@=���k�v���Y���p%,dc_��?v�8�1_<i��q��ԝU.Ԙ��ʨj�d��|���iU�jt*�S�5]as��! ���8M��U3iɳsu��)���g��Ѡ��>�k���[|���\���t7c����PiD4
[ܼ  <DEL>   .   PASS    SUPP=1;SUPP_VEC=00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000
0000000000000;SVLEN=-1093;SVTYPE=DEL;SVMETHOD=SURVIVOR1.0.6;CHR2=U�����)�������$9�:�ב3[6��7����ϓ��4�,AQ2A����P���[vC�<L�����(Fs��y�,O*.��;END=1093;CIPOS=0,0;CIEND=0,0;STRANDS=+
+   GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN
:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--
:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    
./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN    ./.:NaN:0:
(...)
lindenb commented 5 years ago

OK, I THINK I understand: your program cannot read gzipped VCF files. Sadly there is no warning about this; it parses everything silently :-)

fritzsedlazeck commented 5 years ago

Yes, sorry. There is currently no check for gzip files. It's on my to-do list. Sorry for the debugging. Frankly, I don't know of an easy check. I could parse the last characters of the filename... Thanks, Fritz
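
One possible check, sketched here under assumptions rather than taken from SURVIVOR: gzip streams begin with the two magic bytes 0x1f 0x8b (RFC 1952), and bgzip-compressed files share that prefix, so reading the first two bytes of each input is enough to detect them. The function names `looks_gzipped` and `has_gz_suffix` are hypothetical. The second function shows the filename-based idea mentioned above; it is weaker because it misses compressed files that lack a `.gz` extension.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of SURVIVOR): true if the file starts with
// the two-byte gzip magic number 0x1f 0x8b. Returns false if the file
// cannot be opened or is shorter than two bytes.
bool looks_gzipped(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char magic[2] = {0, 0};
    in.read(reinterpret_cast<char *>(magic), 2);
    return in.gcount() == 2 && magic[0] == 0x1f && magic[1] == 0x8b;
}

// Weaker alternative: look only at the last characters of the filename.
bool has_gz_suffix(const std::string &path) {
    const std::string suffix = ".gz";
    return path.size() >= suffix.size() &&
           path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Calling `looks_gzipped` on every path in the input list before parsing would let the tool stop with a clear error message instead of silently reading compressed bytes as VCF text.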