fritzsedlazeck / Sniffles

Structural variation caller using third generation sequencing

SUPP value when SV calling for a population #249

Closed tleonardi closed 3 years ago

tleonardi commented 3 years ago

Hi @fritzsedlazeck, first of all thanks for the great set of tools! I've been following your Wiki page on SV calling for a population and I've run into an issue.

The final merged VCF contains the genotypes and allele frequencies for each sample, as it should. However, the SUPP and SUPP_VEC fields also appear to count 0/0 calls, which, based on the SURVIVOR documentation, they shouldn't.

Here is an example where I would expect SUPP=2 and SUPP_VEC=0110. The record is broken into multiple lines to improve legibility:

1       2994973 90      A[...]A  N       .       PASS    
SUPP=4;SUPP_VEC=1111;SVLEN=-314;SVTYPE=DEL;SVMETHOD=SURVIVOR1.0.7;CHR2=1;END=2995287;CIPOS=0,0;CIEND=0,0;STRANDS=+-     
GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO     
0/0:NA:314:4,1:+-:.:DEL:90:NA:NA:1_2994973-1_2995287        
0/1:NA:314:5,3:+-:.:DEL:90:A[...]A:N:1_2994973-1_2995287 
0/1:NA:314:6,3:+-:.:DEL:90:A[...]A:N:1_2994973-1_2995287 
0/0:NA:314:3,0:+-:.:DEL:90:NA:NA:1_2994973-1_2995287

Did I misunderstand the meaning of SUPP and SUPP_VEC?

Thanks in advance for your help!

fritzsedlazeck commented 3 years ago

Yes, this is sadly a problem. SURVIVOR is designed not only for this but for many other things as well. Some methods report 0/0 for an SV that was found and ./. for one that was not found. After genotyping, no ./. entries should remain, so SURVIVOR thinks the SV has been found in every sample. As a result, SUPP and SUPP_VEC do not reflect the genotypes. However, I don't see a clear solution at the moment, because otherwise people will complain about other methods.

I hope that's bearable. Thanks, Fritz
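For readers who need SUPP and SUPP_VEC to reflect the genotypes, one possible workaround is to recompute both fields from the GT column after merging. The following is only a minimal sketch and is not part of SURVIVOR or Sniffles: the function name `recompute_support` is hypothetical, and it assumes a tab-separated VCF record whose FORMAT column contains `GT`, counting a sample as supporting the SV when at least one allele is neither `0` nor `.`.

```python
import re


def recompute_support(vcf_line: str) -> str:
    """Return a VCF record with SUPP and SUPP_VEC rebuilt from the GT fields.

    Hypothetical helper: header lines and records without sample columns
    are returned unchanged.
    """
    line = vcf_line.rstrip("\n")
    fields = line.split("\t")
    if line.startswith("#") or len(fields) < 10:
        return line

    # Locate GT inside the FORMAT column (assumed to be present).
    gt_index = fields[8].split(":").index("GT")

    bits = []
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        alleles = re.split(r"[/|]", genotype)
        # A sample supports the SV only if it carries a non-reference,
        # non-missing allele; 0/0 and ./. both count as unsupported.
        supported = any(allele not in ("0", ".") for allele in alleles)
        bits.append("1" if supported else "0")
    supp_vec = "".join(bits)

    # Rewrite only the SUPP and SUPP_VEC entries of the INFO column.
    entries = []
    for entry in fields[7].split(";"):
        key = entry.split("=", 1)[0]
        if key == "SUPP":
            entry = f"SUPP={supp_vec.count('1')}"
        elif key == "SUPP_VEC":
            entry = f"SUPP_VEC={supp_vec}"
        entries.append(entry)
    fields[7] = ";".join(entries)
    return "\t".join(fields)


# Abridged version of the record from this issue (fewer INFO/FORMAT fields).
example = "\t".join([
    "1", "2994973", "90", "A", "N", ".", "PASS",
    "SUPP=4;SUPP_VEC=1111;SVLEN=-314;SVTYPE=DEL",
    "GT:LN:DR",
    "0/0:314:4,1", "0/1:314:5,3", "0/1:314:6,3", "0/0:314:3,0",
])
print(recompute_support(example).split("\t")[7])
```

For the abridged record above, the INFO column comes out with SUPP=2 and SUPP_VEC=0110, matching the values expected in the original question. Note that this deliberately changes the meaning of the two fields from "reported by the caller" to "carries the variant allele", which may not suit every method.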

fritzsedlazeck commented 3 years ago

Oh and sorry for the late reply!