songbowang125 / SVision-pro

GNU General Public License v3.0

Missing information in VCF output #17

Open wdyyy opened 5 days ago

wdyyy commented 5 days ago

In germline mode, the sample column contains no DR or DV field, even though both are declared in the VCF header:

chr6    32648276    7184    N   CSV .   PASS    END=32652372;SVLEN=4194;SVTYPE=INS+DEL;SUPPORT=5;VAF=0.45;BKPS=INS_99_chr6_32648276_32648276_32648276,DEL_4095_chr6_32648277_32652372_32648277;RNAMES=m84128_231213_025847_s4/235995539/ccs,m84128_231213_025847_s4/216664643/ccs,m84128_231213_025847_s4/209456533/ccs,m84128_231213_025847_s4/216204680/ccs,m84128_231213_025847_s4/142149066/ccs GT  0/1
chr6    32687806    7196    N   CSV .   PASS    END=32688616;SVLEN=8125;SVTYPE=INS+DEL;SUPPORT=11;VAF=1;BKPS=INS_7316_chr6_32687806_32687806_32687806,DEL_809_chr6_32687807_32688616_32687807;RNAMES=m84128_231213_025847_s4/140509965/ccs,m84128_231213_025847_s4/111806210/ccs,m84128_231213_025847_s4/86180549/ccs,m84128_231213_025847_s4/154274082/ccs,m84128_231213_025847_s4/173081355/ccs,m84128_231213_025847_s4/165086391/ccs,m84128_231213_025847_s4/158597951/ccs,m84128_231213_025847_s4/68486361/ccs,m84128_231213_025847_s4/124518648/ccs,m84128_231213_025847_s4/127406871/ccs,m84128_231213_025847_s4/258215750/ccs    GT  1/1
chr6    60229935    7348    N   CSV .   PASS    END=60241147;SVLEN=16678;SVTYPE=INS+idDUP+tDUP;SUPPORT=8;VAF=0.36;BKPS=INS_4802_chr6_60241147_60241147_60241147,idDUP_664_chr3_90693816_90694479_60241147,tDUP_11212_chr6_60229935_60241146_60241147;RNAMES=m84128_231213_025847_s4/251003266/ccs,m84128_231213_025847_s4/243664916/ccs,m84128_231213_025847_s4/225841107/ccs,m84128_231213_025847_s4/120389904/ccs,m84128_231213_025847_s4/256903923/ccs,m84128_231213_025847_s4/65536953/ccs,m84128_231213_025847_s4/58922015/ccs,m84128_231213_025847_s4/9637495/ccs GT  0/1
chr6    64295336    7376    N   CSV .   POLY    END=64295519;SVLEN=5971;SVTYPE=INS+DEL;SUPPORT=3;VAF=0.18;BKPS=INS_5789_chr6_64295336_64295336_64295336,DEL_182_chr6_64295337_64295519_64295337;RNAMES=m84128_231213_025847_s4/99095796/ccs,m84128_231213_025847_s4/21039109/ccs,m84128_231213_025847_s4/80216577/ccs   GT  0/0
chr6    89213915    7518    N   CSV .   PASS    END=89214233;SVLEN=1244;SVTYPE=INS+DEL;SUPPORT=7;VAF=0.7;BKPS=INS_927_chr6_89213915_89213915_89213915,DEL_317_chr6_89213916_89214233_89213916;RNAMES=m84128_231213_025847_s4/195235404/ccs,m84128_231213_025847_s4/213390092/ccs,m84128_231213_025847_s4/92733594/ccs,m84128_231213_025847_s4/138219295/ccs,m84128_231213_025847_s4/228327662/ccs,m84128_231213_025847_s4/85265081/ccs,m84128_231213_025847_s4/195168766/ccs    GT  0/1
##INFO=<ID=VAF,Number=1,Type=Float,Description="SV allele frequency in this region">
##INFO=<ID=RNAMES,Number=.,Type=String,Description="SV support read names in this region">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DR,Number=1,Type=String,Description="high-quality reference reads">
##FORMAT=<ID=DV,Number=1,Type=String,Description="high-quality variant reads">

Is this a mistake or a feature? DV can be derived from the number of RNAMES, but I have no idea how DR could be derived from the information in the output.

By the way, the arguments I used were --preset hifi --detect_mode germline --min_sv_size 50 --max_sv_size 5000000 --min_mapq 20 --min_supp 3 --max_coverage 500

songbowang125 commented 3 days ago

Thanks for pointing this out. We did indeed forget to output the DV and DR fields in the 'germline' detection mode.

DV can be obtained from the number of RNAMES, or from the 'SUPPORT' information. DR can then be calculated from 'DV' and 'VAF' as 'DV / VAF - DV'.
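
For anyone stuck on an older version, the conversion described above can be sketched in Python. This is only an illustration, not code from SVision-pro: the function names `parse_info` and `estimate_dv_dr` are made up here, and because VAF is printed with limited precision, the recovered DR is an approximation that may differ slightly from the count the caller used internally.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string such as 'A=1;B=2;FLAG' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        # Flag-style entries without '=' are stored as True.
        fields[key] = value if sep else True
    return fields


def estimate_dv_dr(info: str):
    """Estimate (DV, DR) from the SUPPORT (or RNAMES) and VAF entries.

    Hypothetical helper: DV is taken from SUPPORT, falling back to the
    number of read names in RNAMES, and DR is DV / VAF - DV, rounded to
    the nearest integer. DR is None when VAF is not positive.
    """
    fields = parse_info(info)
    if "SUPPORT" in fields:
        dv = int(fields["SUPPORT"])
    else:
        dv = len(fields["RNAMES"].split(","))
    vaf = float(fields["VAF"])
    if vaf <= 0:
        return dv, None  # DR cannot be recovered without a positive VAF
    dr = max(round(dv / vaf - dv), 0)
    return dv, dr


# First record from the report above: SUPPORT=5, VAF=0.45
print(estimate_dv_dr("END=32652372;SVLEN=4194;SVTYPE=INS+DEL;SUPPORT=5;VAF=0.45"))
# -> (5, 6)
```

For the homozygous record with SUPPORT=11 and VAF=1, the same formula gives DR = 0.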

The missing DV and DR fields do not affect the confidence of the called SVs, and we will fix this in the next version.

songbowang125 commented 1 day ago

Hi, we have upgraded SVision-pro to version v2.2, and the DV and DR values are now output in the germline detection mode.