AstraZeneca-NGS / VarDictJava

VarDict Java port
MIT License

AF > 1 with DP < VD for some Long deletions #386

Open alexander-e-f-smith opened 1 year ago

alexander-e-f-smith commented 1 year ago

Hi, I have occasionally observed an indel variant with a variant depth (VD) greater than the total depth (DP), leading to an AF > 1. So far I have caught this for 2 large deletions (one example is shown below). Is this expected behaviour? Is it due to how or where the depth is counted relative to the position of the deletion? A value > 1 may unfortunately crash downstream work (plotting, for example). Best, A

8 91777247 . CTCAGGCTTCATGCTTGGAGATTGAGATTTCACAATCTCTGGTTCACTGCCTAGAAAACCAGGTTCCATATCTTAGGGATACAACCTCTCTAACTGAAATGTCAGGCTCTGTGTCTTGAGTGCTAAGCTTCTCACATAGAGGCAATGTTCCGTGCTTGAGGTTTAGGATTTTACAAATTGGATCAATGCTTGTAAGGCCAGGTTACATTCTTAGAGATTAAGATTCCACCTTTATAGTACCATGTTCCATTCTCTATGTAGGGTCAGGTTCCATGTATGGAGGTAGAGTTTCCTCCCATGAGATGCCAGGTTCTTTGTTTGTAGGGCTAATTTTCATTTTCAAAGATTGAAATTTTACACATAGATATCAGGTCCCATACTTTTAATGTTAGGCTTTATGCATAGTATCATGGGTACAATGCCTAGGAAATCAGTTACCATGACTTGGCCGAAGACTATGATTGAAATGCCAAGTTCTGTGTCTGAAATATAAGGCTCCACACATGGGAGGCCAAGTTCCATATATGTAGGTTGAGGTTCTACCATGGAGTGTCATTTTCCATGGTAGATTTTGGGTACCCTCATAAGGAGTGTCAATTTCATCCTTGTAGACCTAATTCCATGCTTAAAGAATAGGTATTTAAAATGTGGAGTTCCAGGTTACATGCTTTTAGAGCAAGGTTCTATGCACAGAGATTGAAATTCCACAAGTTGATTG C 137 PASS SAMPLE=thisSample;TYPE=Deletion;DP=8;VD=17;AF=2.125;BIAS=2:2;REFBIAS=4:1;VARBIAS=3:13;PMEAN=33.6;PSTD=1;QUAL=33.7;QSTD=1;SBF=0.02511;ODDRATIO=14.3943;MQ=60;SN=34;HIAF=0.7727;ADJAF=0.625;SHIFT3=5;MSI=1;MSILEN=1;NM=1.6;HICNT=17;HICOV=22;LSEQ=CCATGTACCATTCTTGTAGG;RSEQ=TCAGGATTCATGCTTGGAGA;DUPRATE=0;SPLITREAD=2;SPANPAIR=16 GT:DP:VD:AD:AF:RD:ALD 1/1:8:17:5,17:2.125:4,1:3,13
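For downstream scripts, one way to catch such records before plotting is to recompute the ratio from the INFO field. This is a minimal sketch, assuming AF is reported as VD/DP (which matches 17/8 = 2.125 in the record above); the function names are hypothetical and not part of VarDict.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def depth_consistency(info: str) -> dict:
    """Recompute AF as VD/DP and report whether VD exceeds DP."""
    fields = parse_info(info)
    dp = int(fields["DP"])
    vd = int(fields["VD"])
    return {
        "DP": dp,
        "VD": vd,
        # Assumption: AF is meant to be VD/DP; None when DP is zero.
        "recomputed_af": vd / dp if dp > 0 else None,
        "vd_exceeds_dp": vd > dp,
    }


# INFO values copied (abridged) from the record above.
info = "SAMPLE=thisSample;TYPE=Deletion;DP=8;VD=17;AF=2.125;HICNT=17;HICOV=22"
result = depth_consistency(info)
print(result)
```

Records where `vd_exceeds_dp` is true could be set aside for manual review rather than passed to plotting code that expects AF in the range 0 to 1.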

alexander-e-f-smith commented 1 year ago

Hi again, I have now observed VD greater than DP (giving an AF > 1) a number of times. Another example is pasted below. Is this a fixable bug/issue?

7 55258822 . AGGCAGCCAGGGAGGTGGGGAGGGTGGTGTCTTCTAAAAGCATTTTCAGTATCCATGTGGTTTCAGTAATAATAATAATAATAAACCAGTGAAAAGTAAAACAGGACAAAAATCTTCATAGGCAGTGAACCATATCAGAGAGTCCAAGAAAGCACAATGAGAGTGTGGCTTAAAAACCCTGAACGACATTCCTTTGCACCAGCTTGGTGAGGAGGGCATGGTCCCCGCCACCCCCCACCCCCACTTTGCAGATAAACCACATGCAGGAAGGTCAGCCTGGCAAGTCCAGTAAGTTCAAGCCCAGGTCTCAACTGGGCAGCAGAGCTCCTGCTCTTCTTTGTCCTCATATACGAGCACCTCTGGACTTAAAACTTGAGGAACTGGATGGAGAAAAGTTAATGGTCAGCAGCGGGTTACATCTTCTTTCATGCGCCTTTCCATTCTTTGGATCAGTAGTCACTAACGTTCGCCAGCCATAAGTCCTCGACGTGGAGAGGCTCAGAGCCTGGCATGAACATGACCCTGAATTCGGATGCAGAGCTTCTTCCCATGATGATCTGTCCCTCACAGCAGGGTCTTCTCTGTTTCAGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCACCGCGACCT A 228 PASS SAMPLE=B0522-23FFPE-DNA;TYPE=Deletion;DP=2;VD=204;AF=102;BIAS=2:1;REFBIAS=1:1;VARBIAS=204:0;PMEAN=8.4;PSTD=1;QUAL=29.8;QSTD=1;SBF=0.00971;ODDRATIO=0;MQ=60;SN=4.075;HIAF=0.9879;ADJAF=2.5;SHIFT3=10;MSI=1;MSILEN=1;NM=1.6;HICNT=163;HICOV=165;LSEQ=CAATTGCAGCGAGATTGTGG;RSEQ=GGCAGCCAGGAACGTACTGG;DUPRATE=0;SPLITREAD=5;SPANPAIR=205 GT:DP:VD:AD:AF:RD:ALD 1/1:2:204:2,204:102:1,1:204,0

karlestira commented 8 months ago

Same problem when calling long indels:

The SAM file looks like this (4 lines: r1 + r2 + r1 supplementary + r2 supplementary):

sample 163 chr4 55593418 60 85M65S = 55593635 298 GCTGATTGGTTTCGTAATCGTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGGTAACCATTTATGTTTACATAGACCCAACACAACTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAG GGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGFGGGFGGGGGGGGGGGGGFGGGEG NM:i:0 MD:Z:85 MC:Z:69S81M AS:i:85 XS:i:0 SA:Z:chr4,55593635,+,81S69M,60,0; RG:Z:sample
sample 2131 chr4 55593430 60 73M77S = 55593418 -85 CGTAATCGTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGGTAACCATTTATGTTTACATAGACCCAACACAACTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTGGTCAGTA FGGGGEFGGFFGGGGGGGGGGGGGGFFGGGGGGFFGGFFFCGFFGFCFGGGGGGGGGFFGGFGGGEGGGGGFGGGGGGGGGFGFFFGFFFEGGEGGGGFGGGGGGFGBGGGGGGGGGGFFFGFGGGGGFGGGFFFGGGGGGGGGFGGGGG NM:i:0 MD:Z:73 MC:Z:85M65S AS:i:73 XS:i:0 SA:Z:chr4,55593635,-,69S81M,60,0;
sample 2211 chr4 55593635 60 81S69M = 55593635 81 GCTGATTGGTTTCGTAATCGTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGGTAACCATTTATGTTTACATAGACCCAACACAACTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAG GGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGFGGGFGGGGGGGGGGGGGFGGGEG NM:i:0 MD:Z:69 MC:Z:69S81M AS:i:69 XS:i:0 SA:Z:chr4,55593418,+,85M65S,60,0;
sample 83 chr4 55593635 60 69S81M = 55593418 -298 CGTAATCGTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGGTAACCATTTATGTTTACATAGACCCAACACAACTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTGGTCAGTA FGGGGEFGGFFGGGGGGGGGGGGGGFFGGGGGGFFGGFFFCGFFGFCFGGGGGGGGGFFGGFGGGEGGGGGFGGGGGGGGGFGFFFGFFFEGGEGGGGFGGGGGGFGBGGGGGGGGGGFFFGFGGGGGFGGGFFFGGGGGGGGGFGGGGG NM:i:0 MD:Z:81 MC:Z:85M65S AS:i:81 XS:i:0 SA:Z:chr4,55593430,-,73M77S,60,0;
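To check which of the four records are primary and which are supplementary, the FLAG column (163, 2131, 2211, 83) can be decoded bit by bit. The sketch below uses the bit meanings from the SAM specification; the helper name `decode_flag` is only illustrative. It shows how the records relate to one read pair, and does not establish how VarDict counts them toward DP or VD.

```python
# Bit meanings as defined for the FLAG field in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list:
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# FLAG values taken from the four SAM records in this comment.
for flag in (163, 2131, 2211, 83):
    print(flag, decode_flag(flag))
```

Decoded this way, 163 and 83 are the primary alignments of read 2 and read 1, while 2131 and 2211 carry the supplementary bit (0x800), so all four records come from a single read pair.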

VarDict gives a VCF record like this:

chr4 55593498 . TTTATTTGTTCTCTCTCCAGAGTGCTCTAATGACTGAGACAATAATTATTAAAAGGTGATCTATTTTTCCCTTTCTCCCCACAGAAACCCATGTATGAAGTACAGTGGAAGGTTGTTGAGGAGATAAATGGAAACAA T 75 PASS STATUS=StrongSomatic;SAMPLE=PTIS66121877F1D1L1;TYPE=Deletion;DP=2;VD=4;AF=1;SHIFT3=4;MSI=3.000;MSILEN=1;SSF=0;SOR=0;LSEQ=CAAATATTTACAGGTAACCA;RSEQ=TTATGTTTACATAGACCCAA GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 1/1:2:4:2,2:0,0:0,4:1:0,2:73:1:37.8:1:1:0:60:8:1:1:0 0/0:798:0:0,0:412,386:798,0:0:2,0:39:1:36.9:1:1:0:60:113:1:0:0.1

Notice that it reports DP=2 and VD=4.

The DP calculation may be wrong.
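In paired output like the record above, DP and VD also appear per sample in the FORMAT columns, so the same comparison can be made per sample. This is a rough sketch with hypothetical function names; the FORMAT string and sample columns are abridged to the first seven fields of that record.

```python
def sample_fields(format_col: str, sample_col: str) -> dict:
    """Pair FORMAT keys with one sample column's colon-separated values."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def flag_inconsistent(format_col: str, sample_cols: list) -> list:
    """Return (sample index, DP, VD, reported AF) where VD exceeds DP."""
    flagged = []
    for index, col in enumerate(sample_cols):
        fields = sample_fields(format_col, col)
        dp, vd = int(fields["DP"]), int(fields["VD"])
        if vd > dp:
            flagged.append((index, dp, vd, float(fields["AF"])))
    return flagged


# Abridged from the record above: first seven FORMAT fields only.
format_col = "GT:DP:VD:ALD:RD:AD:AF"
samples = ["1/1:2:4:2,2:0,0:0,4:1", "0/0:798:0:0,0:412,386:798,0:0"]
flagged = flag_inconsistent(format_col, samples)
print(flagged)
```

Only the first sample is flagged: it reports DP=2 and VD=4, with AF given as 1 even though VD/DP would be 2.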