AstraZeneca-NGS / VarDict


failed in discover one single mutation #180

Closed lmanchon closed 1 year ago

lmanchon commented 1 year ago

--Hi,

I don't know why this record is discarded:

chr21 43104346 . G T 273 Q10 SAMPLE=SAD190004_S5;TYPE=SNV;DP=636;END=43104346;VD=289;AF=0.4544;BIAS=2:2;REFBIAS=170:177;VARBIAS=143:146;PMEAN=39.1;PSTD=1;QUAL=33.5;QSTD=1;SBF=0.93657;ODDRATIO=1.01975260801729;MQ=0;SN=35.125;HIAF=0.4554;ADJAF=0.0079;SHIFT3=0;MSI=2;MSILEN=2;NM=1.1;HICNT=281;HICOV=617;LSEQ=TCGGTTTATTGTGCAACCGA;RSEQ=AGCACCTGTCTCCATGACGA;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/1:636:289:347,289:0.4544:170,177:143,146

Which parameter do I need to adjust?

thank you --

PolinaBevad commented 1 year ago

Hi @lmanchon, there is MQ=0 in the record, which means that the mapping quality of this variant is zero, so the reads supporting the variant should have MAPQ=0 as well and do not seem to be mapped correctly to the reference. It is better to check with a genome browser, but I believe it was filtered out correctly (the FILTER field is Q10 in this case).
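To see the two fields this reply points at without scanning the whole record by eye, a small parsing sketch may help. It is a minimal illustration, not part of VarDict: `parse_vcf_line` is a hypothetical helper, the record is a shortened copy of the one posted above, and splitting on whitespace assumes the INFO column contains no spaces.

```python
def parse_vcf_line(line):
    """Split one VCF data line into its fixed columns plus an INFO dict.

    Hypothetical helper for illustration; splits on any whitespace,
    which assumes the INFO column itself contains no spaces.
    """
    fields = line.split()
    chrom, pos, _id, ref, alt, _qual, flt, info = fields[:8]
    info_dict = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_dict[key] = value
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "filter": flt,
        "info": info_dict,
    }


# Shortened copy of the record from the question (most INFO keys omitted).
record = (
    "chr21 43104346 . G T 273 Q10 "
    "SAMPLE=SAD190004_S5;TYPE=SNV;DP=636;VD=289;AF=0.4544;MQ=0"
)

parsed = parse_vcf_line(record)
print("FILTER:", parsed["filter"])
print("MQ:", parsed["info"]["MQ"])
```

For this record the sketch reports a FILTER of Q10 and an MQ of 0, the combination the reply attributes to reads that are not placed confidently.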

lmanchon commented 1 year ago

seems to be well mapped with BWA-MEM, see IGV panel here: chr21_43104346

PolinaBevad commented 1 year ago

Maybe those are secondary alignments? Do you have XA tags in these reads? Please show any read supporting the variant. We filter secondary alignments by flag (-F 0x504), but here they were probably recorded in tags instead of flags.
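As a rough sketch of what a flag mask such as 0x504 excludes: under the SAM specification it combines the bits 0x4 (unmapped), 0x100 (secondary alignment) and 0x400 (duplicate). The function below is a hypothetical illustration of that bit test, not VarDict's own code, and the flag values passed to it are made-up examples.

```python
# SAM flag bits covered by the 0x504 mask (per the SAM specification).
FLAG_NAMES = {
    0x4: "unmapped",
    0x100: "secondary",
    0x400: "duplicate",
}


def filtered_by_mask(flag, mask=0x504):
    """Return the names of the masked bits that are set in `flag`.

    An empty list means a filter using this mask would keep the read.
    Hypothetical helper for illustration only.
    """
    return [name for bit, name in FLAG_NAMES.items() if flag & mask & bit]


# Illustrative flag values (assumptions, not taken from a real BAM):
# 163 = paired, proper pair, mate reverse, second in pair
# 419 = 163 + 0x100 (same read marked as a secondary alignment)
# 1187 = 163 + 0x400 (same read marked as a duplicate)
for flag in (163, 419, 1187):
    print(flag, filtered_by_mask(flag))
```

A read whose alternative location is stored only in an XA tag has none of these bits set, so a flag-based filter alone would not remove it.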

lmanchon commented 1 year ago

These are the corresponding SAM lines:

samtools view mybam.bam chr21:43104255-43104428 | grep "43104346"

M02792:126:000000000-K9RHL:1:1114:12346:24983 163 chr21 43104253 0 147M = 43104346 240 TTAACTGTCTTTGAAAAGAACATGAAGTTTTTATAATTTACATGAAAAAAAGGCAAACAAACCTGGCTAAACGTCGGTTTATTGTGCAACCGAGAGCACCTGTCTCCATGACGACATGCTCCAATTTTGAAATAAAATGAACAGTTG BEDHBFFDGFEEDEGGHHFIDECKFJGEFGFEBEBJEFEBDEEKGJJJJHGHDIEJJEEJJDEIKIFGBJJDBEHBHDFFBEEKFGFFJDEBFHFHGEFHHKEHHHHEELFDBFFEDLFHIHEJEFFFKFJKFBJJJDKGKFEGEFD XA:Z:chr21,+6495931,147M,0; ZA:Z:CGCT ZB:Z:CCGT BC:Z:5 MD:Z:147 RG:Z:SAD190004_S5 NM:i:0 AS:i:147 XS:i:147 QX:Z:CCB ABC RX:Z:CGC-CCG xc:i:1 zd:Z: xm:i:1

M02792:126:000000000-K9RHL:1:1114:12346:24983 83 chr21 43104346 0 147M = 43104253 -240 GAGCACCTGTCTCCATGACGACATGCTCCAATTTTGAAATAAAATGAACAGTTGACTCTGTAAGGGAAAATGAGAGCTGATTATTTTGCTGGGAAGATATCAAACACATGGAATATGTCAGCAGCATGACATACACTATCAAATTAC HHFLEHGEFFFGIKFEGDEHEKEEFGGIKGEIJJEHEFEBFFGEEHFFKHEJEHEGGGEDBGHHHHFGGEFHHGIFHFHEJCEJJJEFHFIHHFIHEBEHKGFFLEKEEHHGEBEEEHLHGLHFLEEHEDDCDCDHCDEHDECHCCD XA:Z:chr21,-6496024,147M,0; ZA:Z:CGCT ZB:Z:CCGT BC:Z:5 MD:Z:147 RG:Z:SAD190004_S5 NM:i:0 AS:i:147 XS:i:147 QX:Z:CCB ABC RX:Z:CGC-CCG xc:i:1 zd:Z: xm:i:1

PolinaBevad commented 1 year ago

Those are secondary alignments: the XA:Z tag shows another location in the genome where the read can be mapped, and such a read will have MAPQ 0. Usually this should also be reflected in the SAM flags (163/83 for these reads), but for some reason it wasn't, probably because of the BWA setup. Either way, this variant is likely to belong to another position in the genome, so it was filtered out here.
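The tag and flag values from the two reads posted above can be checked with a short sketch. It assumes the BWA layout for XA entries (chromosome, signed position, CIGAR, edit distance, each entry ending in a semicolon); `parse_xa` and `is_secondary` are hypothetical helper names, not functions from BWA, samtools or VarDict.

```python
def parse_xa(tag):
    """Parse a BWA-style XA:Z tag into a list of alternative hits.

    Assumes entries of the form chr,(+|-)pos,CIGAR,NM separated by ';'.
    """
    value = tag.split(":", 2)[2]
    hits = []
    for entry in value.split(";"):
        if not entry:
            continue  # skip the empty piece after the trailing ';'
        chrom, signed_pos, cigar, nm = entry.split(",")
        hits.append({
            "chrom": chrom,
            "strand": signed_pos[0],
            "pos": int(signed_pos[1:]),
            "cigar": cigar,
            "nm": int(nm),
        })
    return hits


def is_secondary(flag):
    """True if the SAM 'secondary alignment' bit (0x100) is set."""
    return bool(flag & 0x100)


# Values copied from the two SAM lines in the thread.
reads = [
    {"flag": 163, "xa": "XA:Z:chr21,+6495931,147M,0;", "AS": 147, "XS": 147},
    {"flag": 83, "xa": "XA:Z:chr21,-6496024,147M,0;", "AS": 147, "XS": 147},
]

for read in reads:
    alt = parse_xa(read["xa"])[0]
    print(
        "flag", read["flag"],
        "| secondary bit set:", is_secondary(read["flag"]),
        "| alternative hit:", alt["chrom"], alt["pos"], alt["cigar"],
        "| equally good score:", read["AS"] == read["XS"],
    )
```

For both reads the secondary bit is not set, while the XA entry describes a full-length 147M hit with zero mismatches elsewhere on chr21 and the best and second-best scores are equal (AS = XS = 147). That matches the explanation in this reply: the ambiguity is visible only in the tags and in MAPQ 0, not in the flags.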

lmanchon commented 1 year ago

okay thank you for your help.

PolinaBevad commented 1 year ago

Glad to help! :)