AstraZeneca-NGS / VarDictJava

VarDict Java port
MIT License

Missing target positions when running with pileup (-p) and no local realignment options #330

Open leonghs opened 3 years ago

leonghs commented 3 years ago

I am using VarDict (v1.8.2) to call variants from a targeted-sequencing dataset in single-sample mode. I am running VarDict with the pileup (-p) and no local realignment (-k 0) options in order to generate a VCF output that contains all the target positions in my BED file. The command that I used is as follows:

VarDict -G $referenceGenome -f 0.01 -N $sampleName -b $inBam -c 1 -S 2 -E 3 -g 4 --nosv --deldupvar -z 1 $targetBed -th 8 -p -k 0 -X 0 | teststrandbias.R | var2vcf_valid.pl -N $sampleName -E -f 0.01 -P 0 > $sampleName.vcf

However, I found that some positions specified in the BED file are missing from the resulting VCF. Below are examples from one of the samples that I have analysed where three target positions are missing: 7572991, 7578712 and 7579604 (all on chromosome 17; reference genome: human_g1k_v37_decoy.fa)

Example 1: 7572991 missing

example1

Example 2: 7578712 missing

example2

Example 3: 7579604 missing

example3

All missing positions are in regions with good read coverage. It would be useful to know if this is an expected behaviour of the variant caller and if not, how I can fix this. Thank you.
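For anyone who wants to reproduce the check, here is a minimal sketch of how target positions absent from the VCF could be listed. It assumes the BED file uses 0-based, end-exclusive coordinates (which is what `-z 1` in the command above tells VarDict to expect) and that a position counts as reported only when a record has that exact POS. The function names and the toy BED interval are hypothetical and are not part of VarDict.

```python
def bed_positions(bed_lines):
    """Expand BED intervals (assumed 0-based start, end-exclusive) into 1-based (chrom, pos) pairs."""
    positions = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        for pos in range(int(start) + 1, int(end) + 1):
            positions.add((chrom, pos))
    return positions


def vcf_positions(vcf_lines):
    """Collect (chrom, POS) for every data record, ignoring '#' header lines."""
    positions = set()
    for line in vcf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos = line.split()[:2]
        positions.add((chrom, int(pos)))
    return positions


def missing_positions(bed_lines, vcf_lines):
    """Return target positions that have no VCF record with the same POS, sorted."""
    return sorted(bed_positions(bed_lines) - vcf_positions(vcf_lines))


# Toy input: a hypothetical BED interval covering 17:7572989-7572993,
# and the first five VCF columns of the records around Example 1.
bed = ["17\t7572988\t7572993"]
vcf = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "17\t7572989\t.\tC\tT",
    "17\t7572990\t.\tC\tT",
    "17\t7572992\t.\tT\t.",
    "17\t7572993\t.\tT\t.",
]
print(missing_positions(bed, vcf))  # [('17', 7572991)]
```

With real files, the two functions can be fed open file handles instead of lists, since they only iterate over lines.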

Montana commented 3 years ago

Instead of screenshots, can you paste the output as text? Thanks.

leonghs commented 3 years ago

Hi,

Please find attached the copy-paste version of the examples shown in the thread. Please let me know if you need further information.

Thank you.

Hui Sun

17 7572989 . C T 180 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=SNV;DP=9436;VD=16;AF=0.0017;BIAS=2:2;REFBIAS=4487:4933;VARBIAS=8:8;PMEAN=28.8;PSTD=1;QUAL=45;QSTD=0;SBF=1;ODDRATIO=1.09938434476693;MQ=60;SN=32;HIAF=0.0017;ADJAF=0;SHIFT3=1;MSI=3;MSILEN=1;NM=2.4;HICNT=16;HICOV=9436;LSEQ=ATGGCGGGAGGTAGACTGAC;RSEQ=CTTTTTGGACTTCAGGTGGC;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9436:16:9420,16:0.0017:4487,4933:8,8
17 7572990 . C T 142 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=SNV;DP=9358;VD=9;AF=0.001;BIAS=2:2;REFBIAS=4469:4880;VARBIAS=5:4;PMEAN=40.9;PSTD=1;QUAL=45;QSTD=0;SBF=0.74497;ODDRATIO=1.36490820992288;MQ=60;SN=18;HIAF=0.0010;ADJAF=0;SHIFT3=0;MSI=5;MSILEN=1;NM=3.1;HICNT=9;HICOV=9358;LSEQ=TGGCGGGAGGTAGACTGACC;RSEQ=TTTTTGGACTTCAGGTGGCT;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9358:9:9349,9:0.001:4469,4880:5,4
17 7572992 . T . 0 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=REF;DP=9423;VD=0;AF=0;BIAS=2:0;REFBIAS=4507:4914;VARBIAS=0:0;PMEAN=33.4;PSTD=1;QUAL=45;QSTD=1;SBF=1;ODDRATIO=0;MQ=60;SN=18842;HIAF=1.0000;ADJAF=0;SHIFT3=0;MSI=0;MSILEN=0;NM=1.5;HICNT=9421;HICOV=9421;LSEQ=0;RSEQ=0;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9423:0:9421:0:4507,4914:0,0
17 7572993 . T . 0 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=REF;DP=9431;VD=0;AF=0;BIAS=2:0;REFBIAS=4500:4931;VARBIAS=0:0;PMEAN=33.1;PSTD=1;QUAL=45;QSTD=1;SBF=1;ODDRATIO=0;MQ=60;SN=18862;HIAF=1.0000;ADJAF=0;SHIFT3=0;MSI=0;MSILEN=0;NM=1.5;HICNT=9431;HICOV=9431;LSEQ=0;RSEQ=0;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9431:0:9431:0:4500,4931:0,0
17 7578710 . T . 0 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=REF;DP=1846;VD=0;AF=0;BIAS=2:0;REFBIAS=1164:679;VARBIAS=0:0;PMEAN=18.9;PSTD=1;QUAL=45;QSTD=1;SBF=1;ODDRATIO=0;MQ=59.6;SN=3686;HIAF=0.9989;ADJAF=0;SHIFT3=0;MSI=0;MSILEN=0;NM=2.6;HICNT=1843;HICOV=1845;LSEQ=0;RSEQ=0;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:1846:0:1843:0:1164,679:0,0
17 7578711 . CTTT C 391 PASS SAMPLE=S1591-00045-D-P1;TYPE=Deletion;DP=1767;VD=415;AF=0.2349;BIAS=2:2;REFBIAS=415:422;VARBIAS=309:106;PMEAN=29.4;PSTD=1;QUAL=45;QSTD=1;SBF=0;ODDRATIO=2.96173439165976;MQ=59.5;SN=830;HIAF=0.2349;ADJAF=0;SHIFT3=15;MSI=18;MSILEN=1;NM=3.5;HICNT=415;HICOV=1767;LSEQ=TCTACACCTCAGGAGCTTTT;RSEQ=TTTTTTTTTTTTTTTGAGAT;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/1:1767:415:837,415:0.2349:415,422:309,106
17 7578713 . T . 0 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=REF;DP=1699;VD=0;AF=0;BIAS=2:0;REFBIAS=448:404;VARBIAS=0:0;PMEAN=10.9;PSTD=1;QUAL=45;QSTD=1;SBF=1;ODDRATIO=0;MQ=59.6;SN=1704;HIAF=1.0000;ADJAF=0;SHIFT3=0;MSI=0;MSILEN=0;NM=2.2;HICNT=852;HICOV=852;LSEQ=0;RSEQ=0;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:1699:0:852:0:448,404:0,0
17 7578714 . T TC 0 v2;f0.01;p8;MSI12;LongMSI SAMPLE=S1591-00045-D-P1;TYPE=Insertion;DP=1622;VD=1;AF=6e-04;BIAS=2:0;REFBIAS=492:381;VARBIAS=0:1;PMEAN=4;PSTD=0;QUAL=44;QSTD=0;SBF=0.43707;ODDRATIO=0;MQ=55;SN=2;HIAF=0.0011;ADJAF=0;SHIFT3=0;MSI=18;MSILEN=1;NM=3.0;HICNT=1;HICOV=873;LSEQ=ACACCTCAGGAGCTTTTCTT;RSEQ=TTTTTTTTTTTTTTTGAGAT;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:1622:1:873,1:6e-04:492,381:0,1
17 7579602 . T A 45 f0.01;p8 SAMPLE=S1591-00045-D-P1;TYPE=SNV;DP=9246;VD=2;AF=2e-04;BIAS=2:2;REFBIAS=4307:4936;VARBIAS=1:1;PMEAN=2;PSTD=0;QUAL=45;QSTD=0;SBF=1;ODDRATIO=1.14602672534324;MQ=60;SN=4;HIAF=0.0002;ADJAF=0;SHIFT3=0;MSI=1;MSILEN=1;NM=2.0;HICNT=2;HICOV=9246;LSEQ=CAAGGGGGACTGTAGATGGG;RSEQ=GAAAAGAGCAGTCAGAGGAC;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9246:2:9243,2:2e-04:4307,4936:1,1
17 7579603 . G A 90 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=SNV;DP=9351;VD=4;AF=4e-04;BIAS=2:2;REFBIAS=4397:4950;VARBIAS=1:3;PMEAN=17.5;PSTD=1;QUAL=45;QSTD=0;SBF=0.62756;ODDRATIO=2.66447;MQ=60;SN=8;HIAF=0.0004;ADJAF=0;SHIFT3=0;MSI=4;MSILEN=1;NM=2.8;HICNT=4;HICOV=9351;LSEQ=AAGGGGGACTGTAGATGGGT;RSEQ=AAAAGAGCAGTCAGAGGACC;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9351:4:9347,4:4e-04:4397,4950:1,3
17 7579605 . A . 0 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=REF;DP=9595;VD=0;AF=0;BIAS=2:0;REFBIAS=4293:5301;VARBIAS=0:0;PMEAN=26;PSTD=1;QUAL=45;QSTD=1;SBF=1;ODDRATIO=0;MQ=60;SN=19188;HIAF=1.0000;ADJAF=0;SHIFT3=0;MSI=0;MSILEN=0;NM=1.5;HICNT=9594;HICOV=9594;LSEQ=0;RSEQ=0;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9595:0:9594:0:4293,5301:0,0
17 7579606 . A . 0 f0.01 SAMPLE=S1591-00045-D-P1;TYPE=REF;DP=9590;VD=0;AF=0;BIAS=2:0;REFBIAS=4289:5301;VARBIAS=0:0;PMEAN=25.5;PSTD=1;QUAL=45;QSTD=1;SBF=1;ODDRATIO=0;MQ=60;SN=19180;HIAF=1.0000;ADJAF=0;SHIFT3=0;MSI=0;MSILEN=0;NM=1.5;HICNT=9590;HICOV=9590;LSEQ=0;RSEQ=0;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:9590:0:9590:0:4289,5301:0,0

Montana commented 3 years ago

Going to add code brackets around this, and I'll get back to you.

leonghs commented 3 years ago

Great, many thanks for looking into this.

Montana commented 3 years ago

@leonghs,

Of course, expect something tomorrow/Saturday.

Thank you, Montana Mendy

leonghs commented 3 years ago

Hi, sorry to pester but may I know if there is any update on this issue please?

Thank you.