AstraZeneca-NGS / VarDict

Structural Variants #157

Open GorgonVZ opened 3 years ago

GorgonVZ commented 3 years ago

Dear Polina,

I am using VarDict 1.8.0 with the following command:

_perl vardict.pl -L 1000 -w 400 -W 100 -O 30 --adaptor AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA -G hg38.fa -b sorted.bam Test38.bed | teststrandbias.R | var2vcf_valid.pl_

and I get unexpected results from the structural variant calling. The data come from a custom bait-enrichment panel from Twist Bioscience, were adapter-clipped with Trimmomatic, and were sequenced on a MiSeq in 2x200 PE mode with a library size of about 400 bp.

The strange thing is that all the SVs I find have exactly the same length of 109 bp and are not visible in the mapping (visual inspection in IGV). For some individuals I also have datasets with a different library size and read length (e.g. 2x75 PE), and surprisingly these SVs are missing from the short-read datasets.

My guess is that the issue has something to do with insert read-through and remaining adapter sequences, or with an incompatibility with Trimmomatic's HEADCROP mode (trimming the 5' ends of reads, leading to an uncommon read orientation):

5'--------->3'
3'<-----------5'

Attached are a resulting VCF file showing the SV-DUP I am talking about and a corresponding screenshot from IGV.

By the way, I also tried adjusting the insert-size parameters (-W/-w) and the minimum SV size (-L) over a broad range and could not see any difference in the results. I would expect that even with the default setting of -L 500 I should not get these strange SVs with a much smaller size of 109 bp.

Thanks in advance for any advice!

Best regards,
Gorgon

[IGV screenshot]
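The read-through hypothesis in the post can be checked with simple arithmetic. The sketch below is only an illustration: `pair_geometry` is a hypothetical helper, and it assumes `insert_len` is the sequenced fragment without adapters. The post does not say whether the quoted library size of about 400 bp includes adapters, so the 280 bp case is a made-up value for comparison.

```python
def pair_geometry(read_len, insert_len):
    """Return (overlap, read_through) in bp for one read pair.

    overlap      : bases of the insert covered by both mates
    read_through : bases each mate extends past the insert into adapter
    Assumes both mates have the same length and insert_len excludes adapters.
    """
    overlap = max(0, min(insert_len, 2 * read_len - insert_len))
    read_through = max(0, read_len - insert_len)
    return overlap, read_through


# 2x200 PE with a 400 bp insert: mates just meet, no overlap, no read-through
print(pair_geometry(200, 400))
# 2x200 PE with a hypothetical 280 bp insert: mates overlap, no read-through
print(pair_geometry(200, 280))
# 2x200 PE with a hypothetical 150 bp insert: both mates run into adapter
print(pair_geometry(200, 150))
# 2x75 PE with a 400 bp insert: neither overlap nor read-through
print(pair_geometry(75, 400))
```

Under these assumptions, adapter read-through only occurs when the insert is shorter than the read length, which is far more likely with 200 bp reads than with 75 bp reads.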

##fileformat=VCFv4.3
##source=VarDict_v1.8.0
[##INFO, ##FILTER, and ##FORMAT header definitions omitted]

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 001
chr17 61780163 . T <DUP> 264 PASS SAMPLE=001;TYPE=DUP;DP=130;END=61780529;VD=130;AF=1;BIAS=0:2;REFBIAS=0:0;VARBIAS=15:115;PMEAN=88.6;PSTD=1;QUAL=37.7;QSTD=1;SBF=1;ODDRATIO=0;MQ=41.9;SN=260;HIAF=1.0000;ADJAF=1;SHIFT3=0;MSI=0;MSILEN=0;NM=1.1;HICNT=130;HICOV=130;LSEQ=TGTTGAATTTCCTACCAAGA;RSEQ=CCAGCCTGGGCAATATGGTG;DUPRATE=0;SVTYPE=DUP;SVLEN=366;SPLITREAD=15;SPANPAIR=115 GT:DP:VD:AD:AF:RD:ALD 1/1:130:130:0,130:1:0,0:15,115
chr17 61780238 . C CA 206 PASS SAMPLE=001;TYPE=Insertion;DP=268;END=61780238;VD=48;AF=0.1791;BIAS=2:2;REFBIAS=107:90;VARBIAS=24:24;PMEAN=56.1;PSTD=1;QUAL=37;QSTD=1;SBF=0.63041;ODDRATIO=1.18804;MQ=42;SN=47;HIAF=0.2527;ADJAF=0;SHIFT3=11;MSI=12;MSILEN=1;NM=2.2;HICNT=47;HICOV=186;LSEQ=TAGAAACACTGAAGGCCTTC;RSEQ=AAAAAAAAAAACAACAACTA;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/1:268:48:197,48:0.1791:107,90:24,24
chr17 61780402 . C T 110 PASS SAMPLE=001;TYPE=SNV;DP=395;END=61780402;VD=9;AF=0.0228;BIAS=2:2;REFBIAS=157:228;VARBIAS=5:4;PMEAN=51;PSTD=1;QUAL=35;QSTD=1;SBF=0.49678;ODDRATIO=1.81241504304486;MQ=42;SN=8;HIAF=0.0206;ADJAF=0;SHIFT3=0;MSI=1;MSILEN=1;NM=3.2;HICNT=8;HICOV=388;LSEQ=CCATTAATATCTGAAAAGGC;RSEQ=TAAAAGAAAACAACATTAGA;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/1:395:9:385,9:0.0228:157,228:5,4
chr17 61780448 . C G 293 PASS SAMPLE=001;TYPE=SNV;DP=234;END=61780448;VD=233;AF=0.9957;BIAS=0:2;REFBIAS=0:0;VARBIAS=87:146;PMEAN=40.9;PSTD=1;QUAL=37.3;QSTD=1;SBF=1;ODDRATIO=0;MQ=41.5;SN=115.5;HIAF=1.0000;ADJAF=0.0726;SHIFT3=0;MSI=1;MSILEN=1;NM=1.6;HICNT=231;HICOV=231;LSEQ=AAAATTATCTTTAGAAGAGG;RSEQ=TGGGCAAAGTGGCTCACACC;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 1/1:234:233:0,233:0.9957:0,0:87,146
chr17 61793558 . T C 76 f0.02;pSTD SAMPLE=001;TYPE=SNV;DP=322;END=61793558;VD=4;AF=0.0124;BIAS=2:2;REFBIAS=194:124;VARBIAS=2:2;PMEAN=58;PSTD=0;QUAL=38;QSTD=0;SBF=0.64582;ODDRATIO=1.56222;MQ=42;SN=8;HIAF=0.0125;ADJAF=0;SHIFT3=0;MSI=2;MSILEN=1;NM=1.0;HICNT=4;HICOV=320;LSEQ=CACGACTAAATCACTTCTAA;RSEQ=TCACTAAATACGTTTCACAG;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:322:4:318,4:0.0124:194,124:2,2
chr17 61799335 . G A 58 f0.02 SAMPLE=001;TYPE=SNV;DP=290;END=61799335;VD=3;AF=0.0103;BIAS=2:2;REFBIAS=110:177;VARBIAS=1:2;PMEAN=33.3;PSTD=1;QUAL=36.7;QSTD=1;SBF=1;ODDRATIO=1.242;MQ=42;SN=6;HIAF=0.0105;ADJAF=0.0034;SHIFT3=1;MSI=1;MSILEN=1;NM=1.3;HICNT=3;HICOV=287;LSEQ=ATTTTCTTGTAAAACATTTG;RSEQ=CAAAATAGATTTAACAACAG;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:290:3:287,3:0.0103:110,177:1,2
chr17 61808418 . A T 38 f0.02 SAMPLE=001;TYPE=SNV;DP=169;END=61808418;VD=2;AF=0.0118;BIAS=2:2;REFBIAS=108:59;VARBIAS=1:1;PMEAN=57.5;PSTD=1;QUAL=38;QSTD=0;SBF=1;ODDRATIO=1.82347;MQ=42;SN=4;HIAF=0.0118;ADJAF=0;SHIFT3=0;MSI=3;MSILEN=1;NM=1.0;HICNT=2;HICOV=169;LSEQ=AAACACATACTGAGTAATTT;RSEQ=AATATTTTCAGCCTTATTTT;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:169:2:167,2:0.0118:108,59:1,1
chr17 61808425 . T A 61 PASS SAMPLE=001;TYPE=SNV;DP=178;END=61808425;VD=4;AF=0.0225;BIAS=2:2;REFBIAS=107:65;VARBIAS=3:1;PMEAN=48.2;PSTD=1;QUAL=30.8;QSTD=1;SBF=1;ODDRATIO=1.81679444787617;MQ=42;SN=3;HIAF=0.0171;ADJAF=0;SHIFT3=0;MSI=4;MSILEN=1;NM=2.0;HICNT=3;HICOV=175;LSEQ=TACTGAGTAATTTAAATATT;RSEQ=TCAGCCTTATTTTTTCTCTA;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/1:178:4:172,4:0.0225:107,65:3,1
chr17 61808452 . A G 40 f0.02 SAMPLE=001;TYPE=SNV;DP=271;END=61808452;VD=3;AF=0.0111;BIAS=2:2;REFBIAS=162:106;VARBIAS=2:1;PMEAN=34;PSTD=1;QUAL=25.7;QSTD=1;SBF=1;ODDRATIO=1.30737753140975;MQ=42;SN=2;HIAF=0.0075;ADJAF=0;SHIFT3=0;MSI=4;MSILEN=1;NM=3.7;HICNT=2;HICOV=268;LSEQ=TTATTTTTTCTCTAACACAA;RSEQ=ATAACTTTACTCACGTTTTT;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:271:3:268,3:0.0111:162,106:2,1
chr17 61849272 . T TA 76 f0.02 SAMPLE=001;TYPE=Insertion;DP=340;END=61849272;VD=4;AF=0.0118;BIAS=2:2;REFBIAS=133:192;VARBIAS=2:2;PMEAN=30.5;PSTD=1;QUAL=38;QSTD=0;SBF=1;ODDRATIO=1.44194027483382;MQ=42;SN=8;HIAF=0.0124;ADJAF=0;SHIFT3=8;MSI=9;MSILEN=1;NM=0;HICNT=4;HICOV=323;LSEQ=GGAGTCTTATATAAGTAATT;RSEQ=AAAAAAAACAGCATAAATAA;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:340:4:325,4:0.0118:133,192:2,2
chr17 61857242 . A G 75 f0.02 SAMPLE=001;TYPE=SNV;DP=417;END=61857242;VD=5;AF=0.012;BIAS=2:2;REFBIAS=174:236;VARBIAS=3:2;PMEAN=22.4;PSTD=1;QUAL=32.4;QSTD=1;SBF=0.65487;ODDRATIO=2.03099295245446;MQ=42;SN=4;HIAF=0.0097;ADJAF=0;SHIFT3=1;MSI=3;MSILEN=1;NM=1.6;HICNT=4;HICOV=413;LSEQ=GCTGGTTTCCCTAAAAATGA;RSEQ=AGAACATCTATTTATAATAT;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:417:5:410,5:0.012:174,236:3,2
chr17 61861400 . TA T 76 f0.02 SAMPLE=001;TYPE=Deletion;DP=396;END=61861401;VD=4;AF=0.0101;BIAS=2:2;REFBIAS=227:164;VARBIAS=2:2;PMEAN=71.5;PSTD=1;QUAL=38;QSTD=0;SBF=1;ODDRATIO=1.38297;MQ=42;SN=8;HIAF=0.0102;ADJAF=0;SHIFT3=1;MSI=2;MSILEN=1;NM=0;HICNT=4;HICOV=393;LSEQ=TCAATGTACTTTATGGGTCA;RSEQ=AGTATCTATATCTTAATAAA;DUPRATE=0;SPLITREAD=0;SPANPAIR=0 GT:DP:VD:AD:AF:RD:ALD 0/0:396:4:391,4:0.0101:227,164:2,2
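
To compare the reported SV sizes with the minimum SV size passed via -L, the SV calls can be pulled out of the VCF with a few lines of Python. This is a minimal sketch, not part of VarDict: `parse_info` and `sv_records` are hypothetical helpers, the input is assumed to be tab-separated as in a real VCF file, and the example record is an abbreviated copy of the DUP call above with `<DUP>` assumed as the ALT value.

```python
def parse_info(info_field):
    """Turn 'A=1;B=2;FLAG' into a dict; bare flags map to True."""
    parsed = {}
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def sv_records(lines, min_sv_len):
    """Return SV calls with their reported length and a below-threshold flag."""
    calls = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        info = parse_info(fields[7])
        if "SVTYPE" not in info:
            continue  # ordinary SNV or indel record
        pos, end = int(fields[1]), int(info["END"])
        svlen = abs(int(info["SVLEN"]))
        calls.append({
            "chrom": fields[0],
            "pos": pos,
            "end": end,
            "svtype": info["SVTYPE"],
            "svlen": svlen,
            "span": end - pos,  # length implied by POS and END
            "below_min": svlen < min_sv_len,
        })
    return calls


# Abbreviated DUP record from the post; ALT "<DUP>" is an assumption.
example = "\t".join([
    "chr17", "61780163", ".", "T", "<DUP>", "264", "PASS",
    "TYPE=DUP;END=61780529;SVTYPE=DUP;SVLEN=366;SPLITREAD=15;SPANPAIR=115",
])
calls = sv_records([example], min_sv_len=1000)
for call in calls:
    print(call)
```

For a whole file, the same function can be fed an open file handle, for example `sv_records(open("sample.vcf"), min_sv_len=1000)`, where `sample.vcf` is a placeholder name.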