Closed Khillo81 closed 5 years ago
Hi @Khillo81, thanks for trying out VarDict. Can you try the same test without the -a
argument? Further, what format is your bed file in? Can you paste a few lines here please
@mjafin Thank you very much for your rapid response. Here are a few lines from the BED file:
1 36931667 36931937 CSF3R.exon.17.line.1.chr1.36931697.36932509_tile_1 1000 - 36931693 36931909
1 36931883 36932143 CSF3R.exon.17.line.1.chr1.36931697.36932509_tile_2 1000 + 36931910 36932117
1 36932089 36932361 CSF3R.exon.17.line.1.chr1.36931697.36932509_tile_3 1000 - 36932115 36932334
1 36932307 36932582 CSF3R.exon.17.line.1.chr1.36931697.36932509_tile_4 1000 + 36932334 36932555
1 36932800 36933050 CSF3R.exon.16.line.2.merged_with.CSF3R.exon.15.line.3.merged_with.CSF3R.exon.14.line.4.chr1.36932830.36933563_tile_1 1000 - 36932824 36933023
1 36932998 36933235 CSF3R.exon.16.line.2.merged_with.CSF3R.exon.15.line.3.merged_with.CSF3R.exon.14.line.4.chr1.36932830.36933563_tile_2 1000 + 36933025 36933208
I'll run the command without the -a
and get back to you.
The bed file looks OK to me. Can you try without the -a
argument?
Further, VarDict assigns a locus to a single amplicon. Thus if you have overlapping amplicons, it tries to choose the optimal one and therefore you will only see a fraction of the depth. It shouldn't be that drastically different though. But let us know what you see without specifying -a
.
Is your problem present for most variants or just a handful? Freebayes wouldn't use information about amplicon locations so would happily count any primer sequences into the depth.
@Khillo81, hello!
I think I can close this problem because it is outdated. Please let us know if you need help with VarDict. Thank you!
Hi!
I'm interested in using your variant caller in my pipeline and have already done a test run on targeted re sequencing data using the amplicon-based Illumina TruSight Myeloid Panel. I've compared the results with those of FreeBayes and noticed something strange: for the same variant call at a certain position, FreeBayes reports a much higher depth (DP) as compared to VarDict (e.g. 6000 compared to 40). I've checked the per-base coverage report and FreeBayes is accurate. Also, the QUAL for all bases is in a very narrow range with ~95% of the variants falling between 30 and 37. These same variants have a high quality score with FreeBayes. Are there some filtering steps that are carried out by VarDict that cause me to lose some reads?
My command line call for VarDict:
vardict -G ref.fa -N sample_name -f 0.01 -b sample.bam -c 1 -S 2 -E 3 -g 4 -Q 20 -a -F 0 targets.bed | teststrandbias.R | var2vcf_valid.pl -N sample_name -E -f 0.01
I'm using BWA for mapping and then performing a base quality recalibration and local realignment using GATK before doing the variant calling. The sequencing was done on an Illumina NextSeq in paired-end mode 2x150 bp.
Thanks for your help.