AstraZeneca-NGS / VarDictJava

VarDict Java port
MIT License

Problem upgrading from Perl Vardict to Java Vardict #348

Open Graeme-Smith opened 2 years ago

Graeme-Smith commented 2 years ago

Hi.
I'm trying to upgrade a bioinformatics pipeline that uses a very old Perl version of VarDict from 2017. (This predates versioned releases of the tool; we're using the code at commit 328e00a.) I'd like to upgrade to the latest VarDictJava release, v1.8.3. My problem is that when I compare the two versions on a standard benchmark (Horizon HD200 FFPE control), the Java version misses several variants that the older tool detects. I suspect I'm failing to set a command-line parameter that the newer version requires. Any pointers to the likely cause would be appreciated.

I'm running the same command for both versions:

# Command run for perl version:
git clone https://github.com/AstraZeneca-NGS/VarDict.git
git -C VarDict reset --hard 328e00a116 # revert all files to the production version currently in use
perl_328e00a116/vardict -G genome.fa -f 0.01 -b ONCAA_02_EK6037_SWIFT57_Pan2684_S2_R1_001.refined.primerclipped.bam -c 1 -S 2 -E 3 -g 4 -q 22.5 -Q 10  Pan4081_flat.bed | tee perl_328e00a116_h200_results.txt | teststrandbias.R | var2vcf_valid.pl > perl_328e00a116_h200_results.vcf

# Command run for Java version:
java_v1.8.2/VarDictJava/build/install/VarDict/bin/VarDict -th -G genome.fa -b ONCAA_02_EK6037_SWIFT57_Pan2684_S2_R1_001.refined.primerclipped.bam -f 0.01 -c 1 -S 2 -E 3 -g 4 -q 22.5 -Q 10 Pan4081_flat.bed | tee java_v1.8.2_results.txt | teststrandbias.R | var2vcf_valid.pl > java_v1.8.2_results.vcf

PolinaBevad commented 2 years ago

Hi @Graeme-Smith,

Over these years we have changed some of the realignment methods, improved the merging of SNVs and complex variants, and changed how reads are filtered, so some differences are expected. I see that most of the variants you show are SNVs. Could they have been merged into complex variants in the new output? Can you recheck this region in IGV?
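
Before opening IGV, a quick script can shortlist which of the missing SNVs are worth inspecting. The following is only a rough sketch, not part of VarDict: the function names are hypothetical, the demo records are made up for illustration, and it assumes plain tab-separated VCF text. It reports SNVs that appear only in the old call set and flags those whose position falls inside the REF span of a multi-base record in the new call set. An overlap is only a hint that the SNV may have been merged; it does not check that the alternate bases agree.

```python
from collections import defaultdict


def parse_vcf_lines(lines):
    """Yield (chrom, pos, ref, alt) for every ALT allele of every VCF record."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            yield chrom, pos, ref, alt


def classify_missing_snvs(old_lines, new_lines):
    """Split SNVs found only in the old call set into two lists.

    absorbed: the SNV position lies inside the REF span of a multi-base
              record (MNV, complex variant, deletion) in the new call set.
    absent:   no record in the new call set covers the position.
    """
    new_exact = set()
    spans = defaultdict(list)  # chrom -> [(start, end, ref, alt), ...]
    for chrom, pos, ref, alt in parse_vcf_lines(new_lines):
        new_exact.add((chrom, pos, ref, alt))
        if len(ref) > 1:
            spans[chrom].append((pos, pos + len(ref) - 1, ref, alt))

    absorbed, absent = [], []
    for chrom, pos, ref, alt in parse_vcf_lines(old_lines):
        if len(ref) != 1 or len(alt) != 1:
            continue  # only SNVs are of interest here
        if (chrom, pos, ref, alt) in new_exact:
            continue  # called identically by both versions
        hits = [s for s in spans.get(chrom, []) if s[0] <= pos <= s[1]]
        (absorbed if hits else absent).append(((chrom, pos, ref, alt), hits))
    return absorbed, absent


# Made-up demo records (not real HD200 calls): two adjacent SNVs in the old
# output that the new output reports as a single two-base variant.
old_demo = [
    "chr7\t100\t.\tA\tG\t.\tPASS\t.\n",
    "chr7\t101\t.\tC\tT\t.\tPASS\t.\n",
    "chr12\t500\t.\tG\tA\t.\tPASS\t.\n",
]
new_demo = ["chr7\t100\t.\tAC\tGT\t.\tPASS\t.\n"]

absorbed, absent = classify_missing_snvs(old_demo, new_demo)
for variant, hits in absorbed:
    print("possibly merged:", variant, "covered by", hits)
for variant, _ in absent:
    print("not found:", variant)
```

To run this on real data, open the two VCF files as text and pass the file objects in place of the demo lists, for example perl_328e00a116_h200_results.vcf as the old call set and java_v1.8.2_results.vcf as the new one. Anything reported as "not found" is more likely to reflect the read filtering or realignment changes, and is the best candidate for a closer look in IGV.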