dkoboldt / varscan

Variant calling and somatic mutation/CNV detection for next-generation sequencing data

Missing filters in VarScan fpfilter output VCF header #48

Open Stikus opened 5 years ago

Stikus commented 5 years ago

Hello, these are the VCF header lines from FpFilter.java in VarScan.v2.4.4.source.jar, which appears to be the fpfilter source:

String vcfHeaderInfo = "";
vcfHeaderInfo = "##FILTER=<ID=VarCount,Description=\"Fewer than " + minVarCount + " variant-supporting reads\">";
vcfHeaderInfo += "\n##FILTER=<ID=VarFreq,Description=\"Variant allele frequency below " + minVarFreq + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=VarReadPos,Description=\"Relative average read position < " + minVarReadPos + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=VarDist3,Description=\"Average distance to effective 3' end < " + minVarDist3 + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=VarMMQS,Description=\"Average mismatch quality sum for variant reads > " + maxVarMMQS + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=VarMapQual,Description=\"Average mapping quality of variant reads < " + minVarMapQual + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=VarBaseQual,Description=\"Average base quality of variant reads < " + minVarBaseQual + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=Strand,Description=\"Strand representation of variant reads < " + minStrandedness + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=RefMapQual,Description=\"Average mapping quality of reference reads < " + minRefMapQual + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=RefBaseQual,Description=\"Average base quality of reference reads < " + minRefBaseQual + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=MMQSdiff,Description=\"Mismatch quality sum difference (ref - var) > " + maxMMQSdiff + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=MapQualDiff,Description=\"Mapping quality difference (ref - var) > " + maxMapQualDiff + "\">";
vcfHeaderInfo += "\n##FILTER=<ID=ReadLenDiff,Description=\"Average supporting read length difference (ref - var) > " + maxReadLenDiff + "\">";

But these are the filter names the code actually assigns:

failReason += "RefReadPos";
failReason += "RefDist3";
failReason += "RefMapQual";
failReason += "RefMMQS";
failReason += "RefAvgRL";
failReason += "SomaticP";
failReason += "VarCount";
failReason += "VarFreq";
failReason += "VarReadPos";
failReason += "VarDist3";
failReason += "VarMMQS";
failReason += "VarMapQual";
failReason += "RefBaseQual";
failReason += "VarBaseQual";
failReason += "VarAvgRL";
failReason += "Strand";
failReason += "MaxBAQdiff";
failReason += "MMQSdiff";
failReason += "MinMMQSdiff";
failReason += "MapQualDiff";
failReason += "ReadLenDiff";
failReason = "NoReadCounts";

It looks like RefReadPos, RefDist3, RefMMQS, RefAvgRL, SomaticP, VarAvgRL, MaxBAQdiff, MinMMQSdiff and NoReadCounts are missing from the header. Some VCF-processing programs require every filter to be listed in the VCF header. Using commas as separators, as described here, is a problem too: the VCF specification expects multiple filter codes to be separated by semicolons, so it would be better to switch to semicolons.
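The comparison above can be reproduced with a small sketch. It is not VarScan code: the class name `FilterHeaderCheck` and its methods are hypothetical, the two lists are copied from the snippets quoted in this issue, and the placeholder description text is an assumption (the real descriptions would need the thresholds from FpFilter.java).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of VarScan.
class FilterHeaderCheck {
    // FILTER IDs declared in the vcfHeaderInfo string quoted in this issue.
    static final List<String> DECLARED = Arrays.asList(
        "VarCount", "VarFreq", "VarReadPos", "VarDist3", "VarMMQS", "VarMapQual",
        "VarBaseQual", "Strand", "RefMapQual", "RefBaseQual", "MMQSdiff",
        "MapQualDiff", "ReadLenDiff");

    // Names assigned to failReason in the lines quoted in this issue.
    static final List<String> USED = Arrays.asList(
        "RefReadPos", "RefDist3", "RefMapQual", "RefMMQS", "RefAvgRL", "SomaticP",
        "VarCount", "VarFreq", "VarReadPos", "VarDist3", "VarMMQS", "VarMapQual",
        "RefBaseQual", "VarBaseQual", "VarAvgRL", "Strand", "MaxBAQdiff", "MMQSdiff",
        "MinMMQSdiff", "MapQualDiff", "ReadLenDiff", "NoReadCounts");

    // Filters that are used but never declared, in order of first use.
    static List<String> missingFilters(List<String> declared, List<String> used) {
        Set<String> missing = new LinkedHashSet<>(used);
        missing.removeAll(declared);
        return new ArrayList<>(missing);
    }

    // Formats one ##FILTER meta-information line.
    static String headerLine(String id, String description) {
        return "##FILTER=<ID=" + id + ",Description=\"" + description + "\">";
    }

    // Joins several failed filters with semicolons, as the VCF FILTER column expects.
    static String joinFilters(List<String> failed) {
        return String.join(";", failed);
    }

    public static void main(String[] args) {
        for (String id : missingFilters(DECLARED, USED)) {
            // Placeholder text only; the real description belongs to FpFilter.java.
            System.out.println(headerLine(id, "Placeholder; see FpFilter.java for the threshold"));
        }
        System.out.println(joinFilters(Arrays.asList("VarCount", "Strand")));
    }
}
```

Running `main` prints one placeholder `##FILTER` line for each of the nine undeclared names, followed by `VarCount;Strand`.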

myourshaw commented 2 years ago

This is still a problem: Picard MergeVcfs will fail. A workaround is to use bcftools reheader to add contigs, followed by Picard FixVcfHeader to add dummy FILTER and INFO headers. Not pretty, but it works.

Create vcf.idx:

docker run -v $(pwd):/sandbox myourshaw/gatk:4.2.4.1 gatk IndexFeatureFile \
  -I /sandbox/MG20-1976@08182020JH_ST.b37.map.dedup.sample.varscan2.fpfilter.pass.vcf

Add contigs:

docker run -v $(pwd):/sandbox myourshaw/tools:latest bcftools reheader \
  --fai /sandbox/b37/references/human_g1k_v37_decoy_GOAL+viral.fasta.fai \
  --output /sandbox/MG20-1976@08182020JH_ST.b37.map.dedup.sample.varscan2.fpfilter.pass.vcf.reheader.vcf \
  /sandbox/MG20-1976@08182020JH_ST.b37.map.dedup.sample.varscan2.fpfilter.pass.vcf

Fix the header by adding all possible FILTERs:

docker run -v $(pwd):/sandbox myourshaw/picard:2.25.7 java -Xmx20g -jar /usr/picard/picard.jar \
  FixVcfHeader \
  -I /sandbox/MG20-1976@08182020JH_ST.b37.map.dedup.sample.varscan2.fpfilter.pass.vcf.reheader.vcf \
  -O /sandbox/MG20-1976@08182020JH_ST.b37.map.dedup.sample.varscan2.fpfilter.pass.reheader.fixed.vcf
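For readers who want to see what "adding dummy FILTER headers" amounts to, here is a minimal sketch of the idea in Java. It is not how Picard implements FixVcfHeader, and it only handles FILTER lines, not INFO. The class name `VcfFilterHeaderPatch`, the placeholder description, and the toy records are all assumptions for illustration. It splits FILTER values on semicolons only, so a comma-joined value would be treated as a single code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Picard or VarScan.
class VcfFilterHeaderPatch {
    private static final String FILTER_PREFIX = "##FILTER=<ID=";

    // Returns the VCF lines with placeholder ##FILTER lines inserted before #CHROM
    // for every FILTER code that appears in a record but is not declared.
    static List<String> addMissingFilterHeaders(List<String> lines) {
        Set<String> declared = new LinkedHashSet<>();
        Set<String> used = new LinkedHashSet<>();
        for (String line : lines) {
            if (line.startsWith(FILTER_PREFIX)) {
                // The ID ends at the first comma, or at '>' if there are no other keys.
                String rest = line.substring(FILTER_PREFIX.length());
                int end = rest.indexOf(',');
                if (end < 0) end = rest.indexOf('>');
                declared.add(end < 0 ? rest : rest.substring(0, end));
            } else if (!line.startsWith("#") && !line.isEmpty()) {
                String[] cols = line.split("\t");
                if (cols.length > 6) {
                    // Column 7 is FILTER; PASS and "." need no declaration here.
                    for (String f : cols[6].split(";")) {
                        if (!f.isEmpty() && !f.equals("PASS") && !f.equals(".")) used.add(f);
                    }
                }
            }
        }
        used.removeAll(declared);

        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("#CHROM")) {
                for (String id : used) {
                    out.add(FILTER_PREFIX + id + ",Description=\"Placeholder added for undeclared filter\">");
                }
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy input; the values are made up for illustration.
        List<String> vcf = Arrays.asList(
            "##fileformat=VCFv4.1",
            "##FILTER=<ID=VarCount,Description=\"Fewer than 4 variant-supporting reads\">",
            "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
            "1\t100\t.\tA\tG\t.\tRefReadPos\t.",
            "1\t200\t.\tC\tT\t.\tPASS\t.");
        addMissingFilterHeaders(vcf).forEach(System.out::println);
    }
}
```

On the toy input, a placeholder line for `RefReadPos` is inserted just before the `#CHROM` line and all other lines pass through unchanged.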