Open bioinfo-dirty-jobs opened 6 years ago
Thanks for all these reports and apologies about the issue. I'm moving all the discussion to this thread to keep it in one place, since these all revolve around an issue with retrieving the reference allele for a variant call. I'm not totally sure what is going on, but it appears as if we're generating an ensemble allele that we can't then find in one of the references. Would you be able to share the problem files (at least in the region chr1:13416) so I could try to replicate and debug what is happening?
Thanks much for the help debugging.
example_chr1.tar.gz: here are the files. Thanks so much for your help.
Thanks for passing along the files. Unfortunately I'm not able to replicate the errors you're seeing, so I'll detail what I did so we can work out how it differs from how you're running it. I had to clean up the varscan and somaticsniper output:
# manually edit varscan input file to add CHROM line (missing) and change AD field to String
# since FORMAT not correct
bcftools annotate -x FORMAT/DP4 somaticniper.vt.clean_small.vcf -O z -o somaticniper.vt.clean_small.nodp4.vcf.gz
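The manual AD edit described in the comments above could be scripted along these lines. This is only a minimal sketch, assuming the varscan header carries a standard `##FORMAT=<ID=AD,...>` entry; the function name `set_format_type` is hypothetical and not part of bcbio-variation-recall or bcftools, and it does not add the missing CHROM line.

```python
import re


def set_format_type(header_line, field_id="AD", new_type="String"):
    """Return the header line with its Type changed, but only if it is
    the ##FORMAT entry for field_id; every other line is returned as-is."""
    if not header_line.startswith(f"##FORMAT=<ID={field_id},"):
        return header_line
    # Replace only the first Type=... token, which precedes Description
    # in the usual ID, Number, Type, Description ordering.
    return re.sub(r"Type=[A-Za-z]+", f"Type={new_type}", header_line, count=1)


def fix_header(lines):
    """Apply the AD type change to an iterable of VCF lines."""
    return [set_format_type(line) for line in lines]
```

Data lines and other header entries pass through untouched, so the function can be mapped over the whole file before the result is written back out.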
Then I ran ensemble calling, which finished successfully:
bcbio-variation-recall ensemble -n 1 --names varscan,mutect2,somaticsniper out.vcf.gz /human/hg19/seq/hg19.fa.gz varscan.somatic.cleand.vcf mutect2.vt.vcf somaticniper.vt.clean_small.nodp4.vcf.gz
Do you have additional steps in your processing that I don't have in mine and which might be causing the issue? Thanks again for the help debugging.
Thanks so much. I realize that I annotated the VCF with snpEff.
If I delete DP4 and use only varscan and somaticsniper, I get this error: