There is an issue with this conversion.
-Right now, genotype fields are added to the table during the GATK step. If variants come from more than one caller, the VCF is a multi-sample one and the number of fields is different, which breaks the downstream parsing.
-Ideally we would use the attached script in the 'combine_variants' task to calculate a normalized VAF for all SNVs/indels and return a single-sample VCF (one set of genotype fields).
-I took this out because it wouldn't work with a CRAM file for some reason, even though pysam should work with CRAMs.
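
For reference, a rough sketch of what the single-sample output could look like. This is not the attached script: the function names, the `GT:AD:DP:VAF` layout, and taking GT from the first sample column are all assumptions for illustration, and the read counts are passed in directly rather than pulled from the CRAM.

```python
def vaf(ref_count, alt_count):
    """Variant allele fraction: alt reads over total reads (0.0 if no coverage)."""
    depth = ref_count + alt_count
    return alt_count / depth if depth else 0.0


def to_single_sample(vcf_line, ref_count, alt_count):
    """Collapse a (possibly multi-sample) VCF data line to one sample column.

    Hypothetical helper: keeps the 8 fixed columns, then writes a single
    FORMAT/sample pair so downstream parsing always sees the same field count.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    fixed = fields[:8]  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO

    # Assumption: reuse GT from the first sample column if present, else "./."
    gt = "./."
    if len(fields) > 9:
        keys = fields[8].split(":")
        values = fields[9].split(":")
        if "GT" in keys and keys.index("GT") < len(values):
            gt = values[keys.index("GT")]

    depth = ref_count + alt_count
    sample = f"{gt}:{ref_count},{alt_count}:{depth}:{vaf(ref_count, alt_count):.4f}"
    return "\t".join(fixed + ["GT:AD:DP:VAF", sample])


if __name__ == "__main__":
    # Two callers -> two sample columns in the input, one in the output.
    line = "chr1\t100\t.\tA\tT\t.\tPASS\t.\tGT:AD\t0/1:10,5\t0/1:12,4"
    print(to_single_sample(line, ref_count=30, alt_count=10))
```

Whatever the number of callers, the output line always has 10 columns, so the table conversion would see a fixed set of genotype fields. On the CRAM side, one thing worth checking (unconfirmed as the cause here) is whether the reference FASTA is being passed, e.g. `pysam.AlignmentFile(path, "rc", reference_filename=...)`.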
We should fix this.
addReadCountsToVcfCRAM.py.zip