jjfarrell closed this issue 3 years ago
One of your header lines indicates that it expects a Number=3 field, but the VCF entry has only one value.
Thanks for the quick response! I traced the error to the PL field (Number=G) when writing out genotypes.vcf.gz. It is not triggered when writing out variants.vcf.gz. The single value "." is what triggers the error. That is the value bcftools writes when merging samples and the genotype is missing, so I think the header is fine. It looks like the missing value '.' is not handled correctly on output.
GT:FT:GQ:PL:PR:SR:IF 0/0:.:.:.:.:.:0.782063 0/0:.:.:.:.:.:0.779178 0/0:.:.:.:.:.:0.78203
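To make the point about the record above concrete, here is a minimal Python sketch of a count check for a Number=G field that treats a lone "." as a missing value. This is not Paragraph's code; `expected_g_count` and `g_field_ok` are hypothetical names, and the check only illustrates the rule that a Number=G field holds one value per possible genotype unless the whole field is missing.

```python
from math import comb


def expected_g_count(n_alt: int, ploidy: int = 2) -> int:
    """Number of values a Number=G field holds: one per possible genotype."""
    n_alleles = n_alt + 1  # REF plus the ALT alleles
    return comb(n_alleles + ploidy - 1, ploidy)


def g_field_ok(value: str, n_alt: int, ploidy: int = 2) -> bool:
    """Accept a lone '.' (whole field missing) or the full number of values."""
    if value == ".":
        return True
    return len(value.split(",")) == expected_g_count(n_alt, ploidy)
```

For a diploid sample at a site with one ALT allele, `expected_g_count(1)` is 3, which matches the "expects 3 values" wording of the error, while `g_field_ok(".", 1)` still passes.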
```
-rw-r--r-- 1 farrell casa  87K Mar 5 20:23 genotypes.vcf.gz
-rw-r--r-- 1 farrell casa 9.2K Mar 5 20:23 grmpy.log
-rw-r--r-- 1 farrell casa 8.5K Mar 5 20:23 genotypes.json.gz
-rw-r--r-- 1 farrell casa 4.0K Mar 5 20:23 variants.json.gz
-rw-r--r-- 1 farrell casa 113K Mar 5 20:23 variants.vcf.gz
-rw-r--r-- 1 farrell casa  218 Mar 5 20:23 sample.txt
```
Also, since I am trying to run Paragraph on 5k CRAMs with SVs compiled from various callers, it would be nice to have an option so that the candidate VCF does not need the individual sample genotypes in order to run. The SV candidate VCF could then be distributed to other researchers to use for genotyping with Paragraph.
If a candidate VCF is created without the sample genotypes, the error disappears, because there is no GT='.' to copy over to the new VCF. The trade-off is that the new output then lacks OLD_GT and the other fields from the original genotyping.
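For anyone wanting to try the same workaround, a rough Python sketch of stripping the sample columns from a tab-delimited VCF line follows. `drop_samples` is a hypothetical helper, not part of Paragraph; it assumes plain-text, tab-separated lines and simply keeps the eight fixed columns (CHROM through INFO).

```python
def drop_samples(vcf_line: str) -> str:
    """Return the line without FORMAT and per-sample columns."""
    line = vcf_line.rstrip("\n")
    if line.startswith("##"):
        # Meta-information lines are passed through unchanged.
        return line
    fields = line.split("\t")
    # Keep CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO only.
    return "\t".join(fields[:8])
```

Applied to every line of the candidate file, this leaves a sites-only VCF. On the command line, `bcftools view -G` should give the same result, as far as I know.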
That's the default behavior. When the genotyped sample is not in the input vcf, Paragraph will add a new sample column with GT. When the genotyped sample is in the input vcf, Paragraph will output the genotype in GT field and move the original GT field to OLD_GT. Do you still have the missingGT error?
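As an illustration of the behavior described above (not Paragraph's actual implementation), the GT/OLD_GT handling for one sample could be sketched in Python like this. `update_sample` is a hypothetical function, and appending OLD_GT at the end of the FORMAT keys is an assumption made for the example.

```python
def update_sample(format_keys, sample_values, new_gt):
    """Put new_gt in GT; if a GT was already present, keep it as OLD_GT."""
    keys = list(format_keys)
    vals = list(sample_values)
    if "GT" in keys:
        i = keys.index("GT")
        old_gt = vals[i]
        vals[i] = new_gt
        # Assumption: the previous genotype is appended as a trailing field.
        keys.append("OLD_GT")
        vals.append(old_gt)
    else:
        # Sample was not genotyped in the input: add a fresh GT field.
        keys.insert(0, "GT")
        vals.insert(0, new_gt)
    return keys, vals
```

With an input GT of "." the old value is carried along as OLD_GT, which is the case that appears to trigger the error in this thread.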
Hi @jjfarrell ,
Thanks for the tip. I also encountered the same error due to improper handling of "." or "Null" value in my input VCF sample genotype field. After excluding VCF samples from the update step, everything is back to normal.
Thanks, Wei
When running paragraph on a test vcf with just one variant row, this error is triggered. Any suggestions?