Currently, alt alleles are only included in the output if they are called in at least one genotype. This will cause issues with future use of the GP or GL FORMAT fields, which need an entry for every possible genotype over the listed alleles.
A better approach is to include every alt allele that exceeds a specified quality score. These scores could be calculated across all sample calls or posterior distributions (?) and reported as an INFO field of size A (one value per alt allele). An argument such as --alt-quality could be used to record the combined evidence for each allele. This approach would also limit the size of the GP and GL fields.
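As a rough sketch of what such a score could look like: the example below assumes the quality is the phred-scaled probability that an allele is absent from every sample, computed from per-sample posterior probabilities and treating samples as independent. That definition, the function names, the input layout, and the ALTQ INFO key are all hypothetical and only meant to illustrate the idea.

```python
import math


def alt_allele_qualities(sample_allele_probs, alt_alleles, max_quality=100.0):
    """Phred-scaled evidence that each alt allele occurs in at least one sample.

    sample_allele_probs is a hypothetical input: one dict per sample, mapping
    an allele to the posterior probability that the sample carries it.
    """
    qualities = []
    for allele in alt_alleles:
        # Probability that no sample carries the allele
        # (assumption: samples are treated as independent).
        p_absent = 1.0
        for probs in sample_allele_probs:
            p_absent *= 1.0 - probs.get(allele, 0.0)
        if p_absent <= 0.0:
            qualities.append(max_quality)
        else:
            phred = -10.0 * math.log10(p_absent)
            qualities.append(min(max_quality, max(0.0, phred)))
    return qualities


def select_alt_alleles(alt_alleles, qualities, threshold):
    """Keep alt alleles whose quality meets the threshold, with their scores."""
    kept = [(a, q) for a, q in zip(alt_alleles, qualities) if q >= threshold]
    return [a for a, _ in kept], [q for _, q in kept]


# Two samples, two candidate alt alleles (made-up values).
samples = [{"ACGT": 0.9, "ACGA": 0.01}, {"ACGT": 0.9}]
alts = ["ACGT", "ACGA"]
quals = alt_allele_qualities(samples, alts)
kept, kept_quals = select_alt_alleles(alts, quals, threshold=10.0)

# One value per retained alt allele, as a size-A INFO field would require.
info = "ALTQ=" + ",".join(f"{q:.1f}" for q in kept_quals)
print(kept, info)
```

Only alleles that pass the threshold contribute genotypes to GP and GL, which is what keeps those fields small.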
The GP field may still sum to less than 1 under this approach, but the VCF standard doesn't appear to require that the posterior distribution be complete, only that a value is recorded for each possible genotype given the called alleles.
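To illustrate why GP can fall short of 1, here is a minimal sketch that restricts a full genotype posterior to the retained alleles without renormalising. The genotype ordering follows the VCF convention for Number=G fields; the function names and the dict-based input are assumptions made for the example.

```python
from itertools import combinations_with_replacement


def vcf_genotype_order(n_alleles, ploidy):
    """Genotypes (sorted allele-index tuples) in VCF order for Number=G fields."""
    genotypes = combinations_with_replacement(range(n_alleles), ploidy)
    # VCF orders genotypes by their last allele first, then the one before, etc.
    return sorted(genotypes, key=lambda g: g[::-1])


def restrict_gp(genotype_posteriors, kept_alleles, ploidy):
    """GP values over the kept alleles only, without renormalising.

    genotype_posteriors: hypothetical input, a dict mapping sorted tuples of
    original allele indices to posterior probability.
    kept_alleles: original allele indices to report, reference (0) first.
    """
    gp = []
    for genotype in vcf_genotype_order(len(kept_alleles), ploidy):
        # Map the re-indexed genotype back to the original allele indices.
        original = tuple(sorted(kept_alleles[i] for i in genotype))
        gp.append(genotype_posteriors.get(original, 0.0))
    return gp


# Diploid posterior over three alleles; allele 2 fails the quality filter.
posteriors = {(0, 0): 0.5, (0, 1): 0.3, (1, 1): 0.1, (0, 2): 0.1}
gp = restrict_gp(posteriors, kept_alleles=[0, 1], ploidy=2)
print(gp, sum(gp))
```

The probability mass assigned to genotypes containing the dropped allele is simply not reported, so the remaining values sum to about 0.9 here rather than 1.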
The --call-filtered flag could then be removed. The issue with --call-filtered currently is that calling alleles for samples with 0 read coverage produces alleles that are a random sample from the prior distribution. This makes a mess of the output VCF.
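The zero-coverage behaviour can be seen in a toy Bayesian update. This sketch assumes the posterior over genotypes is proportional to the prior multiplied by per-read likelihoods; the function and the numbers are illustrative only and are not taken from the actual implementation.

```python
def genotype_posterior(prior, read_likelihoods):
    """Posterior over genotypes given a prior and per-read likelihoods.

    prior: list of prior probabilities, one per genotype.
    read_likelihoods: list with one entry per read; each entry is a list of
    likelihoods of that read under each genotype (hypothetical layout).
    """
    unnormalised = list(prior)
    for read in read_likelihoods:
        unnormalised = [u * l for u, l in zip(unnormalised, read)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


prior = [0.25, 0.5, 0.25]

# One informative read shifts the posterior away from the prior.
with_read = genotype_posterior(prior, [[0.9, 0.5, 0.1]])

# With no reads there is nothing to update on: the posterior is the prior,
# so any genotype drawn from it is effectively a random draw from the prior.
no_reads = genotype_posterior(prior, [])
print(with_read, no_reads)
```

Any alleles called for such a sample carry no information from the data, which is why they clutter the output VCF.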