This issue is related to (https://github.com/huguesrichard/Allopipe/issues/44) and concerns the special case where all alt alleles of a multiallelic site have the same gnomAD AF (the same should apply when a gnomAD AF is reported for only one of the alt alleles, but I have not tested this). Such variants pass the restriction that only a single AF value be present, and are therefore scored.

The current VEP CSQ parsing strategy aggregates all alt amino acids irrespective of an individual's genotype. At a multiallelic site, a different amino acid may be reported for each alt allele. To illustrate, here is an example record (CHROM, POS, ID, REF, ALT):

chr1 11828561 chr1_11828561_G_C;chr1_11828561_G_A G C,A

The INFO field of this record contains CSQ annotations for both alt alleles. To make this more readable, I restrict the info to the relevant fields. The results below can be reproduced with the commands provided.

An individual with genotype 0/1 carries the G and the C allele, i.e. the amino acids C and S but not Y. However, when this individual is scored against a 0/0 genotype, the AMS is calculated to be 2. In the mismatch table (transposed for readability) we can see that the 0/1 individual is also assigned the amino acid Y, which would only result from the G>A substitution.
CHROM | 1
-- | --
POS | 11828561
ID_x | chr1_11828561_G_C;chr1_11828561_G_A
REF | G
ALT | C,A
QUAL_x | 49
FILTER_x | .
FORMAT_x | ['GT', 'DP', 'AD', 'GQ', 'PL', 'RNC']
GT_x | 0/0
GQ_x | 48
AD_x | 16,0,0
phased_x | 0/0
DP_x | 16
TYPE_x | homozygous
missense_variant_x | 8
transcripts_x | ENST00000312413,ENST00000346436,ENST00000376490,ENST00000376491,ENST00000376492,ENST00000376496,ENST00000400892,ENST00000494028
genes_x | ENSG00000011021
aa_REF | C
aa_ALT | S,Y
gnomADe_AF_x | 3.98E-06
aa_ref_indiv_x | C
aa_alt_indiv_x | 
aa_indiv_x | C
ID_y | chr1_11828561_G_C;chr1_11828561_G_A
QUAL_y | 49
FILTER_y | .
FORMAT_y | ['GT', 'DP', 'AD', 'GQ', 'PL', 'RNC']
GT_y | 0/1
GQ_y | 43
AD_y | 28,39,0
phased_y | 0/1
DP_y | 67
TYPE_y | heterozygous
missense_variant_y | 8
transcripts_y | ENST00000312413,ENST00000346436,ENST00000376490,ENST00000376491,ENST00000376492,ENST00000376496,ENST00000400892,ENST00000494028
genes_y | ENSG00000011021
gnomADe_AF_y | 3.98E-06
aa_ref_indiv_y | C
aa_alt_indiv_y | S,Y
aa_indiv_y | C,S,Y
diff | S,Y
mismatch | 2
mismatch_type | heterozygous
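
To make the expected behaviour concrete, here is a minimal sketch of genotype-aware amino acid selection. It is not Allopipe's actual code: the function name `amino_acids_for_genotype` and its arguments are hypothetical, and it assumes the alt amino acids are available in the same order as the ALT alleles (S for G>C, Y for G>A in this example).

```python
import re


def amino_acids_for_genotype(gt, aa_ref, aa_alts):
    """Return the set of amino acids actually carried by one individual.

    gt      -- VCF GT string, e.g. "0/1" or "1|2"
    aa_ref  -- amino acid encoded by the REF allele
    aa_alts -- amino acids encoded by the ALT alleles, in ALT order
               (assumption: one amino acid per ALT allele)
    """
    # Allele index 0 is REF, index i >= 1 is the i-th ALT allele.
    aa_by_index = [aa_ref, *aa_alts]
    carried = set()
    for allele in re.split(r"[/|]", gt):
        if allele == ".":  # missing call: contributes no amino acid
            continue
        carried.add(aa_by_index[int(allele)])
    return carried


# Example from this issue: REF G -> C, ALT C -> S, ALT A -> Y
aa_x = amino_acids_for_genotype("0/0", "C", ["S", "Y"])
aa_y = amino_acids_for_genotype("0/1", "C", ["S", "Y"])

# Amino acids present in individual y but absent from individual x
diff = aa_y - aa_x
print(sorted(diff), len(diff))
```

With this selection, the 0/1 individual is assigned only C and S, so the difference against the 0/0 individual is S alone and the mismatch count is 1 instead of 2.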