Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Alternative allele of matched variants is not reported in custom annotation with overlap method #1664

Open jperales opened 2 months ago

jperales commented 2 months ago

Hi, I noticed that an allele JSON field is reported for matched variants in custom annotation with the exact method. However, it is not reported when the overlap method is used instead. Please see the test case below.

As far as I understand, this allele field would actually be more useful in the latter case. The exact method only returns variants that match both the coordinates and the alternative allele of the input, so the alternative allele can be assumed to be the same as the input variant's. In contrast, the overlap method may return several overlapping variants, one of which may be an exact match to the input. The allele field could then be used to distinguish an exact match from the other overlapping variants, but it is not reported in this mode.

Reporting the allele when the overlap method is used would therefore be a useful feature to add.
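To illustrate the distinction described above, here is a minimal Python sketch. It is not VEP's matching code: the `Variant` type and the `overlaps` and `classify` helpers are hypothetical names, and the overlap test is a simplification that only compares the reference spans. The records are the three gnomAD entries and the A>G query from the test case in this issue.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    pos: int
    ref: str
    alt: str


def overlaps(a: Variant, b: Variant) -> bool:
    """Simplified overlap test: do the reference spans intersect?"""
    a_end = a.pos + len(a.ref) - 1
    b_end = b.pos + len(b.ref) - 1
    return a.pos <= b_end and b.pos <= a_end


def classify(query: Variant, records: list[Variant]) -> list[tuple[Variant, str]]:
    """Label every overlapping record as an 'exact' or plain 'overlap' match."""
    result = []
    for rec in records:
        if not overlaps(query, rec):
            continue
        # Exact means same position, reference allele and alternative allele.
        kind = "exact" if rec == query else "overlap"
        result.append((rec, kind))
    return result


query = Variant(113834946, "A", "G")
records = [
    Variant(113834946, "A", "AGCG"),
    Variant(113834946, "A", "G"),
    Variant(113834946, "A", "T"),
]
matches = classify(query, records)
for rec, kind in matches:
    print(rec.alt, kind)
```

Without the alternative allele of each record, the `rec == query` comparison is not possible, which is why the allele field matters for the overlap method.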

Test case

Input variant (VCF format)

1 113834946 . A G . . .

Input custom annotation (gnomAD v4.0 chr1) for the input variant coordinates

There are 2 SNVs and 1 indel at this position in gnomAD v4.0. The second record (A>G, rs2476601) is an exact match for the variant used in the commands below:

❯ bcftools view -H gnomad.exomes.v4.0.sites.chr1.vcf.bgz chr1:113834946-113834946 | cut -f 1,2,3,4,5,6,7
chr1    113834946       .       A       AGCG    .       AC0
chr1    113834946       rs2476601       A       G       .       PASS
chr1    113834946       rs2476601       A       T       .       AS_VQSR

Custom annotation with exact method

❯ vep --no_stats -id "1 113834946 . A G . . ." -o exact.json --force_overwrite --json --assembly GRCh38 --symbol --numbers --offline --custom gnomad.exomes.v4.0.sites.chr1.vcf.bgz,GnomADe,vcf,exact,0,AF

JSON output: exact.json, relevant node below. Please note that allele is reported.

     "custom_annotations": {
        "GnomADe": [
            {
                "fields": {
                    "FILTER": "PASS",
                    "AF": 0.911187
                },
                "allele": "G",
                "name": "rs2476601"
            }
        ]
    },

It finds one exact match among the 3 candidates and reports the alternative allele of the matched variant, which is the same as the alternative allele of the input.

Custom annotation with overlap method

❯ vep --no_stats -id "1 113834946 . A G . . ." -o overlap.json --force_overwrite --json --assembly GRCh38 --symbol --numbers --offline --custom gnomad.exomes.v4.0.sites.chr1.vcf.bgz,GnomADe,vcf,overlap,0,AF

JSON output: overlap.json, relevant node below. Please note that allele is NOT reported anymore.

    "custom_annotations": {
        "GnomADe": [
            {
                "name": "chr1:113834947-113834947",
                "fields": {
                    "AF": 0,
                    "FILTER": "AC0"
                }
            },
            {
                "name": "rs2476601",
                "fields": {
                    "AF": 0.911187,
                    "FILTER": "PASS"
                }
            },
            {
                "name": "rs2476601",
                "fields": {
                    "AF": 7.0241e-7,
                    "FILTER": "AS_VQSR"
                }
            }
        ]
    },

It finds all 3 overlapping variants from the custom file but does not report their alternative alleles, so it is not possible to tell that one of the overlapping variants is actually an exact match.
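The difference between the two outputs can be checked programmatically. The following Python sketch is only an illustration: the JSON snippets are copied from the two nodes shown in this issue, and `reported_alleles` is a hypothetical helper, not part of VEP.

```python
import json

# Nodes copied from exact.json and overlap.json as shown in this issue.
EXACT = json.loads("""
{"custom_annotations": {"GnomADe": [
    {"fields": {"FILTER": "PASS", "AF": 0.911187},
     "allele": "G", "name": "rs2476601"}
]}}
""")

OVERLAP = json.loads("""
{"custom_annotations": {"GnomADe": [
    {"name": "chr1:113834947-113834947", "fields": {"AF": 0, "FILTER": "AC0"}},
    {"name": "rs2476601", "fields": {"AF": 0.911187, "FILTER": "PASS"}},
    {"name": "rs2476601", "fields": {"AF": 7.0241e-7, "FILTER": "AS_VQSR"}}
]}}
""")


def reported_alleles(node, source="GnomADe"):
    """Return (name, allele) per entry; allele is None when the field is absent."""
    return [
        (entry["name"], entry.get("allele"))
        for entry in node["custom_annotations"][source]
    ]


exact_alleles = reported_alleles(EXACT)
overlap_alleles = reported_alleles(OVERLAP)
print(exact_alleles)
print(overlap_alleles)
```

In the overlap output every entry comes back with `None` for the allele, and two of the entries share the name rs2476601, so neither the name nor the allele can identify the exact match.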

System

Full error message

none

Data files (if applicable)

They include:

Thank you very much for your great work! Kind regards, Javier

likhitha-surapaneni commented 2 months ago

Hi @jperales,

Thank you for using Ensembl VEP and for suggesting enhancements. Currently, allele-specific annotation is retrieved only with the exact annotation type. While there is an option to add additional fields to the custom annotation, this is currently only possible for fields in the INFO column (docs). I will discuss this further with the team and get back to you.

Thanks and regards, Likhitha