Implement variant consequence features from VEP into L2G

As a developer, I want to calculate functional consequence scores from VEP annotations for each variant in a credible set, because these scores indicate how damaging the variants are, which is important for prioritising genes.

Background

We have functional consequence data from VEP in the variant index object. The goal for L2G is to assess the functional impact of these variants and use the resulting scores to prioritise genes at GWAS loci.

We have assigned scores to these consequences to quantify how damaging each variant may be, and now need to implement new features that incorporate these scores into our set of functional genomics features. As described in https://github.com/opentargets/issues/issues/3552, we only want to consider protein-coding genes when calculating the neighbourhood features.

A precise definition of the features is available in the feature specification document and has been agreed with @addramir.

Tasks

[x] Write a method in the VariantIndex class that extracts VEP information
[x] New features to add:
    - vepMaximum: Max VEP score per gene across all variants in a given credible set
    - vepMaximumNeighbourhood: Max VEP score across all variants relative to the mean VEP score across all genes in the vicinity
    - vepMean: Mean VEP score per gene weighted by posterior probabilities across the credible set
    - vepMeanNeighbourhood: Mean VEP score across all variants relative to the mean VEP score across all genes in the vicinity
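
For illustration, the four features listed above could be sketched in plain Python as follows. This is not the implementation in the codebase. The function name `vep_features`, the input row layout and the `protein_coding` argument are hypothetical. Two readings are assumptions rather than part of the agreed definition: the weighted mean is normalised by the sum of posterior probabilities, and "relative to" is taken as a ratio against the mean over protein-coding genes in the vicinity.

```python
from collections import defaultdict
from statistics import mean


def vep_features(rows, protein_coding):
    """Compute per-gene VEP features for one credible set.

    rows: iterable of (variant_id, gene_id, vep_score, posterior_probability)
          (hypothetical layout, one row per variant/gene pair).
    protein_coding: set of gene IDs treated as protein coding.
    """
    # Collect (score, posterior probability) pairs per gene.
    per_gene = defaultdict(list)
    for _variant_id, gene_id, score, pp in rows:
        per_gene[gene_id].append((score, pp))

    features = {}
    for gene_id, pairs in per_gene.items():
        total_pp = sum(pp for _, pp in pairs)
        features[gene_id] = {
            "vepMaximum": max(score for score, _ in pairs),
            # Assumption: weighted mean normalised by the total posterior probability.
            "vepMean": (
                sum(score * pp for score, pp in pairs) / total_pp if total_pp else 0.0
            ),
        }

    # Neighbourhood baseline: mean of each feature over protein-coding genes only.
    coding = [g for g in features if g in protein_coding]
    for name in ("vepMaximum", "vepMean"):
        baseline = mean(features[g][name] for g in coding) if coding else 0.0
        for gene_id in features:
            # Assumption: "relative to" is a ratio; 0.0 when the baseline is zero.
            features[gene_id][name + "Neighbourhood"] = (
                features[gene_id][name] / baseline if baseline else 0.0
            )
    return features
```

Calling `vep_features(rows, protein_coding={"G1", "G2"})` returns a dictionary keyed by gene ID, with the four feature values for each gene. In this sketch a gene outside `protein_coding` still receives neighbourhood values, but it does not contribute to the baseline.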

Implement variant consequence features from VEP into L2G #3554
