genome-nexus / genome-nexus-vep

Java Spring Boot wrapper around VEP
https://www.genomenexus.org
MIT License

Add an endpoint to genome-nexus-vep compatible with the Ensembl `/vep/human/hgvs/VARIANT` call #11

Open ruslan-forostianov opened 1 year ago

ruslan-forostianov commented 1 year ago

genome-nexus-vep supports retrieving VEP annotations for a list of variants in Ensembl region format only. At the moment there is no endpoint that accepts a list of variants in HGVS format, even though the vep command used under the hood supports several input formats, including HGVS. See the --format option here. And here is more information on exactly what the HGVS file format should look like.
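To make the --format point concrete, here is a minimal sketch of feeding HGVS notations to the vep command line. The file name variants_hgvs.txt and the helper run_vep_hgvs are hypothetical, and the options genome-nexus-vep really passes to vep are not shown in this ticket, so treat the flag set as an assumption.

```shell
# One HGVS notation per line; this variant is the one used in the curl call
# later in this thread.
printf '%s\n' '17:g.7578388C>G' > variants_hgvs.txt

# Hypothetical helper, not taken from genome-nexus-vep.
# --database (query the public Ensembl database) is an assumption made to keep
# the sketch short; a cache-based setup would use different options.
# Assembly selection (the ticket uses GRCh37 coordinates) is deliberately left out.
run_vep_hgvs() {
  vep --input_file "$1" --format hgvs --json --output_file STDOUT --database
}

# run_vep_hgvs variants_hgvs.txt   # uncomment where vep is installed
```

The only change compared with a region-format run would be the input file contents and the value given to --format.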

How the list of variants is passed, and in what format, is the biggest difference between the /vep/human/region call of genome-nexus-vep and the grch37.rest.ensembl.org/vep/human/hgvs/VARIANT call of the Ensembl API. See the comments on this ticket for more details on the differences.

Because of this slight difference, genome-nexus needs two separate properties, one per endpoint: one for the external Ensembl API call and another for the genome-nexus-vep endpoint.

It would be great to add an Ensembl API-compatible /vep/human/hgvs endpoint to genome-nexus-vep. With compatible calls, we could choose between the external API and the local version simply by pointing the vep.url property at the respective URL.
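As a rough illustration of that switch, the sketch below builds the request URL from a single base URL. The function name vep_hgvs_url is hypothetical, and the local /vep/human/hgvs endpoint is the proposal of this ticket, not something genome-nexus-vep offers today.

```shell
# Hypothetical sketch: one base URL decides between Ensembl and a local
# genome-nexus-vep. The local /vep/human/hgvs endpoint does not exist yet.
vep_hgvs_url() {
  # Strip one trailing slash from the base URL, then append the Ensembl-style path.
  printf '%s/vep/human/hgvs/VARIANT?content-type=application/json\n' "${1%/}"
}

# External Ensembl API (host taken from this ticket):
vep_hgvs_url 'http://grch37.rest.ensembl.org'
# Local instance (host and port assumed to match a local genome-nexus-vep setup):
vep_hgvs_url 'http://localhost:8081/'
```

With compatible paths on both sides, only the host part would differ between the two configurations.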

Despite the comment, gn_vep.region.url is not respected everywhere in the application. Some features still make external calls to vep.url to get the data even when gn_vep.region.url is set.

Actually, after the proposed change there might be no need for the gn_vep.region.url property anymore. Much of the code that is specific to the region endpoint in genome-nexus and genome-nexus-vep could simply be removed (RegionVariantAnnotationService, CachedVariantRegionAnnotationFetcher, VEPRegionDataFetcher, ...).

ruslan-forostianov commented 1 year ago

Below are two calls that produce essentially identical output:

curl \
 --request POST \
 --header "Content-Type: application/json" \
 --data '{"hgvs_notations": ["17:g.7578388C>G"]}' \
 'http://grch37.rest.ensembl.org/vep/human/hgvs/VARIANT?content-type=application/json&xref_refseq=1&ccds=1&canonical=1&domains=1&hgvs=1&numbers=1&protein=1' | jq --sort-keys > curl_external_results_sorted.json

gist of curl_external_results_sorted.json

curl \
 --request POST \
 --header "Content-Type: application/json" \
 --data '["17:7578388-7578388:1/G"]' \
 'http://localhost:8081/vep/human/region' | jq --sort-keys > curl_internal_results_sorted.json

NOTE: genome-nexus-vep must be configured and running locally for the above command to work.

gist of curl_internal_results_sorted.json
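For a simple substitution like the one above, the two request bodies describe the same variant in different notations. A minimal sketch of the mapping follows, assuming a hypothetical helper named hgvs_to_region that handles only genomic single-nucleotide substitutions on the forward strand. It is not code from genome-nexus or genome-nexus-vep.

```shell
# Hypothetical helper (not part of genome-nexus or genome-nexus-vep).
# Converts a genomic HGVS substitution such as 17:g.7578388C>G into the
# region notation used by /vep/human/region: CHR:START-END:STRAND/ALT.
# Only single-nucleotide substitutions are handled; other input prints nothing.
hgvs_to_region() {
  printf '%s\n' "$1" |
    sed -n -E 's#^([^:]+):g\.([0-9]+)[ACGT]>([ACGT])$#\1:\2-\2:1/\3#p'
}

hgvs_to_region '17:g.7578388C>G'
```

For the variant in this comment the helper prints 17:7578388-7578388:1/G, which is the body sent to the region endpoint above. Insertions, deletions, and transcript-level notations would need more than this pattern.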