Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Allow passing a github token to avoid rate limit errors #1687

Open dvg-p4 opened 1 month ago

dvg-p4 commented 1 month ago

Describe the issue

As mentioned in https://github.com/Ensembl/ensembl-vep/issues/566, INSTALL.pl's update (and check-for-update) procedures use the GitHub REST API, and so they fail once the rate limit is exceeded. Currently, the calls to the GitHub API are unauthenticated, and thus subject to the relatively low limit of 60 requests per hour per IP address. This is particularly a problem on systems such as ours where (as far as I can tell) the corporate network makes queries from thousands of users appear to come from the same IP, leaving the GitHub API perpetually rate-limited.

I suggest that VEP's INSTALL.pl script should allow passing a GitHub personal access token (PAT), and/or read one from the GITHUB_TOKEN environment variable. This increases the rate limit to 5,000 requests per hour and makes it per user rather than per IP, which is effectively unlimited access in all reasonable scenarios.
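To check whether a machine is actually hitting the unauthenticated limit, one option is to look at the `x-ratelimit-remaining` header that GitHub returns with REST API responses. Below is a minimal shell sketch; the function name `ratelimit_remaining` is hypothetical and not part of VEP, and it only parses headers passed on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of INSTALL.pl): read raw HTTP response headers
# on stdin and print the value of GitHub's x-ratelimit-remaining header.
ratelimit_remaining() {
  # Drop carriage returns from the CRLF line endings, then match the header
  # name case-insensitively and print its value.
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-ratelimit-remaining" { print $2; exit }'
}
```

For example, `curl -s -D - -o /dev/null https://api.github.com/repos/Ensembl/ensembl-vep | ratelimit_remaining` dumps the response headers to stdout and extracts the remaining quota. A value of 0 would be consistent with the 403 errors shown in this issue.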

As a manual workaround, I've found that the following works: edit INSTALL.pl and change this line: https://github.com/Ensembl/ensembl-vep/blob/31a3581b84495b617b2f3980da6c6313ca6d238f/INSTALL.pl#L1806 to

     my $response = `curl -s -o $file -w '%{http_code}' --header "Authorization: Bearer <your github PAT>" --location "$url" `; 

Note that you should only use this workaround with a read-only PAT: because the header is hardcoded into the curl call, your PAT is also sent to every other site that this script downloads from.

I presume a more robust fix would do the same with an actual variable instead of a hardcoded PAT, and add the header only when a token is provided and the download is actually from GitHub.
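A rough sketch of that conditional logic, written as a shell function rather than as a patch to INSTALL.pl. The function name `github_auth_args` and the list of hosts are assumptions for illustration, and the host parsing is deliberately simple (it does not handle ports or user info in the URL).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of INSTALL.pl): print the extra curl arguments
# for a URL, one per line. An Authorization header is emitted only when the
# URL's host is a GitHub host AND GITHUB_TOKEN is set and non-empty.
github_auth_args() {
  local url="$1"
  local token="${GITHUB_TOKEN:-}"
  # Strip the scheme, then everything from the first "/" onward, to get the host.
  local host="${url#*://}"
  host="${host%%/*}"
  case "$host" in
    api.github.com|github.com)
      if [ -n "$token" ]; then
        printf '%s\n' "--header" "Authorization: Bearer $token"
      fi
      ;;
  esac
}
```

In bash, the output could be collected with `mapfile -t auth < <(github_auth_args "$url")` and then passed along as `curl -s -o "$file" -w '%{http_code}' "${auth[@]}" --location "$url"`. For any non-GitHub URL, or when no token is set, the function prints nothing and the curl call stays unauthenticated as before.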

Additional information

System

Full VEP command line

./INSTALL.pl

Full error message

curl failed (403), trying to fetch using LWP::Simple
LWP::Simple failed (403), trying to fetch using HTTP::Tiny
ERROR: Failed last resort of using HTTP::Tiny to download https://api.github.com/repos/Ensembl/ensembl-vep

likhitha-surapaneni commented 3 weeks ago

Hi @dvg-p4, thank you for letting us know about this. We will try to look into this in the upcoming releases.