ropensci / rentrez

talk with NCBI entrez using R
https://docs.ropensci.org/rentrez

Work out how to handle changes to rate limiting #117

Closed dwinter closed 6 years ago

dwinter commented 6 years ago

As per #114, users will be allowed to make more than 3 requests per second if they are using an API key. The limit for users with a key will be 10 requests per second, but it will be possible for some users to get even faster connections.

We should check with NCBI about how to handle client-side limiting, but here is a proposal.

dwinter commented 6 years ago

NCBI are OK with this plan :white_check_mark:

sckott commented 6 years ago

@dwinter are you aware of any rate limiting information returned in the headers or body of entrez API requests? I haven't seen any. If there isn't any, that really sucks.

dwinter commented 6 years ago

Hi @sckott, unfortunately there is no info in the headers, and I don't think there is any plan to include it. I gather the requests will just return an error if the user is sending too many too quickly.
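Since an error is the only signal that the limit was hit, one way a caller could cope is to retry after a pause. Here is a minimal sketch of that idea, not something rentrez does; `with_retry`, `max_tries` and `base_delay` are hypothetical names made up for illustration.

```r
# Hypothetical wrapper (not part of rentrez): retry a failing call,
# doubling the pause after each failed attempt.
# Note: it retries on ANY error, not only rate-limit errors.
with_retry <- function(f, max_tries = 3, base_delay = 0.5) {
  force(f)
  stopifnot(max_tries >= 1)
  function(...) {
    for (attempt in seq_len(max_tries)) {
      result <- tryCatch(f(...), error = function(e) e)
      if (!inherits(result, "error")) return(result)
      # Pause before the next attempt: base_delay, 2 * base_delay, ...
      if (attempt < max_tries) Sys.sleep(base_delay * 2^(attempt - 1))
    }
    stop(result)  # re-signal the last error once all attempts are used
  }
}
```

Something like `with_retry(rentrez::entrez_search)` would then behave like the original function but make up to three attempts before giving up.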

At present rentrez just calls Sys.sleep for 1/3 of a second on every request. In the feature branch for this, that changes to 1/10th of a second if ENTREZ_KEY is set as an environment variable.

Maybe not a great solution (and probably slower than it could be if rate-limiting information could be taken from the headers), but it seems like the simplest way to handle this?
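Roughly, the sleep-based approach looks like the sketch below. This is only an illustration of the idea, not the actual rentrez internals; `request_delay` and `throttled` are hypothetical names.

```r
# Hypothetical helper (not the real rentrez code): choose the pause
# based on whether an API key is available in ENTREZ_KEY.
request_delay <- function(key = Sys.getenv("ENTREZ_KEY")) {
  if (nzchar(key)) 1 / 10 else 1 / 3
}

# Wrap any request function so that it sleeps before every call.
throttled <- function(f, delay = request_delay()) {
  force(f)
  force(delay)
  function(...) {
    Sys.sleep(delay)
    f(...)
  }
}
```

The pause is fixed and paid on every call, so a single process can never exceed 3 (or 10) requests per second, at the cost of sometimes waiting longer than strictly needed.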

sckott commented 6 years ago

Bummer. I've already emailed them; hopefully that will lead to something eventually. Right, I think sleeping is what we do when using entrez stuff in other pkgs.

boopsboops commented 5 years ago

Apologies if this isn't the best place to discuss this, but I'm having some problems dealing with these recent changes to NCBI's API. Running entrez_search or entrez_fetch as a single process was glacially slow, so I used mcmapply to distribute searches over cores. This worked very well until now, when all requests are rejected due to the API rate limit. Even with an API key, I am rejected (even when the number of cores is reduced).

I appreciate that this isn't an issue with rentrez, but I'm wondering whether there are any tricks you are aware of to access NCBI data in a more controlled manner?

Cheers

dwinter commented 5 years ago

Hi @boopsboops,

I think the only option is to email the NCBI support desk and explain your use case and how the current rules prevent you from achieving reasonable research goals. I understand they are able to specify custom rates for specific API keys.
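In the meantime, if you stay with mcmapply, keep in mind that each forked worker sleeps independently, so together they can easily exceed the limit. A rough sketch of splitting the budget across workers follows; `per_worker_delay` is a hypothetical helper, `terms` is an assumed character vector of search terms, and the 10 requests per second default is the keyed limit mentioned at the top of this issue.

```r
# Hypothetical helper: pause each worker must take before a request so
# that n_workers together stay at or below `limit` requests per second.
per_worker_delay <- function(n_workers, limit = 10) {
  stopifnot(n_workers >= 1, limit > 0)
  n_workers / limit
}

# Sketch of use with parallel::mcmapply (not run here; needs network
# access and an assumed `terms` vector).
# delay <- per_worker_delay(n_workers = 4)   # 0.4 seconds per worker
# res <- parallel::mcmapply(function(term) {
#   Sys.sleep(delay)
#   rentrez::entrez_search(db = "nucleotide", term = term)
# }, terms, mc.cores = 4, SIMPLIFY = FALSE)
```

This is conservative, since rentrez also sleeps on each request, and it only helps when network latency rather than the limit itself is the bottleneck.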

boopsboops commented 5 years ago

Thanks @dwinter I'll try that.