gbif / gbif-api

GBIF API

GBIF API requests: any rate limits? #128

Closed. abubelinha closed this issue 2 months ago

abubelinha commented 2 months ago

Should API users somehow limit the number of requests per minute against the GBIF API? I couldn't find any suggested limits in the API docs.

The only comment I know of about this was "there are no limits": https://lists.gbif.org/pipermail/api-users/2017-December/000495.html

But when using Python requests (I have not tried pygbif yet) I am hitting some kind of ... limit?

Should I check for the presence of some JSON key that reports this? (Maybe the requests before my error returned a warning which I am missing.)

Basically, I am trying to fuzzy-match a sorted (A-Z) list of ~3000 names against the backbone (very much what the name matching tool does). But my script always stops at some point, and not always the same point: I tried 4 runs and my loop never reached the end, even though in the 4th run my requests were already limited to one per second. These are the errors from the four runs (a simplified sketch of the loop follows the list):

  1. urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.gbif.org', port=443): Max retries exceeded with url: /v1/species/match?name=Polypogon%20maritimus%20maritimus&kingdom=Plantae
  2. requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.gbif.org', port=443): Max retries exceeded with url: /v1/species/match?name=Galium%20aparine%20spurium&kingdom=Plantae
  3. requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.gbif.org', port=443): Max retries exceeded with url: /v1/species/match?name=&kingdom=Plantae&usageKey=2965434
  4. requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.gbif.org', port=443): Max retries exceeded with url: /v1/species/match?name=Filago%20pyramidata&kingdom=Plantae
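
For context, here is a simplified sketch of that kind of loop (not my exact script; the file name and the one-second delay are just illustrative):

```python
import time
import requests

MATCH_URL = "https://api.gbif.org/v1/species/match"

# Illustrative: one scientific name per line, sorted A-Z.
with open("names.txt") as fh:
    names = sorted(line.strip() for line in fh if line.strip())

for name in names:
    resp = requests.get(
        MATCH_URL,
        params={"name": name, "kingdom": "Plantae"},
        timeout=30,
    )
    resp.raise_for_status()
    match = resp.json()
    print(name, match.get("matchType"), match.get("usageKey"))
    time.sleep(1)  # the one-request-per-second throttle used in the 4th run
```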

This is most probably due to some temporary "system health / network issue" on GBIF's side, but my question still stands: are there really no documented usage limits?

Thanks in advance @abubelinha

MattBlissett commented 2 months ago

There are no request limits, except for the download API.

There is currently a problem with our network provider's firewall, and there are interruptions lasting from about 1 to 300 seconds where all requests are failing. Today (since midnight) there have been seven one-second outages, but nothing longer than that.

Unfortunately, we don't know when this will be fixed — it's not something we can control.

I recommend you run your script without any added delays. You could also increase the max-retries value.

MattBlissett commented 2 months ago

If you think there have been longer outages today, it would be useful to know.

abubelinha commented 2 months ago

... there are interruptions lasting from about 1 to 300 seconds where all requests are failing. ... I recommend you run your script without any added delays. You could also increase the max-retries value.

So max_retries=15 plus timeout=30 should solve this?
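
Something along these lines is what I mean (a sketch using a requests Session with urllib3's Retry; the retry count, backoff factor, and status list are just illustrative values):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry failed connections up to 15 times with exponential backoff,
# so short outages are ridden out instead of raising ConnectionError.
retries = Retry(total=15, backoff_factor=2, status_forcelist=[502, 503, 504])
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))

resp = session.get(
    "https://api.gbif.org/v1/species/match",
    params={"name": "Filago pyramidata", "kingdom": "Plantae"},
    timeout=30,  # give up on a single attempt after 30 seconds
)
print(resp.json())
```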

abubelinha commented 2 months ago

Yes, it looks like max_retries & timeout helped. Thanks @MattBlissett. Do you know whether pygbif already includes parameters to control that too?
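
In case pygbif does not expose those settings directly, I assume a simple retry wrapper around its name_backbone call would work (a sketch only; I have not checked pygbif's internals):

```python
import time
from pygbif import species

def match_with_retries(name, kingdom="Plantae", max_retries=15, wait=5):
    """Retry the backbone match on errors, assuming pygbif does not retry by itself."""
    for attempt in range(max_retries):
        try:
            return species.name_backbone(name=name, kingdom=kingdom)
        except Exception:  # pygbif surfaces requests/urllib3 connection errors
            if attempt == max_retries - 1:
                raise
            time.sleep(wait)

print(match_with_retries("Filago pyramidata"))
```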