Open simon-tarr opened 3 years ago
Thanks for the issue @simon-tarr !
That mention of paging in the paper is for the R client, not the Python client.
Right now you have to do the pagination yourself, using the limit and offset params.
Automated pagination has been discussed in #63 - it will take some work, and I'm not sure when it will be done. It would happen faster, of course, if someone sends a PR.
Thanks for the reply @sckott. Totally missed #63, sorry about that. A friend and I will look into this - if we can develop something sensible we'll raise a PR for you!
Hi @sckott - what's the best way of finding out the total number of records using occ.search() (if it's even possible within that method, or pygbif as a whole)? With that information, it sounds like it would be reasonably straightforward to write a loop.
e.g. if there are 650 records, records 1-300 come back in the first iteration; set the offset to 301 and grab results 301-600; then set the offset to 601 and grab results 601-650. Sound reasonable?
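A loop along those lines can be sketched generically. This is only an illustration, not pygbif's own API: fetch_all is a hypothetical helper, and it assumes a GBIF-style response dict with 'results' and 'endOfRecords' keys, which is the shape occ.search returns. A stand-in search function is included so the sketch runs offline.

```python
def fetch_all(search, page_size=300, hard_cap=200_000, **params):
    """Page through a GBIF-style endpoint using limit/offset.

    `search` is any callable accepting limit/offset keyword arguments
    and returning a dict with 'results' (a list) and 'endOfRecords'
    (a bool), which is the shape pygbif's occ.search returns.
    """
    records = []
    offset = 0
    while offset < hard_cap:
        page = search(limit=page_size, offset=offset, **params)
        records.extend(page["results"])
        if page.get("endOfRecords") or not page["results"]:
            break
        offset += page_size
    return records

# Stand-in for occ.search so the sketch runs offline: 650 fake records.
def fake_search(limit=300, offset=0, **params):
    data = list(range(650))
    chunk = data[offset:offset + limit]
    return {"results": chunk, "endOfRecords": offset + limit >= len(data)}

print(len(fetch_all(fake_search)))  # 650
```

Against the live API this would be called as fetch_all(occ.search, taxonKey=3329049). Note that GBIF's offset parameter is zero-based, so consecutive pages start at offsets 0, 300, 600, not 1, 301, 601.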
from pygbif import occurrences as occ
x = occ.search(taxonKey = 3329049)
x['count']
Or with the count API route
occ.count(taxonKey = 3329049)
Note that the search and count methods hit different API routes, with potentially different behavior (https://www.gbif.org/developer/occurrence). GBIF has said they will eventually remove the count API route.
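Either way, once the total count is known, the number of paged requests follows by ceiling division against the 300-record page cap. A small arithmetic sketch (pages_needed is an illustrative helper, not part of pygbif):

```python
import math

def pages_needed(total, page_size=300):
    # Ceiling division: e.g. 650 records at 300 per page -> 3 requests.
    return math.ceil(total / page_size)

print(pages_needed(650))  # 3
```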
I'm sending what I think is a straightforward call to GBIF using:
However, no matter what I set the limit to, I never get more than 300 results back. According to the package's paper, pygbif uses internal paging to return more than 300 results (if specified using the limit argument), up to GBIF's limit of 200,000 records. I have tried passing very large polygons over big stretches of the UK and Europe and never get more than 300 results.
Am I doing something wrong?
I'm using Python v3.8.6 and the latest version of pygbif.