AtlantaBotanicalGarden / gap-analysis-shiny-app

Large GBIF query #7

Open dcarver1 opened 7 months ago

dcarver1 commented 7 months ago

The current rgbif::get_occ function caps requests at 500 records. The function itself is limited to 300 records per request, so the 500 mark already takes two requests.

This can be bumped up to ~100,000 records, or 300+ requests.
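To make the request arithmetic concrete, here is a minimal sketch of how a total record cap splits into pages. It is plain Python rather than the app's R code, and `plan_pages` is a hypothetical helper, not part of rgbif; the only number taken from this thread is the 300-per-request limit.

```python
PAGE_SIZE = 300  # per-request limit quoted in this issue


def plan_pages(total, page_size=PAGE_SIZE):
    """Return (offset, limit) pairs that together cover `total` records.

    Hypothetical helper for illustration only; it makes no network calls.
    """
    if total < 0 or page_size <= 0:
        raise ValueError("total must be >= 0 and page_size must be > 0")
    return [
        (start, min(page_size, total - start))
        for start in range(0, total, page_size)
    ]


if __name__ == "__main__":
    print(plan_pages(500))            # two requests: 300 records, then 200
    print(len(plan_pages(100_000)))   # number of requests for a 100,000 cap
```

With a 300-record page, a 500-record cap needs 2 requests and a 100,000-record cap needs 334, which is where the "300+ requests" figure comes from.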

If we are going to cap the total number of points that we

  1. make a request for
  2. show on the map
  3. perform a gap analysis on

we should give users some means of selecting that subset.

My thoughts here are:

  1. Give a max download option
  2. Add a "Wait" screen to the download GBIF button
  3. Automatically prioritize any germplasm (G) points from that full call and return a randomly sampled subset.
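A rough sketch of idea 3, assuming each record carries a `type` field where `"G"` marks germplasm (that field name and the `select_subset` helper are assumptions for illustration, not the app's actual schema or code). Germplasm points are kept first, and the rest of the budget is filled with a random sample of the remaining points.

```python
import random


def select_subset(records, max_points, seed=None):
    """Keep germplasm ("G") records first, then randomly sample the rest.

    `records` is a list of dicts with a hypothetical "type" key.
    `max_points` is the user-chosen maximum download size.
    """
    if max_points < 0:
        raise ValueError("max_points must be >= 0")
    rng = random.Random(seed)  # seedable so a sample can be reproduced
    germplasm = [r for r in records if r.get("type") == "G"]
    other = [r for r in records if r.get("type") != "G"]

    # More germplasm than the cap allows: sample within germplasm only.
    if len(germplasm) >= max_points:
        return rng.sample(germplasm, max_points)

    # Otherwise keep all germplasm and fill the remainder at random.
    remaining = max_points - len(germplasm)
    return germplasm + rng.sample(other, min(remaining, len(other)))
```

This only decides which rows to keep after the full call returns; it does not reduce the time spent on the download itself, which is what the "Wait" screen would cover.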

We can handle all of that with spatial operations, but the map display and especially the gap analysis are going to start eating into resources. Some experimentation will be needed to evaluate these processes.

Jonathan-Gore commented 7 months ago

Sounds like how much data should be "allowed" to be queried from GBIF depends partly on how much data a gap analysis can handle on shinyapps.io.

Maybe we should try and get a crude gap-analysis up-and-running and start stress testing how much data is too much? Once we have a handle on the limitations of the app we could start imposing those restrictions/checks on the data acquisition side of things like querying GBIF.

Jonathan-Gore commented 7 months ago

Create a two-tiered occurrence API call system within the 500-record cap. Tier 1: "Living_Specimen". Tier 2: TBD (currently everything), which fills out whatever is left of the original 500.
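One way the two-tier fill could work, sketched in Python with the GBIF call injected as a callable. Everything named here is an assumption for illustration: `fetch(basis_of_record, limit)` stands in for whatever wrapper the app uses, records are assumed to carry a unique `key`, and `fake_fetch` is a stand-in data source so the sketch runs without network access.

```python
TIER1_BASIS = "LIVING_SPECIMEN"  # GBIF basisOfRecord value for living specimens


def two_tier_fetch(fetch, cap=500):
    """Fill `cap` with tier 1 records first, then top up from tier 2.

    `fetch(basis_of_record, limit)` is a hypothetical callable returning a
    list of dicts; passing None as the basis means "everything".
    """
    tier1 = fetch(TIER1_BASIS, cap)[:cap]
    remaining = cap - len(tier1)
    if remaining == 0:
        return tier1

    seen = {r["key"] for r in tier1}
    tier2 = []
    # Ask for `cap` records so tier 1 duplicates can be skipped and the
    # remaining budget can still be filled.
    for record in fetch(None, cap):
        if record["key"] in seen:
            continue
        seen.add(record["key"])
        tier2.append(record)
        if len(tier2) == remaining:
            break
    return tier1 + tier2


def fake_fetch(basis, limit):
    """Stand-in data source: 3 living specimens among 20 records."""
    living = [{"key": i, "basisOfRecord": "LIVING_SPECIMEN"} for i in range(3)]
    rest = [{"key": i, "basisOfRecord": "PRESERVED_SPECIMEN"} for i in range(3, 20)]
    pool = living if basis == TIER1_BASIS else living + rest
    return pool[:limit]


if __name__ == "__main__":
    result = two_tier_fetch(fake_fetch, cap=10)
    print(len(result), [r["key"] for r in result])
```

Because tier 2 is currently "everything", it can return records already picked up in tier 1, which is why the sketch removes duplicates by key before counting against the cap.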