AtlasOfLivingAustralia / galah-R

Query living atlases from R
https://galah.ala.org.au

Support `dplyr::collect()` for running `galah` queries #183

Closed: mjwestgate closed this issue 9 months ago

mjwestgate commented 1 year ago

{galah} 1.5.1 added dplyr verbs as alternatives to several of its own functions, e.g.

galah_call() |>
  filter(year == 2022) |>
  group_by(basisOfRecord) |>
  count()

The major verb not yet included is collect(), which could substitute for the atlas_ functions if galah_call() gained a type argument or similar, e.g.

galah_call(type = "occurrences") |>
  filter(year == 2022) |>
  select(group = c("basic", "events")) |>
  collect()

This syntax is closest to that used by the {gbifdb} package.

mjwestgate commented 1 year ago

Another valuable step here would be to add compute() before collect(), as this more accurately mirrors what actually happens: compute() sets up a job to process on the selected atlas, and collect() retrieves the result once that job is complete.
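
To make the compute()/collect() split concrete, here is a toy sketch in base R. It is not galah's implementation: toy_query, toy_job and their fields are hypothetical names, the "atlas" is faked, and the generics are defined locally so the example runs without dplyr (a real version would presumably add methods to dplyr's compute() and collect() generics instead).

```r
# Toy illustration only: none of these names come from galah.
# Generics are defined locally so the sketch runs in base R (>= 4.1 for |>).
compute <- function(x, ...) UseMethod("compute")
collect <- function(x, ...) UseMethod("collect")

# Hypothetical query object: a list describing what to request.
toy_query <- function(type, filters = list()) {
  structure(list(type = type, filters = filters), class = "toy_query")
}

# compute(): submit the query and return a handle to the job.
# The server is faked here, so the job is marked complete immediately
# and nothing is downloaded at this stage.
compute.toy_query <- function(x, ...) {
  structure(list(query = x, status = "complete"), class = "toy_job")
}

# collect() on a job: retrieve the result once the job is complete.
collect.toy_job <- function(x, ...) {
  stopifnot(identical(x$status, "complete"))
  data.frame(type = x$query$type, n_filters = length(x$query$filters))
}

# collect() on a bare query: run compute() first, then collect the job.
collect.toy_query <- function(x, ...) collect(compute(x), ...)

job <- toy_query("occurrences", list(year = 2022)) |> compute()
result <- collect(job)
```

Keeping a collect() method on the query itself means users who do not care about the intermediate job can still go straight from query to data, while compute() remains available for anyone who wants to submit a job and fetch it later.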