Open cboettig opened 1 month ago
Fantastic project here, and congrats @leg2015 and team on the paper.
Just a note, GBIF monthly snapshots are available as partitioned parquet files from https://registry.opendata.aws/gbif/, which can be faster than hitting GBIF's own API.
e.g. in Python:
import ibis
gbif = ibis.read_parquet("s3://gbif-open-data-us-east-1/occurrence/2024-10-01/occurrence.parquet/**")
Or in R:
library(duckdbfs)
gbif <- open_dataset("s3://gbif-open-data-us-east-1/occurrence/2024-10-01/occurrence.parquet/**")
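For a data generation pipeline, a rough sketch of how the snapshot could be queried lazily, so that only the matching rows are pulled from S3. The helper names `snapshot_uri` and `count_by_species` are hypothetical, and the column names `countrycode` and `species` are assumptions about the snapshot schema that should be checked against the actual files. Depending on the environment, reading the public bucket may also need region or credential settings.

```python
GBIF_BUCKET = "s3://gbif-open-data-us-east-1"


def snapshot_uri(snapshot: str) -> str:
    """Build the parquet glob for one monthly snapshot, e.g. "2024-10-01"."""
    return f"{GBIF_BUCKET}/occurrence/{snapshot}/occurrence.parquet/**"


def count_by_species(snapshot: str, country: str):
    """Return a lazy ibis expression counting records per species in one country.

    Nothing is downloaded until the expression is executed,
    e.g. with .to_pandas().
    """
    import ibis  # imported here so snapshot_uri works without ibis installed

    gbif = ibis.read_parquet(snapshot_uri(snapshot))
    # Assumed column names: countrycode, species
    in_country = gbif.filter(gbif.countrycode == country)
    return in_country.group_by("species").aggregate(n=in_country.count())
```

Something like `count_by_species("2024-10-01", "US").to_pandas()` would then run the query. Because the filter and aggregation are pushed down to the parquet scan, this should avoid paging through the API for the same records.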
Hi Carl, that's great to hear GBIF has AWS and parquet integration now! That will definitely be worth building into the data generation pipeline 🙂