Code repository for 2024 Data Science for the Common Good project with iNaturalist.
Collaborators: Angela Zhu, Paula Navarrete, Sergei Pogorelov, Ozel Yilmazel
This code reproduces the results from our ICML 2023 paper "Spatial Implicit Neural Representations for Global-Scale Species Mapping".
Estimating the geographical range of a species from sparse observations is a challenging and important geospatial prediction problem. Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location. In this work, we use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical range of thousands of species simultaneously. SINRs scale gracefully, making better predictions as we increase the number of training species and the amount of training data per species. We introduce four new range estimation and spatial representation learning benchmarks, and we use them to demonstrate that noisy and biased crowdsourced data can be combined with implicit neural representations to approximate expert-developed range maps for many species.
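As a rough illustration of the idea (not the paper's actual architecture or trained weights), the sketch below shows the shape of a SINR-style model: a shared network maps an encoded (longitude, latitude) coordinate to presence probabilities for many species at once. The names `encode_location` and `TinySINR` are hypothetical, and the weights are random, untrained values.

```python
import numpy as np

def encode_location(lon, lat):
    """Sinusoidal encoding of a coordinate rescaled to [-1, 1] (illustrative)."""
    x = np.array([lon / 180.0, lat / 90.0])
    return np.concatenate([np.sin(np.pi * x), np.cos(np.pi * x)])

class TinySINR:
    """Toy implicit neural representation: one shared hidden layer with a
    per-species output head. Weights are random, purely for illustration."""
    def __init__(self, num_species, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.5, (4, hidden))       # 4 = encoding dim
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.5, (hidden, num_species))
        self.b2 = np.zeros(num_species)

    def predict(self, lon, lat):
        h = np.maximum(encode_location(lon, lat) @ self.w1 + self.b1, 0)  # ReLU
        logits = h @ self.w2 + self.b2
        return 1.0 / (1.0 + np.exp(-logits))              # per-species probability

model = TinySINR(num_species=1000)
probs = model.predict(-72.5, 42.4)   # one query location, all species at once
print(probs.shape)                   # (1000,)
```

Because the hidden layers are shared across all species, adding training species or observations improves the shared representation, which is why these models scale gracefully.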
Clone the repository: `git clone git@github.com:UMassCDS/ds4cg2024-inaturalist.git`
Navigate to the cloned project's root and run `git submodule init`, then `git submodule update --remote --merge`.
The code is now set up.
Note: to update the submodules with the latest changes, run `git submodule update --remote --merge` again.
Note: `src/backend/sinr` is a submodule of UMassCDS/inaturalist-sinr. If you need to work on the SINR code, follow that repository's development practices, including creating a dedicated development branch and opening PRs.
Note: If you need to switch to a particular UMassCDS/inaturalist-sinr branch and run the prototype, navigate to `src/backend/sinr` and run `git checkout <branch-name>`. The submodule will then be on that branch. Make sure you always update your local branch to the latest.
If you want to run the app locally for development purposes, download the pretrained models from here, unzip them, and place them in the `src/backend/sinr/pretrained_models` folder.
If you only run the app in Docker, there is no need to download the models; Docker will handle this for you inside the image.
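Before starting the backend locally, it can help to sanity-check that the models are in place. The helper below is a hypothetical sketch, not part of the project: it only assumes the folder path given above and checks that the folder exists and is non-empty, without assuming any particular model filenames.

```python
from pathlib import Path

def pretrained_models_ready(repo_root="."):
    """Return True if the pretrained_models folder exists and contains files.

    Only the README's folder layout is assumed; specific model filenames
    are not checked, just that the folder is non-empty.
    """
    models_dir = Path(repo_root) / "src" / "backend" / "sinr" / "pretrained_models"
    return models_dir.is_dir() and any(models_dir.iterdir())

if __name__ == "__main__":
    if not pretrained_models_ready():
        print("pretrained models missing: download and unzip them first")
```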
We recommend using an isolated Python environment to avoid dependency issues. Install the Anaconda Python 3.9 distribution for your operating system from here.
Create a new environment and activate it:
conda create -y --name inatator python==3.9
conda activate inatator
After activating the environment, install the required packages:
pip install -r src/backend/requirements.txt && pip install -r src/backend/requirements-dev.txt
Install the JavaScript libraries needed for React:
`npm i --prefix src/frontend/`
To start the backend, navigate to the `ds4cg2024-inaturalist` directory if you are not already there, then run: `uvicorn src.backend.app.main:app --reload`
To start the frontend, navigate to the `ds4cg2024-inaturalist` directory if you are not already there, then run: `npm start --prefix src/frontend/`
In your web browser, open the link http://localhost:3000/
Note: you don't have to initialize the submodules to run Docker; the Dockerfile sets them up for you while building the image.
Run `docker compose up --build`. The first build may take a while; once it completes, the application is running and you can access it at localhost:3000. For subsequent runs, `docker compose up` is enough; to rebuild the images, use `docker compose build` or `docker compose up --build`.
Note: If you want to build only one service, use `docker compose build <service-name>`; for example, for the backend it is `docker compose build backend`.
Development dependencies are listed in `requirements-dev.txt`.

This project was enabled by data from the Cornell Lab of Ornithology, The International Union for the Conservation of Nature, iNaturalist, NASA, USGS, JAXA, CIESIN, and UC Merced. We are especially indebted to the iNaturalist and eBird communities for their data collection efforts. We also thank Matt Stimas-Mackey and Sam Heinrich for their help with data curation. This project was funded by the Climate Change AI Innovation Grants program, hosted by Climate Change AI with the support of the Quadrature Climate Foundation, Schmidt Futures, and the Canada Hub of Future Earth. This work was also supported by the Caltech Resnick Sustainability Institute and an NSF Graduate Research Fellowship (grant number DGE1745301).
If you find our work useful in your research, please consider citing our paper.
@inproceedings{SINR_icml23,
title = {{Spatial Implicit Neural Representations for Global-Scale Species Mapping}},
author = {Cole, Elijah and Van Horn, Grant and Lange, Christian and Shepard, Alexander and Leary, Patrick and Perona, Pietro and Loarie, Scott and Mac Aodha, Oisin},
booktitle = {ICML},
year = {2023}
}
Extreme care should be taken before making any decisions based on the outputs of models presented here. Our goal in this work is to demonstrate the promise of large-scale representation learning for species range estimation, not to provide definitive range maps. Our models are trained on biased data and have not been calibrated or validated beyond the experiments illustrated in the paper.