# Code for Resolving Label Uncertainty with Implicit Posterior Models

This directory details and replicates the experimental steps for the land cover experiments.
## Environment setup

Steps for setting up the Python environment, downloading the data, and running the processing and experiments for land cover mapping are detailed below. Depending on where you store the datasets, you may need to update some of the paths in the config files in `qr_for_landcover/conf` (and likewise in some of the evaluation scripts and notebooks).
```bash
conda env create -f environment.yml
conda activate qr_torchgeo
python -m pip install git+https://github.com/microsoft/torchgeo
```
## Downloading data

The default parameters in this repo assume you have data stored in `/torchgeo_data`. To download the datasets, you can follow these steps:

- Chesapeake:
- EnviroAtlas:
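One way to fetch the torchgeo-hosted data is through the torchgeo dataset classes with `download=True`. A minimal sketch, assuming a torchgeo version that exposes `ChesapeakeCVPR` and `EnviroAtlas` (the calls are deferred behind a hypothetical helper because the archives are large):

```python
def download_landcover_datasets(root="/torchgeo_data"):
    """Sketch: fetch the Chesapeake and EnviroAtlas data into `root` via torchgeo.

    Hypothetical helper, not a repo script. Assumes a torchgeo install that
    provides these dataset classes; nothing downloads until this is called.
    """
    # Lazy import: torchgeo is only required when the download actually runs.
    from torchgeo.datasets import ChesapeakeCVPR, EnviroAtlas

    # download=True fetches the archives into `root` if not already present.
    ChesapeakeCVPR(root=root, download=True)
    EnviroAtlas(root=root, download=True)
```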
## Constructing the priors

Important: if you want to skip constructing the priors and move ahead to the experiment scripts, you can download the precomputed priors from torchgeo using the steps in the previous section. You only need to follow the steps below if you explicitly want to recreate the priors from the original data.
To construct the priors for the Chesapeake dataset, first make sure you have the original dataset downloaded via torchgeo. Then, from `qr_for_landcover/compute_priors`, run `compute_cooccurrence_matrices_chesapeake.py` to compute the class co-occurrence matrices from the training sets in each state, and then `make_priors_chesapeake.py` to make the priors and save them in the torchgeo data folder. Note that you'll need to change the paths to the data directories at the top of each script. The notebooks in `qr_for_landcover/compute_priors` visualize these outputs.

To construct the priors for the EnviroAtlas dataset, there are a few additional steps to download the additional data from the original data sources. The quick way is to download the data in the zip file from torchgeo (see above).
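Concretely, the co-occurrence counting these scripts perform can be sketched as follows. This is an illustration only, not the repo's code; the toy label maps and class counts are made up for the example:

```python
import numpy as np

# Toy aligned label maps: coarse labels (2 classes) and high-res labels
# (3 classes) on the same pixel grid.
coarse = np.array([[0, 0, 1],
                   [1, 1, 0]])
fine = np.array([[0, 1, 2],
                 [2, 2, 0]])

n_coarse, n_fine = 2, 3
counts = np.zeros((n_coarse, n_fine))
# Count how often each (coarse class, fine class) pair co-occurs per pixel.
np.add.at(counts, (coarse.ravel(), fine.ravel()), 1)

# Row-normalizing gives p(fine class | coarse class): the prior for each
# coarse class.
prior = counts / counts.sum(axis=1, keepdims=True)
print(prior)  # each row sums to 1
```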
`compute_cooccurrence_matrices_envirotlas.py` generates the co-occurrence matrices from the full EnviroAtlas data (which you'd have to download separately), and `make_priors_envirotlas.py` makes the priors and saves them in the torchgeo data folder. Note that you'll need to change the paths to the data directories at the top of each script. The notebooks in `qr_for_landcover/compute_priors` visualize these outputs.

To generate the learned EnviroAtlas priors from the inputs to the hand-coded prior, run `learn_the_prior_enviroatlas.py` from the `experiment_scripts` folder, then from the `evaluation` folder run `save_learned_priors.py`. The learned priors can be visualized with `evaluation/visualize_output/visualize_learned_priors_ea.ipynb`.
## Running the experiments

The experiment scripts are broken up into hyperparameter search scripts (`hp*.py`) and evaluation runs (`run*.py`). To just replicate the results in the paper, you can skip the hyperparameter searches. Evaluation of the results is described in the next section.
- Chesapeake: run `hp_gridsearch_de.py`, then `run_qr_in_chesapeake_north.py`.
- EnviroAtlas: run `hp_gridsearch_pittsburgh.py`, `hp_gridsearch_pittsburgh_with_prior_as_input.py`, and `hp_gridsearch_qr_from_scratch_pittsburgh.py` to pick parameters in Pittsburgh; then run `run_qr_forward_enviroatlas_from_sctrach.py`, `run_qr_forward_enviroatlas_from_checkpoint.py`, and `run_qr_forward_enviroatlas_learned_prior_from_checkpoint.py` to run the models on the test set in each city.
## Evaluation

To evaluate the Chesapeake Conservancy predictions in NY and PA, use `evaluation/evaluate_qr_models_chesapeake.ipynb`. To evaluate the EnviroAtlas predictions in each state, use `evaluation/evaluate_models_enviroatlas.ipynb`.
To save model output as tifs (e.g. for easy visualization), run `save_predictions_chesapeake.py` or `save_predictions_enviroatlas.py` from the `evaluation` folder. If you only want to evaluate some EnviroAtlas experiments, you'll have to comment out some lines in that script.

To visualize the outputs, use the notebooks in `evaluation/visualize_output`. You'll need to adjust some of the directories defined at the top of each notebook to point to where your data and output are stored.
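For reference, saving a predicted label map as a GeoTIFF can look like the following hedged sketch. This is not the repo's actual script; `save_prediction_tif` is a hypothetical helper, and rasterio is assumed to be available:

```python
def save_prediction_tif(pred, profile, out_path):
    """Write an HxW array of predicted class indices as a single-band GeoTIFF.

    Hypothetical helper, not the repo's script: assumes rasterio is installed
    and `profile` is a rasterio profile dict (driver, width, height, crs,
    transform) taken from the corresponding input tile.
    """
    import rasterio  # lazy import so this sketch loads without rasterio

    out_profile = dict(profile, count=1, dtype="uint8")  # one uint8 label band
    with rasterio.open(out_path, "w", **out_profile) as dst:
        dst.write(pred.astype("uint8"), 1)
```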
## Figures

Notebooks to generate the figures in the paper are in the `figure_notebooks` folder. You'll need to adjust some of the directories defined at the top of each notebook to point to where your data and output are stored.