Messis is a crop classification model for the agricultural landscapes of Switzerland. It is built upon the geospatial foundation model Prithvi, which was originally pre-trained on U.S. satellite data. Messis was trained on our ZueriCrop 2.0 dataset, a collection of Sentinel-2 imagery combined with ground-truth crop labels covering agricultural regions in Switzerland.
The Messis model leverages a three-tier hierarchical label structure, optimized for remote sensing tasks, to enhance its classification accuracy across different crop types. By adapting Prithvi to the specific challenges of Swiss agriculture, such as smaller field sizes and the higher resolution of Sentinel-2 imagery, Messis demonstrates the versatility of pretrained geospatial models in handling new downstream tasks.
Additionally, Messis reduces the need for extensive labeled data by reusing Prithvi's pretrained weights. In our evaluation, Messis achieved an F1 score of 34.8% across 48 crop classes.
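The three-tier label structure can be pictured as a mapping from fine-grained crop classes up to coarser tiers. The sketch below is purely illustrative: the class names are invented examples, not the actual ZueriCrop 2.0 taxonomy.

```python
# Illustrative three-tier label hierarchy (invented example classes, not the
# real ZueriCrop 2.0 taxonomy): each fine-grained crop (tier 3) maps to a
# coarser group (tier 2) and a top-level category (tier 1).
HIERARCHY = {
    "winter_wheat": ("cereals", "field_crops"),
    "spelt": ("cereals", "field_crops"),
    "sugar_beet": ("root_crops", "field_crops"),
    "apple_orchard": ("orchards", "permanent_crops"),
}

def to_tiers(fine_label: str) -> tuple:
    """Return the (tier-3, tier-2, tier-1) labels for a fine-grained class."""
    group, category = HIERARCHY[fine_label]
    return fine_label, group, category

print(to_tiers("spelt"))  # ('spelt', 'cereals', 'field_crops')
```

Predicting at all three tiers lets the model fall back to a coarser but still correct label when fine-grained classes are hard to distinguish.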
The repository is structured as follows, with the most important files and directories highlighted:
└── 📁messis
└── README.md
└── pyproject.toml [ℹ️ Poetry configuration file]
└── params.yaml [ℹ️ DVC configuration file]
└── model_training.ipynb [ℹ️ Jupyter notebook for training the model]
└── server-messis-lightning.sh [ℹ️ Script for training the model on a server with GPU]
└── slurm-messis-lightning.sh [ℹ️ SLURM script for training the model on a cluster]
└── .env.example [ℹ️ Example environment file]
└── 📁assets [ℹ️ Assets created for our report]
└── 📁data [ℹ️ The directory DVC uses to store data]
└── 📁messis [ℹ️ Full implementation of the Messis model]
└── 📁prithvi [ℹ️ Code for the Prithvi model, adapted from https://github.com/NASA-IMPACT/hls-foundation-os/]
└── 📁notebooks [ℹ️ Various notebooks for exploration, experimentation and evaluation]
Experience the Messis model firsthand by trying it out in our interactive Hugging Face Spaces Demo.
To learn how to load the model and perform inference, check the source code in our Hugging Face Space.
Install Poetry and the dependencies:
poetry install
Note for Windows users: you need to reinstall torch and torchvision with CUDA support. Change cu121 to your CUDA version, and check that the torch and torchvision versions match the ones in the pyproject.toml file. For more details, see https://github.com/python-poetry/poetry/issues/6409
poetry shell
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121 -U
pip install torchvision==0.18.0 --index-url https://download.pytorch.org/whl/cu121 -U
Make sure you set the VS Code setting python.venvPath to your Poetry venv path (you can find it with poetry env info --path), so that you can select the virtual environment in VS Code.
To enter the virtual environment:
poetry shell
To install new packages, use the following command, but make sure you have exited the shell with exit first:
poetry add <package>
Set up DVC:
Configure the remote credentials:
dvc remote modify --local ssh password request-the-password-from-the-team
Pull the data:
dvc pull
Only set up this environment if you want to run Prithvi with the MMCV/MMSegmentation framework (see the prithvi folder). This environment is as described in hls-foundation-os:
conda create --name hls-foundation-os python==3.9
conda activate hls-foundation-os
pip install torch==1.11.0 torchvision==0.12.0
pip install -e .
pip install -U openmim
mim install mmcv-full==1.6.2
Next, download the Prithvi model using the download_prithvi_100M.ipynb notebook in the prithvi/model directory.
Clone Repo
srun --partition performance git clone https://github.com/Satellite-Based-Crop-Classification/messis.git
Install Poetry
srun --partition performance curl -sSL https://install.python-poetry.org | python3 -
Open the .bashrc file in a text editor such as nano or vim. For example:
nano ~/.bashrc
Add the following line at the end of the file:
export PATH="/home2/YOUR_USER/.local/bin:$PATH"
To make the changes take effect immediately in your current session, source the .bashrc file:
source ~/.bashrc
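If it is unclear why the export makes `poetry` resolvable by name, the self-contained demo below shows the PATH mechanics with a throwaway directory and script; `~/.local/bin` plays this role for the Poetry installer.

```shell
# Self-contained demo of the PATH mechanics: place an executable in a
# directory, prepend that directory to PATH, and the command resolves by name.
BIN_DIR=$(mktemp -d)
printf '#!/bin/sh\necho hello-from-path\n' > "$BIN_DIR/demo-tool"
chmod +x "$BIN_DIR/demo-tool"
export PATH="$BIN_DIR:$PATH"
demo-tool   # prints: hello-from-path
```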
Install the dependencies
srun --partition performance poetry install
Enter the virtual environment
poetry shell
Configure DVC
dvc remote modify --local ssh password request-the-password-from-the-team
Pull the data
srun --partition performance dvc pull
Log in to W&B
wandb login <API_KEY>
Log into Hugging Face
huggingface-cli login
Configure git user
git config --global user.name "Name Surname"
git config --global user.email "your.mail@example.com"
Resources for optimizing the training:
Start the training on the GPU server:
sh server-messis-lightning.sh
When you want to stop the job, you can kill the entire process group:
Find the process group ID (PGID):
ps -o pgid,cmd -p $(pgrep -f 'server-messis-lightning.sh')
Kill the process group:
kill -TERM -<PGID>
Replace <PGID> with the actual process group ID you found.
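The two steps above can also be combined into one snippet. The demo below uses a dummy `sleep` job in its own process group so it stays self-contained; for the real training run, substitute `server-messis-lightning.sh` as the `pgrep` pattern.

```shell
# Start a dummy job in its own process group (stand-in for the training script).
setsid sleep 300 &
sleep 1
# Look up the PGID of the job, then terminate the entire group at once.
# A negative target PID tells kill to signal the whole process group.
PGID=$(ps -o pgid= -p "$(pgrep -f 'sleep 300' | head -n 1)" | tr -d ' ')
kill -TERM -- "-$PGID"
```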
Slurm Commands:
sbatch slurm-messis-lightning.sh   # submit the training job
squeue                             # show the job queue
scontrol show nodes                # inspect node status
scancel <job_id>                   # cancel a job
Make sure the configs in slurm-messis-lightning.sh and model_training.ipynb are correctly set up and have the same values.
See the parameters --nodes, --gres and --ntasks-per-node (--gres and --ntasks-per-node must match) in the SLURM script:
#!/bin/sh
#SBATCH --time=08:00:00
#SBATCH --nodes=1 # This needs to match Trainer(num_nodes=...)
#SBATCH --gres=gpu:2
#SBATCH --ntasks-per-node=2 # This needs to match Trainer(devices=...)
#SBATCH --partition=performance
#SBATCH --out=slurm/logs/model_training.ipynb_out.txt
#SBATCH --err=slurm/logs/model_training.ipynb_out.txt
#SBATCH --job-name="messis"
The counterpart in the notebook, see num_nodes and devices, must match the SLURM script:
trainer = Trainer(
    logger=wandb_logger,
    log_every_n_steps=1,
    callbacks=[
        LogMessisMetrics(hparams, params['paths']['dataset_info'], debug=False),
        LogConfusionMatrix(hparams, params['paths']['dataset_info'], debug=False),
        early_stopping
    ],
    accumulate_grad_batches=hparams['accumulate_grad_batches'],  # Gradient accumulation
    max_epochs=hparams['max_epochs'],
    accelerator="gpu",
    strategy="ddp",  # Use Distributed Data Parallel
    num_nodes=1,  # Number of nodes
    devices=2,  # Number of GPUs to use
    precision='16-mixed'  # Train with 16-bit precision (https://lightning.ai/docs/pytorch/stable/common/trainer.html#precision)
)
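Since these values must mirror the SBATCH header, one defensive option (a sketch, not code from the repo) is to read them from the SLURM job environment instead of hard-coding them; sbatch exports SLURM_NNODES and SLURM_NTASKS_PER_NODE inside the job:

```python
import os

# Derive the Trainer settings from the SLURM job environment so the notebook
# cannot drift out of sync with the SBATCH header. Falls back to a single
# node/GPU when run outside of SLURM.
num_nodes = int(os.environ.get("SLURM_NNODES", "1"))
devices = int(os.environ.get("SLURM_NTASKS_PER_NODE", "1"))

# Then pass: Trainer(..., num_nodes=num_nodes, devices=devices)
```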
Then, make sure you are starting the Python script with srun in slurm-messis-lightning.sh:
poetry run srun python model_training.py # Essential to use srun for multi-GPU training!
To start the training, run sbatch slurm-messis-lightning.sh in the terminal of the SLURM login node.
Start debug server on your remote server:
For GPU Server:
python -m debugpy --listen 0.0.0.0:5678 --wait-for-client model_training.py
For SLURM (untested):
srun --partition performance poetry run python -m debugpy --listen 0.0.0.0:5678 --wait-for-client model_training.py
Launch the "Remote Attach" debug configuration in VS Code (see .vscode/launch.json). VS Code will connect to the debug server on the remote machine and you can debug as usual.
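For reference, a "Remote Attach" configuration of the kind .vscode/launch.json contains might look like the following sketch; the host, port, and path mapping here are assumptions, so check the actual file in the repo:

```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Remote Attach",
            "type": "debugpy",
            "request": "attach",
            "connect": { "host": "your-server-hostname", "port": 5678 },
            "pathMappings": [
                { "localRoot": "${workspaceFolder}", "remoteRoot": "/path/to/messis" }
            ]
        }
    ]
}
```

The port must match the one passed to `--listen` on the remote server.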