Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332
make_datasets (as well as doodleverse_utils/make_mndwi_dataset and doodleverse_utils/make_ndwi_dataset) now works in a new way. Previously, all files were read in, shuffled, and split into train and validation sets, then non-augmented and augmented npz files were created for each set. This caused a potential data leak between the train and validation subsets, and validation was carried out on augmented imagery. We introduced a clunky 'mode' config parameter to try to control the degree of augmentation used.
From May 29, 2023, make_datasets creates train_data and val_data subfolders, then copies the train and validation splits of labels and images over (multiple bands of images if necessary). It makes non-augmented npzs for each, then makes augmented npzs for the training set only. This removes the potential data leak, and validation is carried out on non-augmented imagery, which is a better reflection of deployment. As before, make_datasets does not make a test dataset. The test dataset is a domain/task-specific problem: please make an independent test set for your problem.
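For illustration, the leak-free ordering can be sketched as: split the image/label filenames first, copy each split into its own subfolder, and only then augment the training split. Below is a minimal Python sketch of that idea; the folder layout and helper name are hypothetical, and this is not Gym's actual implementation.

import random
import shutil
from pathlib import Path

def split_then_copy(images_dir, labels_dir, out_dir, val_frac=0.2, seed=0):
    # Pair image and label files by sorted filename (assumes matching names)
    images = sorted(Path(images_dir).glob("*.jpg"))
    labels = sorted(Path(labels_dir).glob("*.jpg"))
    pairs = list(zip(images, labels))
    random.Random(seed).shuffle(pairs)

    # Split BEFORE any augmentation, so no image can appear in both subsets
    n_val = int(len(pairs) * val_frac)
    subsets = {"val_data": pairs[:n_val], "train_data": pairs[n_val:]}

    for name, subset in subsets.items():
        im_out = Path(out_dir) / name / "images"
        lab_out = Path(out_dir) / name / "labels"
        im_out.mkdir(parents=True, exist_ok=True)
        lab_out.mkdir(parents=True, exist_ok=True)
        for im, lab in subset:
            shutil.copy(im, im_out / im.name)
            shutil.copy(lab, lab_out / lab.name)
    # Augmented npzs would then be generated from train_data only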
Gym is a toolbox for segmenting imagery with a family of UNet models, which are supervised deep-learning models for image segmentation. Gym supports segmentation of images with any number of bands and any number of classes (memory limited). We have built an end-to-end pipeline that facilitates a fully reproducible label-to-model workflow when used in conjunction with the companion program Doodler; however, pairs of images and corresponding labels, however acquired, may be used with Gym.
Gym uses compressed numpy archive files (.npz) that contain all your data for model training and validation, and that can be unpacked directly as tensorflow tensors. We initially used tfrecord format files, but abandoned that approach because of its relative complexity, and because the npz format is more familiar to Earth scientists who code with python. We have tested on a variety of Earth and environmental imagery of coastal, river, and other natural environments. However, we expect the toolbox to be useful for all types of imagery when properly applied.
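As a quick illustration of that convenience, an npz archive can be opened with numpy and its arrays converted to tensorflow tensors directly. The filename and array keys below are made up for the example; inspect your own npz files for the actual keys.

import numpy as np
import tensorflow as tf

# Open a compressed numpy archive produced for training or validation
with np.load("example_dataset.npz") as data:
    print(data.files)  # list the array names actually stored in the archive
    # key names below are illustrative only
    image = tf.convert_to_tensor(data["arr_0"], dtype=tf.float32)
    label = tf.convert_to_tensor(data["arr_1"], dtype=tf.uint8)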
Package maintainers:
Contributions:
doodleverse_utils functions in model_metrics.py use minimally modified code from here.

This toolbox is designed for 1-, 3-, or 4-band imagery, and supports both binary segmentation (one class of interest and a null class) and multiclass segmentation (several classes of interest).
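To make the label convention concrete, here is a small illustrative example (not Gym code) of binary versus multiclass label masks represented as integer arrays; the class names in the comments are invented for the example.

import numpy as np

# Binary: 0 = null/background, 1 = the single class of interest
binary_label = np.array([[0, 0, 1],
                         [0, 1, 1],
                         [1, 1, 1]], dtype=np.uint8)

# Multiclass: one integer per class, e.g. 0 = null, 1 = water, 2 = sand, 3 = vegetation
multiclass_label = np.array([[1, 1, 0],
                             [2, 2, 0],
                             [3, 3, 3]], dtype=np.uint8)

print(np.unique(binary_label))      # [0 1]      -> two classes
print(np.unique(multiclass_label))  # [0 1 2 3]  -> four classes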
We recommend a 6 part workflow. At its core, you will:

1. Write a config file for your data. You will need to make some decisions about the model and hyperparameters.
2. Run make_dataset.py to augment and package your images into npz files for training the model.
3. Run train_model.py to train a segmentation model, or run batch_train_models.py to train a batch of models (typically using the same dataset but with different config files specifying alternative hyperparameters).
4. Run seg_images_in_folder.py to segment images with your newly trained model, or ensemble_seg_images_in_folder.py to point more than one trained model at the same imagery and ensemble the model outputs.

Here at Doodleverse HQ we advocate training models on the augmented data encoded in the datasets, so the original data serve as a hold-out or test set. This is ideal because, although the validation dataset (drawn from augmented data) doesn't get used to adjust model weights, it does influence model training by triggering early stopping if validation loss is not improving. Testing on an untransformed set is also a further check/reassurance of model performance and evaluation metrics.
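For context, the early-stopping behaviour described above is the standard Keras pattern of monitoring validation loss. A minimal sketch follows; the patience value is a placeholder, not Gym's actual setting.

import tensorflow as tf

# Stop training when validation loss stops improving, and keep the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # computed on the validation data each epoch
    patience=10,                 # placeholder: epochs with no improvement to tolerate
    restore_best_weights=True,
)

# model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=[early_stop])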
Doodleverse HQ also advocates the use of ensemble models where possible, which requires training multiple models, each with its own config file and model weights file.
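To illustrate what ensembling the model outputs can mean in practice, one common approach is to average the per-class softmax probabilities from each trained model before taking the most probable class per pixel. The sketch below assumes each model returns per-class probabilities; it is an illustration, not necessarily the exact scheme Gym implements.

import numpy as np

def ensemble_segmentation(models, image_batch):
    # Average per-class probabilities from several trained Keras models,
    # then take the most probable class for each pixel
    probs = [m.predict(image_batch) for m in models]   # each: (N, H, W, n_classes)
    mean_probs = np.mean(probs, axis=0)
    return np.argmax(mean_probs, axis=-1)              # (N, H, W) integer label map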
We advise creating a new conda environment, called gym, to run the program. We recommend miniconda.

Note that Macs are NOT SUPPORTED. Only Linux, and WSL on Windows. Not sorry :)
[OPTIONAL] First you may want to do some conda and pip housekeeping (recommended)
conda update -n base conda
conda clean --all
python3 -m pip install --upgrade pip
[OPTIONAL] Set the libmamba solver as the default:
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
1) If you wish to use a GPU for model training, you must now use Linux, or WSL2 (Windows Subsystem for Linux 2) on Windows, and refer to the official Tensorflow instructions:
(updated November 20, 2024)
conda env create --file ./install/gym.yml
Test the tensorflow installation:
conda activate gym
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
This should list all of your GPUs. If it does not, configure the system paths, as per the official Tensorflow instructions:
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
and try again
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
If the above fails, try the following instead:
conda create -n gym python=3.10 -y
conda activate gym
conda install -c conda-forge cudatoolkit=11.8.0 -y
python -m pip install nvidia-cudnn-cu11==8.6.0.163 tensorflow==2.12.*
Configure the system paths, as per the official Tensorflow instructions:
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
Verify install:
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
Then:
conda install -c conda-forge scikit-image ipython tqdm pandas natsort matplotlib transformers -y
python -m pip install doodleverse_utils
From here, you may encounter the following error:
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice.
...
Couldn't invoke ptxas --version
...
InternalError: libdevice not found at ./libdevice.10.bc [Op:__some_op]
To fix this error, you will need to run the following commands:
# Install NVCC
conda install -c nvidia cuda-nvcc=11.3.58 -y
# Configure the XLA cuda directory
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
printf 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/\nexport XLA_FLAGS=--xla_gpu_cuda_data_dir=$CONDA_PREFIX/lib/\n' > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
# Copy libdevice file to the required path
mkdir -p $CONDA_PREFIX/lib/nvvm/libdevice
cp $CONDA_PREFIX/lib/libdevice.10.bc $CONDA_PREFIX/lib/nvvm/libdevice/
In my case, I also had to replace libstdc++ in the conda environment's lib folder with a symlink to the system copy:
ln -sf /usr/lib/x86_64-linux-gnu/libstdc++.so.6 ~/miniconda3/envs/gym/bin/../lib/libstdc++.so.6
Test transformers
python -c "from transformers import TFSegformerForSemanticSegmentation"
(this should return no errors. It may issue warnings about TensorRT - you can ignore those)
pip uninstall h5py --yes
conda install -c conda-forge h5py -y
git clone --depth 1 https://github.com/Doodleverse/segmentation_gym.git
(--depth 1 means "give me only the present code, not the whole history of git commits", which saves disk space and time)
If GPU libraries are still not found (for example under WSL2; note the /usr/lib/wsl/lib entry below), the following combined path setup may help:
mkdir -p $CONDA_PREFIX/lib/nvvm/libdevice/
cp -p $CONDA_PREFIX/lib/libdevice.10.bc $CONDA_PREFIX/lib/nvvm/libdevice/
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=' > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export XLA_FLAGS=--xla_gpu_cuda_data_dir=$CONDA_PREFIX/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
If you get errors associated with loading the model weights, you may need to:
pip install "h5py==2.10.0" --force-reinstall
and just ignore any warnings.
Check out the wiki for a guide on how to use Gym.
A test dataset, including a set of images/labels, model config files, and a dataset and models created with Gym, is available here and described on the Zenodo page.
Please read our code of conduct
Please contribute to the Discussions tab - we welcome your ideas and feedback.
We also invite all to open issues for bugs/feature requests using the Issues tab