This repository provides official models from the paper Light-Weight RefineNet for Real-Time Semantic Segmentation:

> Light-Weight RefineNet for Real-Time Semantic Segmentation
> Vladimir Nekrasov, Chunhua Shen, Ian Reid
> In BMVC 2018
If you want to train the network on your own dataset, specify the arguments (see the available options in `src_v2/arguments.py`) and provide an implementation of your dataset in `src_v2/data.py` if it is not already supported by either `densetorch` or `torchvision`.
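As a starting point, below is a minimal sketch of what such a dataset implementation could look like, assuming `src_v2/data.py` accepts a torchvision-style dataset that yields (image, mask) pairs; the class name, file layout, and transform signature are illustrative assumptions, not the repository's exact interface.

```python
# A minimal custom segmentation dataset sketch, assuming a torchvision-style
# interface that yields (image, mask) pairs. Names and layout are assumptions.
import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class MySegDataset(Dataset):
    def __init__(self, image_paths, mask_paths, transform=None):
        self.image_paths = image_paths  # list of RGB image files
        self.mask_paths = mask_paths    # list of label masks (one class id per pixel)
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = np.array(Image.open(self.image_paths[idx]).convert("RGB"))
        mask = np.array(Image.open(self.mask_paths[idx]))
        if self.transform is not None:
            image, mask = self.transform(image, mask)
        return image, mask
```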
For flawless reproduction of our results, the Ubuntu OS is recommended. The models have been tested using Python 2.7 and Python 3.6.

Requirements:

- pip / pip3
- torch>=0.4.0

To install the required Python packages, run `pip install -r requirements.txt` (Python 2) or `pip3 install -r requirements3.txt` (Python 3); use the `--user` flag for a local (per-user) installation.
The given examples can be run with or without a GPU.
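A quick way to sanity-check your setup before running the examples (nothing here is specific to this repository):

```python
# Quick environment check: verify the installed PyTorch version and whether
# a CUDA-capable GPU is visible (the examples fall back to CPU otherwise).
import torch

print(torch.__version__)          # should be >= 0.4.0
print(torch.cuda.is_available())  # True if a GPU will be used
```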
For ease of reproduction, we have embedded all our examples inside Jupyter notebooks. You can either download them from this repository and work with them on your local machine or server, or use the online versions supported by the Google Colab service.
If all the installation steps have been executed smoothly, you can proceed with running any of the notebooks provided in the `examples/notebooks` folder.
To start the Jupyter Notebook server, run `jupyter notebook` on your local machine. This will open a web page in your browser; if it does not open automatically, copy the URL (including the port and token) from the command's output and paste it into your browser manually.
After that, navigate to the repository folder and choose any of the examples given.
The number of FLOPs and the runtime are measured on 625x468 inputs using a single GTX1080Ti; mean IoU is given on the corresponding validation sets with a single-scale input.
| Model | PASCAL VOC | Person-Part | PASCAL Context | NYUv2-40 | Params, M | FLOPs, B | Runtime, ms |
|---|---|---|---|---|---|---|---|
| RF-LW-ResNet-50 | 78.5 | 64.9 | - | 41.7 | 27 | 33 | 19.56±0.29 |
| RF-LW-ResNet-101 | 80.3 | 66.7 | 45.1 | 43.6 | 46 | 52 | 27.16±0.19 |
| RF-LW-ResNet-152 | 82.1 | 67.6 | 45.8 | 44.4 | 62 | 71 | 35.82±0.23 |
| RF-LW-MobileNet-v2 | 76.2 | - | - | - | 3.3 | 9.3 | - |
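For reference, a minimal sketch of how a runtime of this kind could be measured is given below; the warm-up count, number of repetitions, and input layout (width 625, height 468) are assumptions, and the stand-in module should be replaced with an actual RF-LW model. This is not the authors' exact benchmarking script.

```python
# Hedged runtime-measurement sketch: time repeated forward passes on a
# 625x468 input. All protocol details here are assumptions.
import time
import torch
import torch.nn as nn

net = nn.Conv2d(3, 40, 3, padding=1)  # stand-in; substitute an RF-LW model here
if torch.cuda.is_available():
    net = net.cuda()
net.eval()

inp = torch.randn(1, 3, 468, 625)  # 625x468 input (width x height assumed)
if torch.cuda.is_available():
    inp = inp.cuda()

with torch.no_grad():
    for _ in range(10):   # warm-up iterations (assumption)
        net(inp)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.time()
    for _ in range(100):  # number of timed repetitions (assumption)
        net(inp)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    print("mean runtime, ms: %.2f" % ((time.time() - start) / 100 * 1000))
```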
Inside the notebooks, you can try out your own images and write loops to iterate over videos, whole datasets, or streams (e.g., from a webcam). Feel free to contribute your cool use cases of the notebooks!
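If you prefer a plain script over a notebook, a single-image inference sketch might look as follows; the constructor name (`rf_lw50`), its import path, the normalisation statistics, and the output resolution handling are assumptions, so please refer to the notebooks for the exact API.

```python
# Minimal single-image inference sketch. The import path and constructor are
# hypothetical; check examples/notebooks for the actual model-loading code.
import numpy as np
import torch
from PIL import Image

from models.resnet import rf_lw50  # hypothetical import path

net = rf_lw50(num_classes=21, pretrained=True).eval()  # e.g. 21 PASCAL VOC classes

img = np.array(Image.open("example.jpg").convert("RGB")) / 255.0
mean = np.array([0.485, 0.456, 0.406])  # ImageNet statistics (assumption)
std = np.array([0.229, 0.224, 0.225])
inp = torch.from_numpy(((img - mean) / std).transpose(2, 0, 1)[None]).float()

with torch.no_grad():
    logits = net(inp)                   # [1, num_classes, h, w]
pred = logits.argmax(dim=1)[0].numpy()  # per-pixel class indices; upsample to
                                        # the input resolution if needed
```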
If you do not want to be involved in any hassle regarding the setup of the Jupyter Notebook server, you can proceed by using the same examples inside the Google colab environment - with free GPUs available!
We provide training scripts to get you started on the NYUv2-40 dataset. The methodology slightly differs from the one described in the paper and leads to better and more stable results (at least, on NYU).
In particular, here we i) start with a lower learning rate (as we initialise the weights using PyTorch's default initialisation instead of normal(0.01)), ii) add more aggressive augmentation (random scale between 0.5 and 2.0), and iii) pad each image inside the batch to a fixed crop size (instead of resizing all of them). The training process is divided into 3 stages: after each stage, the optimisers are re-created with the learning rates halved. All the training is done using a single GTX1080Ti GPU card. Additional experiments with this new methodology on the other datasets (and with the MobileNet-v2 backbone) are under way, and the relevant scripts will be provided once available. Please also note that the training scripts were written in Python 3.6.
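A minimal sketch of this staging logic is shown below; the optimiser type, the starting learning rates, the stand-in networks, and the stage length are all illustrative assumptions rather than the values used by the provided scripts.

```python
# Sketch of the 3-stage schedule: the optimisers are re-created after each
# stage with halved learning rates. All concrete values here are assumptions.
import torch
import torch.nn as nn

encoder = nn.Conv2d(3, 8, 3, padding=1)   # stand-in for the real encoder
decoder = nn.Conv2d(8, 40, 3, padding=1)  # stand-in for the real decoder
enc_lr, dec_lr = 5e-4, 5e-3               # hypothetical starting rates
epochs_per_stage = 100                    # hypothetical stage length

for stage in range(3):
    opt_enc = torch.optim.SGD(encoder.parameters(), lr=enc_lr, momentum=0.9)
    opt_dec = torch.optim.SGD(decoder.parameters(), lr=dec_lr, momentum=0.9)
    for epoch in range(epochs_per_stage):
        pass  # run one training epoch with opt_enc / opt_dec here
    enc_lr /= 2.0  # halve both rates before re-creating the optimisers
    dec_lr /= 2.0
```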
To start training on NYU:

1. Run `python src/setup.py build_ext --build-lib=./src/` to build the extension modules;
2. Adjust `src/config.py` or `train/nyu.sh` as needed;
3. Run `./train/nyu.sh`.
On a single 1080Ti, the training takes around 3-6 hours (from ResNet-50 to ResNet-152, respectively).

If you want to train the networks using your own dataset, you would need to modify the following:

- `TRAIN_DIR` and `VAL_DIR` in `src/config.py` can be used to prepend the relative paths to your data. It is up to you to decide how to encode the segmentation masks; in the NYU example, the masks are encoded without a colourmap, i.e., with a single digit (label) per 2-D location (see the sketch after this list);
- `src/datasets.py`: in particular, pay attention to how the images and masks are being read from the files;
- `src/config.py`: adapt it for your needs, and do not forget to change the number of classes (`NUM_CLASSES`);
- your own training script, using `train/nyu.sh` as an example.
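To illustrate the mask encoding used in the NYU example, here is a minimal sketch of reading such a single-channel label mask; the file path is a hypothetical placeholder.

```python
# Read a label mask stored without a colourmap: each pixel holds a single
# class index. The path below is a hypothetical placeholder.
import numpy as np
from PIL import Image

mask = np.array(Image.open("masks/000001.png"))
print(mask.shape, mask.dtype)  # (H, W), an integer type
print(np.unique(mask))         # class indices, expected in [0, NUM_CLASSES)
```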
Once time permits, more things will be added to this repository:

- evaluation scripts (`src/train.py` provides the flag `--evaluate`).

For academic usage, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial usage, please contact the authors.