This is the code for the NeurIPS 2019 paper Region Mutual Information Loss for Semantic Segmentation.
MIT License
Region Mutual Information Loss for Semantic Segmentation

This paper proposes a region mutual information (RMI) loss to model the dependencies among pixels. RMI represents each pixel by the pixel itself together with its neighboring pixels. Each pixel in an image then yields a multi-dimensional point that encodes the relationship between pixels, and the image is cast into a multi-dimensional distribution of these points. The prediction and ground truth can thus achieve high-order consistency by maximizing the mutual information (MI) between their multi-dimensional distributions.
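
To make the "one pixel plus its neighbors" representation concrete, here is a minimal NumPy sketch. It is not the implementation in this repository: the function name `region_points`, the `radius` parameter, and the choice to drop border pixels instead of padding are all assumptions made for illustration.

```python
import numpy as np


def region_points(image, radius=1):
    """Turn a 2-D map into one multi-dimensional point per pixel.

    Hypothetical helper for illustration only. Each point stacks a pixel
    with its neighbors inside a (2*radius+1) x (2*radius+1) window.
    """
    h, w = image.shape
    size = 2 * radius + 1
    # Assumption: keep only pixels whose full window lies inside the image.
    out_h, out_w = h - 2 * radius, w - 2 * radius
    # Each shifted view holds one window position for every kept pixel.
    shifted = [
        image[dy:dy + out_h, dx:dx + out_w]
        for dy in range(size)
        for dx in range(size)
    ]
    # Result shape: (number of kept pixels, size * size).
    return np.stack(shifted, axis=-1).reshape(-1, size * size)


# Toy 5x5 "prediction" map with values 0..24.
probs = np.arange(25, dtype=np.float64).reshape(5, 5)
points = region_points(probs)
print(points.shape)  # (9, 9): 9 interior pixels, each a 9-dimensional point
print(points[0])     # the 3x3 window around pixel (1, 1), flattened
```

Applying the same transformation to the prediction and to the ground truth gives two sets of points whose distributions can then be compared, which is the setting the RMI loss works in.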


Features and TODO

We are open to pull requests.


Install dependencies

Please install PyTorch 1.1.0 and Python 3.6.5. We highly recommend using our prebuilt PyTorch Docker image, zhaosssss/torch_lab.

docker pull zhaosssss/torch_lab:1.1.0

If you have not installed docker, see https://docs.docker.com/.

After you install Docker and pull our image, you can cd to the script directory and run


to create a running docker container.

If you do not want to use docker, try

pip install -r requirements.txt

However, this is not recommended.

Prepare data

Generally, directories are organized as follows:

|--dataset (save the dataset) 
|--models  (save the output checkpoints)
|--github  (save the code)
|--|--RMI  (the RMI code repository)
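
As a rough illustration of this layout, the following sketch creates the same directory skeleton under a temporary root. The helper name `make_layout` and the use of a temporary directory are assumptions; in practice the `github/RMI` directory would come from cloning the repository, not from creating an empty folder.

```python
from pathlib import Path
import tempfile


def make_layout(root):
    """Create the dataset/models/github/RMI skeleton under root (hypothetical helper)."""
    root = Path(root)
    for sub in ("dataset", "models", "github/RMI"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    # Return the created directories relative to root, for inspection.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    layout = make_layout(tmp)
    print(layout)  # ['dataset', 'github', 'github/RMI', 'models']
```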

The CamVid dataset can be downloaded from SegNet-Tutorial. This is a processed version of the original CamVid dataset.


See script/train.sh for detailed information. Before you start training, you should specify some variables in script/train.sh.

You can find more information about the arguments of the code in parser_params.py.

python parser_params.py --help

After you set all the arguments properly, you can simply cd to RMI/script and run


to start training.

To monitor the training process with TensorBoard, run

tensorboard --logdir=your_logdir --port=your_port


Training a DeepLabv3 model with output_stride=16, crop_size=513, and batch_size=16 requires 4 GTX 1080 GPUs (8 GB), 2 GTX TITAN X GPUs (12 GB), or 1 TITAN RTX GPU (24 GB).
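
The hardware options above can be read as different ways of splitting the same batch. The short sketch below assumes the batch is divided evenly across GPUs (plain data parallelism); that assumption, and the helper name `per_gpu_batch`, are illustrative and not taken from the training script.

```python
def per_gpu_batch(batch_size, num_gpus):
    """Images each GPU processes per step, assuming an even data-parallel split."""
    if batch_size % num_gpus != 0:
        raise ValueError("batch_size must be divisible by num_gpus")
    return batch_size // num_gpus


# GPU counts from the hardware options listed above; batch_size=16 in all cases.
options = {"GTX 1080 (8 GB)": 4, "GTX TITAN X (12 GB)": 2, "TITAN RTX (24 GB)": 1}
split = {name: per_gpu_batch(16, n) for name, n in options.items()}
for name, size in split.items():
    print(f"{name}: {size} images per GPU")
```

Under this assumption, an 8 GB card holds 4 crops of size 513, a 12 GB card holds 8, and a 24 GB card holds all 16.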

Evaluation and Inference

See script/eval.sh and script/inference.sh for detailed information.

You should also specify some variables in the scripts.

Selected qualitative results on the PASCAL VOC 2012 val set. The segmentation results of DeepLabv3+&RMI show richer details than those of DeepLabv3+&CE, e.g., small bumps on the airplane wing, branches of plants, and limbs of cows and sheep.


If our paper and code are beneficial to your work, please cite:

@inproceedings{zhao2019rmi,
  author    = {Shuai Zhao and
               Yang Wang and
               Zheng Yang and
               Deng Cai},
  title     = {Region Mutual Information Loss for Semantic Segmentation},
  booktitle = {NeurIPS},
  year      = {2019},
}

If other related work in our code or paper also helps you, please cite the corresponding papers.

