Cross-Image Relational Knowledge Distillation for Semantic Segmentation

This repository contains the source code of CIRKD (Cross-Image Relational Knowledge Distillation for Semantic Segmentation), along with training and evaluation code for semantic segmentation on several datasets.

Requirements

Ubuntu 18.04 LTS

Python 3.8 (Anaconda is recommended)

CUDA 11.1

PyTorch 1.8.0

NCCL for CUDA 11.1

Install Python packages:

pip install timm==0.3.2
pip install mmcv-full==1.2.7
pip install opencv-python==4.5.1.48
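Since the packages above are pinned to exact versions, it can help to confirm that the active environment matches them before training. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, and it only relies on the standard-library `importlib.metadata` module available in Python 3.8.

```python
from importlib import metadata

# Version pins copied from the install commands above.
PINNED = {
    "timm": "0.3.2",
    "mmcv-full": "1.2.7",
    "opencv-python": "4.5.1.48",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatched or missing package.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at the expected version.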

Backbones pretrained on ImageNet:

CNN	Transformer
resnet101-imagenet.pth	mit_b0.pth
resnet18-imagenet.pth	mit_b1.pth
mobilenetv2-imagenet.pth	mit_b4.pth

Supported datasets:

Dataset	Train Size	Val Size	Test Size	Classes
Cityscapes	2975	500	1525	19
Pascal VOC Aug	10582	1449	--	21
CamVid	367	101	233	11
ADE20K	20210	2000	--	150
COCO-Stuff-164K	118287	5000	--	182

Performance on Cityscapes

All models are trained on 8 NVIDIA GeForce RTX 3090 GPUs.

Role	Network	Method	Val mIoU	Test mIoU	Pretrained	train script
Teacher	DeepLabV3-ResNet101	-	78.07	77.46	Google Drive	sh
Student	DeepLabV3-ResNet18	Baseline	74.21	73.45	-	sh
Student	DeepLabV3-ResNet18	CIRKD	76.38	75.05	Google Drive	sh
Student	DeepLabV3-MobileNetV2	Baseline	73.12	72.36	-	sh
Student	DeepLabV3-MobileNetV2	CIRKD	75.42	74.03	Google Drive	sh
Student	PSPNet-ResNet18	Baseline	72.55	72.29	-	sh
Student	PSPNet-ResNet18	CIRKD	74.73	74.05	Google Drive	sh
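The tables in this README report mean Intersection over Union (mIoU). For readers unfamiliar with the metric, here is a minimal NumPy sketch of the usual confusion-matrix formulation. It is an illustration, not the repository's evaluation code: the function names are hypothetical, and the ignore label of 255 is an assumption based on the common Cityscapes convention.

```python
import numpy as np


def confusion_matrix(pred, label, num_classes, ignore_index=255):
    """Accumulate a (num_classes, num_classes) matrix; rows are ground truth."""
    mask = label != ignore_index  # drop pixels without a valid label
    idx = num_classes * label[mask].astype(np.int64) + pred[mask].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average IoU over classes that appear in the prediction or the ground truth."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0  # skip classes absent from both pred and label
    return float((tp[valid] / union[valid]).mean())
```

In practice the confusion matrix is summed over the whole validation set before `mean_iou` is applied, so the score is computed per dataset rather than averaged per image.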

Performance of Segmentation KD methods on Cityscapes

Method	Val mIoU	Val mIoU	Val mIoU	train script
Teacher	DeepLabV3-ResNet101	DeepLabV3-ResNet101	SegFormer-MiT-B4
Baseline	78.07	78.07	81.23 [pretrained]	sh
Student	DeepLabV3-ResNet18	DeepLabV3-MobileNetV2	SegFormer-MiT-B0
Baseline	74.21	73.12	75.58 [pretrained]	sh
SKD [3]	75.42	73.82	76.43 [pretrained]	sh
IFVD [4]	75.59	73.50	76.30 [pretrained]	sh
CWD [5]	75.55	74.66	74.80 [pretrained]	sh
DSD [6]	74.81	74.11	76.62 [pretrained]	sh
CIRKD [7]	76.38	75.42	76.92 [pretrained]	sh

The references are listed in references.md.

Evaluate pre-trained models on the Cityscapes test set

Run test_cityscapes.sh, then zip the resulting images and submit the archive to the Cityscapes test server.
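The zipping step can be scripted. The sketch below assumes the predictions are written as PNG files under a single output directory. The helper name `zip_predictions` and the directory and archive names in the usage line are hypothetical, since this README does not state where test_cityscapes.sh saves its results.

```python
import zipfile
from pathlib import Path


def zip_predictions(result_dir, archive_path):
    """Pack every PNG under result_dir into a zip archive; return the file count."""
    result_dir = Path(result_dir)
    files = sorted(result_dir.rglob("*.png"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store paths relative to result_dir so the archive has no absolute paths.
            zf.write(f, arcname=f.relative_to(result_dir).as_posix())
    return len(files)
```

For example, `zip_predictions("cityscapes_test_results", "submission.zip")` would collect the PNGs from a hypothetical `cityscapes_test_results` directory into `submission.zip`.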

Note: The current code has been reorganized and has not been tested thoroughly. If you have any questions, please do not hesitate to contact us.

Performance of Segmentation KD methods on Pascal VOC

The Pascal VOC dataset for segmentation is available at Baidu Drive

Role	Network	Method	Val mIoU	train script	Pretrained
Teacher	DeepLabV3-ResNet101	-	77.67	sh	Google Drive
Student	DeepLabV3-ResNet18	Baseline	73.21	sh
Student	DeepLabV3-ResNet18	CIRKD	74.50	sh
Student	PSPNet-ResNet18	Baseline	73.33	sh
Student	PSPNet-ResNet18	CIRKD	74.78	sh

Performance of Segmentation KD methods on CamVid

The CamVid dataset for segmentation is available at Baidu Drive

Role	Network	Method	Val mIoU	train script	Pretrained
Teacher	DeepLabV3-ResNet101	-	69.84	sh	Google Drive
Student	DeepLabV3-ResNet18	Baseline	66.92	sh
Student	DeepLabV3-ResNet18	CIRKD	68.21	sh
Student	PSPNet-ResNet18	Baseline	66.73	sh
Student	PSPNet-ResNet18	CIRKD	68.65	sh

Performance of Segmentation KD methods on ADE20K

The ADE20K dataset for segmentation is available at Google Drive

Role	Network	Method	Val mIoU	train script	Pretrained
Teacher	DeepLabV3-ResNet101	-	42.70	sh	Google Drive
Student	DeepLabV3-ResNet18	Baseline	33.91	sh
Student	DeepLabV3-ResNet18	CIRKD	35.41	sh

Performance of Segmentation KD methods on COCO-Stuff-164K

Role	Network	Method	Val mIoU	train script	Pretrained
Teacher	DeepLabV3-ResNet101	-	38.71	sh	Google Drive
Student	DeepLabV3-ResNet18	Baseline	32.60	sh
Student	DeepLabV3-ResNet18	CIRKD	33.11	sh

Visualization of segmentation masks using pretrained models

Dataset	Color Palette	Blend	Scripts
Pascal VOC			sh
Cityscapes			sh
ADE20K			sh
COCO-Stuff-164K			sh
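The two visualization modes in the table can be illustrated with a small NumPy sketch: "Color Palette" maps each predicted class id to a fixed color, and "Blend" overlays that color mask on the input image. This is an illustrative sketch rather than the repository's visualization scripts. The function names, the three-color toy palette, and the 0.5 blending weight are assumptions.

```python
import numpy as np


def colorize(mask, palette):
    """Map an (H, W) array of class ids to an (H, W, 3) uint8 color image."""
    return palette[mask]


def blend(image, color_mask, alpha=0.5):
    """Weighted overlay: alpha * image + (1 - alpha) * color_mask."""
    mixed = alpha * image.astype(np.float32) + (1.0 - alpha) * color_mask.astype(np.float32)
    return np.clip(mixed, 0, 255).astype(np.uint8)


# Toy palette for three hypothetical classes (one RGB row per class id).
TOY_PALETTE = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0]], dtype=np.uint8)
```

A real palette would have one row per dataset class, for example 19 rows for Cityscapes.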

Citation

We would appreciate it if you could give this repo a star or cite our paper!

@inproceedings{yang2022cross,
  title={Cross-image relational knowledge distillation for semantic segmentation},
  author={Yang, Chuanguang and Zhou, Helong and An, Zhulin and Jiang, Xue and Xu, Yongjun and Zhang, Qian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12319--12328},
  year={2022}
}
