This repository contains the source code of CIRKD (Cross-Image Relational Knowledge Distillation for Semantic Segmentation) and semantic segmentation implementations for several public datasets.
Ubuntu 18.04 LTS
Python 3.8 (Anaconda is recommended)
CUDA 11.1
PyTorch 1.8.0
NCCL for CUDA 11.1
Install the required Python packages:
pip install timm==0.3.2
pip install mmcv-full==1.2.7
pip install opencv-python==4.5.1.48
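After installing, it can help to confirm that the pinned versions are the ones actually present in the active environment. The following is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `check_pins` is hypothetical and is not part of this repository.

```python
from importlib import metadata

# Version pins copied from the install commands above.
PINNED = {
    "timm": "0.3.2",
    "mmcv-full": "1.2.7",
    "opencv-python": "4.5.1.48",
}


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is installed at the expected version.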
Backbones pretrained on ImageNet:
CNN | Transformer |
---|---|
resnet101-imagenet.pth | mit_b0.pth |
resnet18-imagenet.pth | mit_b1.pth |
mobilenetv2-imagenet.pth | mit_b4.pth |
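Before launching training, a quick check that the backbone files are in place can save a failed run. This sketch only tests for the file names listed in the table; the directory name `pretrained_backbones` and the helper `missing_backbones` are assumptions for illustration, so adjust the path to wherever the training scripts expect the weights.

```python
from pathlib import Path

# File names taken from the table above.
BACKBONES = [
    "resnet101-imagenet.pth",
    "resnet18-imagenet.pth",
    "mobilenetv2-imagenet.pth",
    "mit_b0.pth",
    "mit_b1.pth",
    "mit_b4.pth",
]


def missing_backbones(weights_dir, names=BACKBONES):
    """Return the listed file names that are not present in weights_dir."""
    root = Path(weights_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # "pretrained_backbones" is a hypothetical directory name.
    for name in missing_backbones("pretrained_backbones"):
        print(f"missing: {name}")
```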
Supported datasets:
Dataset | Train Size | Val Size | Test Size | Classes |
---|---|---|---|---|
Cityscapes | 2975 | 500 | 1525 | 19 |
Pascal VOC Aug | 10582 | 1449 | -- | 21 |
CamVid | 367 | 101 | 233 | 11 |
ADE20K | 20210 | 2000 | -- | 150 |
COCO-Stuff-164K | 118287 | 5000 | -- | 182 |
All models are trained on 8 NVIDIA GeForce RTX 3090 GPUs.
Role | Network | Method | Val mIoU | Test mIoU | Pretrained | train script |
---|---|---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 78.07 | 77.46 | Google Drive | sh |
Student | DeepLabV3-ResNet18 | Baseline | 74.21 | 73.45 | - | sh |
Student | DeepLabV3-ResNet18 | CIRKD | 76.38 | 75.05 | Google Drive | sh |
Student | DeepLabV3-MobileNetV2 | Baseline | 73.12 | 72.36 | - | sh |
Student | DeepLabV3-MobileNetV2 | CIRKD | 75.42 | 74.03 | Google Drive | sh |
Student | PSPNet-ResNet18 | Baseline | 72.55 | 72.29 | - | sh |
Student | PSPNet-ResNet18 | CIRKD | 74.73 | 74.05 | Google Drive | sh |
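The tables report mIoU (mean intersection over union) as a percentage. As a reminder of what the metric measures, here is a minimal sketch that computes it from a confusion matrix in plain Python. It follows the standard definition and is not the evaluation code of this repository, which may differ in details such as ignored labels or how counts are accumulated across images.

```python
def mean_iou(confusion):
    """Mean IoU (in percent) from a square confusion matrix.

    confusion[i][j] counts pixels whose ground-truth class is i and whose
    predicted class is j. Classes that occur in neither the ground truth
    nor the prediction are skipped.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # ground truth c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Toy two-class example: IoU is 8/11 for class 0 and 9/12 for class 1.
toy_confusion = [[8, 2], [1, 9]]
print(round(mean_iou(toy_confusion), 2))  # 73.86
```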
Method | Val mIoU | Val mIoU | Val mIoU | train script |
---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | DeepLabV3-ResNet101 | SegFormer-MiT-B4 | |
Baseline | 78.07 | 78.07 | 81.23 [pretrained] | sh |
Student | DeepLabV3-ResNet18 | DeepLabV3-MobileNetV2 | SegFormer-MiT-B0 | |
Baseline | 74.21 | 73.12 | 75.58 [pretrained] | sh |
SKD [3] | 75.42 | 73.82 | 76.43 [pretrained] | sh |
IFVD [4] | 75.59 | 73.50 | 76.30 [pretrained] | sh |
CWD [5] | 75.55 | 74.66 | 74.80 [pretrained] | sh |
DSD [6] | 74.81 | 74.11 | 76.62 [pretrained] | sh |
CIRKD [7] | 76.38 | 75.42 | 76.92 [pretrained] | sh |
The references are listed in references.md.
Run test_cityscapes.sh, then zip the resulting images and submit the archive to the Cityscapes test server.
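The zipping step can be scripted. This sketch assumes the predictions are written as `.png` files under a single output directory; the directory and archive names in the usage line are hypothetical, and the helper `zip_predictions` is not part of this repository. Check the test server's submission rules for the exact file format it expects.

```python
import zipfile
from pathlib import Path


def zip_predictions(pred_dir, zip_path):
    """Write every .png under pred_dir into zip_path; return the file count."""
    pred_dir = Path(pred_dir)
    files = sorted(pred_dir.rglob("*.png"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store paths relative to pred_dir so the archive carries no
            # machine-specific prefix.
            archive.write(path, path.relative_to(pred_dir).as_posix())
    return len(files)
```

For example, `zip_predictions("cityscapes_test_results", "submission.zip")` would archive a hypothetical `cityscapes_test_results` directory.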
Note: The current code has been reorganized and has not been tested thoroughly. If you have any questions, please do not hesitate to contact us.
The Pascal VOC dataset for segmentation is available at Baidu Drive
Role | Network | Method | Val mIoU | train script | Pretrained |
---|---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 77.67 | sh | Google Drive |
Student | DeepLabV3-ResNet18 | Baseline | 73.21 | sh | |
Student | DeepLabV3-ResNet18 | CIRKD | 74.50 | sh | |
Student | PSPNet-ResNet18 | Baseline | 73.33 | sh | |
Student | PSPNet-ResNet18 | CIRKD | 74.78 | sh | |
The CamVid dataset for segmentation is available at Baidu Drive
Role | Network | Method | Val mIoU | train script | Pretrained |
---|---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 69.84 | sh | Google Drive |
Student | DeepLabV3-ResNet18 | Baseline | 66.92 | sh | |
Student | DeepLabV3-ResNet18 | CIRKD | 68.21 | sh | |
Student | PSPNet-ResNet18 | Baseline | 66.73 | sh | |
Student | PSPNet-ResNet18 | CIRKD | 68.65 | sh | |
The ADE20K dataset for segmentation is available at Google Drive
Role | Network | Method | Val mIoU | train script | Pretrained |
---|---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 42.70 | sh | Google Drive |
Student | DeepLabV3-ResNet18 | Baseline | 33.91 | sh | |
Student | DeepLabV3-ResNet18 | CIRKD | 35.41 | sh | |
Results on COCO-Stuff-164K:
Role | Network | Method | Val mIoU | train script | Pretrained |
---|---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 38.71 | sh | Google Drive |
Student | DeepLabV3-ResNet18 | Baseline | 32.60 | sh | |
Student | DeepLabV3-ResNet18 | CIRKD | 33.11 | sh | |
Dataset | Color Palette | Blend | Scripts |
---|---|---|---|
Pascal VOC | | | sh |
Cityscapes | | | sh |
ADE20K | | | sh |
COCO-Stuff-164K | | | sh |
We would appreciate it if you could give this repo a star or cite our paper!
@inproceedings{yang2022cross,
title={Cross-image relational knowledge distillation for semantic segmentation},
author={Yang, Chuanguang and Zhou, Helong and An, Zhulin and Jiang, Xue and Xu, Yongjun and Zhang, Qian},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={12319--12328},
year={2022}
}