winycg / CIRKD

[CVPR-2022] Official implementation of CIRKD: Cross-Image Relational Knowledge Distillation for Semantic Segmentation, with implementations on Cityscapes, ADE20K, COCO-Stuff, Pascal VOC and CamVid.
Cross-Image Relational Knowledge Distillation for Semantic Segmentation

This repository contains the source code of CIRKD (Cross-Image Relational Knowledge Distillation for Semantic Segmentation) and implementations of semantic segmentation on Cityscapes, Pascal VOC, CamVid, ADE20K and COCO-Stuff-164K.

Requirements

Ubuntu 18.04 LTS

Python 3.8 (Anaconda is recommended)

CUDA 11.1

PyTorch 1.8.0

NCCL for CUDA 11.1

Install Python packages:

pip install timm==0.3.2
pip install mmcv-full==1.2.7
pip install opencv-python==4.5.1.48
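Because the packages above are pinned to exact versions, it can help to confirm that the active environment matches before training. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` is hypothetical and not part of this repository.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINNED = {
    "timm": "0.3.2",
    "mmcv-full": "1.2.7",
    "opencv-python": "4.5.1.48",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for packages that are missing or off-pin."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means all three packages match their pins. The script does not check PyTorch, CUDA or NCCL.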

Backbones pretrained on ImageNet:

| CNN | Transformer |
| --- | --- |
| resnet101-imagenet.pth | mit_b0.pth |
| resnet18-imagenet.pth | mit_b1.pth |
| mobilenetv2-imagenet.pth | mit_b4.pth |

Supported datasets:

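| Dataset | Train Size | Val Size | Test Size | Classes |
| --- | --- | --- | --- | --- |
| Cityscapes | 2975 | 500 | 1525 | 19 |
| Pascal VOC Aug | 10582 | 1449 | -- | 21 |
| CamVid | 367 | 101 | 233 | 11 |
| ADE20K | 20210 | 2000 | -- | 150 |
| COCO-Stuff-164K | 118287 | 5000 | -- | 182 |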

Performance on Cityscapes

All models are trained on 8 NVIDIA GeForce RTX 3090 GPUs.

| Role | Network | Method | Val mIoU | Test mIoU | Pretrained | Train script |
| --- | --- | --- | --- | --- | --- | --- |
| Teacher | DeepLabV3-ResNet101 | - | 78.07 | 77.46 | Google Drive | sh |
| Student | DeepLabV3-ResNet18 | Baseline | 74.21 | 73.45 | - | sh |
| Student | DeepLabV3-ResNet18 | CIRKD | 76.38 | 75.05 | Google Drive | sh |
| Student | DeepLabV3-MobileNetV2 | Baseline | 73.12 | 72.36 | - | sh |
| Student | DeepLabV3-MobileNetV2 | CIRKD | 75.42 | 74.03 | Google Drive | sh |
| Student | PSPNet-ResNet18 | Baseline | 72.55 | 72.29 | - | sh |
| Student | PSPNet-ResNet18 | CIRKD | 74.73 | 74.05 | Google Drive | sh |
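The Val mIoU and Test mIoU columns report mean intersection-over-union. As a reminder of how that metric is usually computed, here is a minimal pure-Python sketch of the standard definition. It is not this repository's evaluation code. The `ignore_index=255` default and the choice to skip classes that never appear are assumptions.

```python
def confusion_matrix(pred, label, num_classes, ignore_index=255):
    """Count pixels by (ground-truth class, predicted class) over flat lists."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, label):
        if t == ignore_index:
            continue  # unlabeled pixels are excluded (assumed convention)
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                         # class c pixels predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp    # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: two classes, five pixels, the last one unlabeled.
label = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
print(mean_iou(confusion_matrix(pred, label, num_classes=2)))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. The tables report this quantity as a percentage.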

Performance of Segmentation KD methods on Cityscapes

| Method | Val mIoU | Val mIoU | Val mIoU | Train script |
| --- | --- | --- | --- | --- |
| Teacher | DeepLabV3-ResNet101 | DeepLabV3-ResNet101 | SegFormer-MiT-B4 | |
| Baseline | 78.07 | 78.07 | 81.23 [pretrained] | sh |
| Student | DeepLabV3-ResNet18 | DeepLabV3-MobileNetV2 | SegFormer-MiT-B0 | |
| Baseline | 74.21 | 73.12 | 75.58 [pretrained] | sh |
| SKD [3] | 75.42 | 73.82 | 76.43 [pretrained] | sh |
| IFVD [4] | 75.59 | 73.50 | 76.30 [pretrained] | sh |
| CWD [5] | 75.55 | 74.66 | 74.80 [pretrained] | sh |
| DSD [6] | 74.81 | 74.11 | 76.62 [pretrained] | sh |
| CIRKD [7] | 76.38 | 75.42 | 76.92 [pretrained] | sh |

The references are shown in references.md

Evaluate pre-trained models on the Cityscapes test set

Run test_cityscapes.sh, then zip the resulting images and submit the archive to the Cityscapes test server.
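The zip step can be scripted. This is a minimal sketch, assuming the test script wrote PNG predictions into a single directory. The function name, the directory name `cityscapes_test_results` and the archive name are hypothetical. The archive layout the server expects is not stated here, so check the server's instructions before submitting.

```python
import zipfile
from pathlib import Path


def zip_predictions(pred_dir, archive_path):
    """Write every .png under pred_dir into a zip archive and return the file count."""
    pred_dir = Path(pred_dir)
    files = sorted(pred_dir.rglob("*.png"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store paths relative to pred_dir so the archive carries no local prefix.
            archive.write(path, arcname=path.relative_to(pred_dir).as_posix())
    return len(files)


# Hypothetical usage (both names are placeholders):
# zip_predictions("cityscapes_test_results", "submission.zip")
```

Keep the archive outside the prediction directory so that a rerun does not pick up stale output.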

Note: The current code has been reorganized and has not been tested thoroughly. If you have any questions, please do not hesitate to contact us.

Performance of Segmentation KD methods on Pascal VOC

The Pascal VOC dataset for segmentation is available at Baidu Drive

| Role | Network | Method | Val mIoU | Train script | Pretrained |
| --- | --- | --- | --- | --- | --- |
| Teacher | DeepLabV3-ResNet101 | - | 77.67 | sh | Google Drive |
| Student | DeepLabV3-ResNet18 | Baseline | 73.21 | sh | |
| Student | DeepLabV3-ResNet18 | CIRKD | 74.50 | sh | |
| Student | PSPNet-ResNet18 | Baseline | 73.33 | sh | |
| Student | PSPNet-ResNet18 | CIRKD | 74.78 | sh | |

Performance of Segmentation KD methods on CamVid

The CamVid dataset for segmentation is available at Baidu Drive

| Role | Network | Method | Val mIoU | Train script | Pretrained |
| --- | --- | --- | --- | --- | --- |
| Teacher | DeepLabV3-ResNet101 | - | 69.84 | sh | Google Drive |
| Student | DeepLabV3-ResNet18 | Baseline | 66.92 | sh | |
| Student | DeepLabV3-ResNet18 | CIRKD | 68.21 | sh | |
| Student | PSPNet-ResNet18 | Baseline | 66.73 | sh | |
| Student | PSPNet-ResNet18 | CIRKD | 68.65 | sh | |

Performance of Segmentation KD methods on ADE20K

The ADE20K dataset for segmentation is available at Google Drive

| Role | Network | Method | Val mIoU | Train script | Pretrained |
| --- | --- | --- | --- | --- | --- |
| Teacher | DeepLabV3-ResNet101 | - | 42.70 | sh | Google Drive |
| Student | DeepLabV3-ResNet18 | Baseline | 33.91 | sh | |
| Student | DeepLabV3-ResNet18 | CIRKD | 35.41 | sh | |

Performance of Segmentation KD methods on COCO-Stuff-164K

| Role | Network | Method | Val mIoU | Train script | Pretrained |
| --- | --- | --- | --- | --- | --- |
| Teacher | DeepLabV3-ResNet101 | - | 38.71 | sh | Google Drive |
| Student | DeepLabV3-ResNet18 | Baseline | 32.60 | sh | |
| Student | DeepLabV3-ResNet18 | CIRKD | 33.11 | sh | |

Visualization of segmentation mask using pretrained models

| Dataset | Color Palette | Blend | Scripts |
| --- | --- | --- | --- |
| Pascal VOC | top1 | top1 | sh |
| Cityscapes | top1 | top1 | sh |
| ADE20K | top1 | top1 | sh |
| COCO-Stuff-164K | top1 | top1 | sh |
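The two visualization styles can be illustrated with a small pure-Python sketch: a color-palette image maps each predicted class index to an RGB color, and a blend overlays that colored mask on the input image. This is not the repository's visualization script. The function names, the two-color palette and the `alpha=0.5` default are assumptions chosen for illustration.

```python
def colorize(mask, palette):
    """Map each class index in a 2-D mask to its RGB color from the palette."""
    return [[palette[c] for c in row] for row in mask]


def blend(image, color_mask, alpha=0.5):
    """Alpha-blend an RGB image with a colorized mask, pixel by pixel."""
    return [
        [
            tuple(round((1 - alpha) * i + alpha * m) for i, m in zip(img_px, mask_px))
            for img_px, mask_px in zip(img_row, mask_row)
        ]
        for img_row, mask_row in zip(image, color_mask)
    ]


# Toy 1x2 example with a hypothetical two-class palette.
palette = [(0, 0, 0), (128, 64, 128)]
mask = [[0, 1]]
image = [[(100, 100, 100), (200, 200, 200)]]
colored = colorize(mask, palette)
print(colored)
print(blend(image, colored))
```

Real scripts would typically operate on image arrays rather than nested lists. The nested-list form is used here only to keep the sketch dependency-free.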

Citation

We would appreciate it if you could give this repo a star or cite our paper!

@inproceedings{yang2022cross,
  title={Cross-image relational knowledge distillation for semantic segmentation},
  author={Yang, Chuanguang and Zhou, Helong and An, Zhulin and Jiang, Xue and Xu, Yongjun and Zhang, Qian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12319--12328},
  year={2022}
}