Official code for the CVPR 2024 paper "Scale Decoupled Distillation".
Results are reported on CIFAR-100, ImageNet, and CUB-200 (see the paper for the full benchmark tables).
Environments:
Training on CIFAR-100
Fetch the pretrained teacher models by:
sh fetch_pretrained_teachers.sh
which downloads the models and saves them to save/models.
Run distillation with the commands in teacher_resnet32x4.sh, teacher_unpair.sh, teacher_vgg.sh, and teacher_wrn.sh. An example:
python train_origin.py --cfg configs/cifar100/sdd_dkd/res32x4_shuv1.yaml --gpu 1 --M [1,2,4]
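The --M flag sets the pyramid scales. Assuming each scale m pools the feature map into an m x m grid (as in the SPP sketch under Core code below), M = [1,2,4] yields 1 + 4 + 16 = 21 patches per image. A quick check, using a hypothetical helper that is not part of the repo:

def num_patches(M=(1, 2, 4)):
    # Each scale m contributes an m x m grid of pooled regions.
    return sum(m * m for m in M)

print(num_patches())  # 21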
Training on ImageNet
Train with DistributedDataParallel, setting --nproc_per_node to the number of available GPUs:
python -m torch.distributed.launch --nproc_per_node=2 train.py --cfg ./configs/imagenet/r50_mv2/sdd_dkd.yaml
Training on CUB-200
Core code
We modify the teacher and student networks by inserting a spatial pyramid pooling (SPP) module before the final classifier, so that the shared classifier also produces logits for multi-scale patches:
# Decouple the final feature map into multi-scale patches (scales given by M).
self.spp = SPP(M=M)

# x: [B, C, H, W] backbone feature map; x_spp: [B, C, P] pooled patch
# features; x_strength: a per-patch statistic returned by the SPP module.
x_spp, x_strength = self.spp(x)
# Fold the patch axis into the batch so the shared classifier scores
# every patch independently.
x_spp = x_spp.permute((2, 0, 1))  # [P, B, C]
m, b, c = x_spp.shape
x_spp = torch.reshape(x_spp, (m * b, c))
patch_score = self.fc(x_spp)  # [P*B, num_classes]
# Restore per-patch logits as [B, num_classes, P].
patch_score = torch.reshape(patch_score, (m, b, self.class_num))
patch_score = patch_score.permute((1, 2, 0))
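For orientation, here is a minimal sketch of what an SPP module with the interface above could look like, assuming it average-pools the feature map into an m x m grid for each scale m in M and concatenates all patches along a patch axis. The internals, in particular the strength statistic, are illustrative and not the repo's exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SPP(nn.Module):
    """Pool a feature map at several scales and stack all patches."""

    def __init__(self, M=(1, 2, 4)):
        super().__init__()
        self.M = M

    def forward(self, x):
        # x: [B, C, H, W]
        b, c = x.shape[0], x.shape[1]
        patches = []
        for m in self.M:
            # Average-pool into an m x m grid, then flatten to m*m patches.
            pooled = F.adaptive_avg_pool2d(x, m)         # [B, C, m, m]
            patches.append(pooled.reshape(b, c, m * m))  # [B, C, m*m]
        x_spp = torch.cat(patches, dim=2)                # [B, C, P], P = sum(m^2)
        # One plausible patch "strength": mean activation per patch
        # (illustrative; the repo may define this differently).
        x_strength = x_spp.mean(dim=1)                   # [B, P]
        return x_spp, x_strength

# Example: B=2 images, C=256 channels, an 8x8 map, M=[1, 2, 4] -> P = 21.
x = torch.randn(2, 256, 8, 8)
x_spp, x_strength = SPP(M=[1, 2, 4])(x)
print(x_spp.shape, x_strength.shape)  # torch.Size([2, 256, 21]) torch.Size([2, 21])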
Thanks to CRD and DKD. This library is built on the CRD and DKD codebases.