dvlab-research / ReviewKD

Distilling Knowledge via Knowledge Review, CVPR 2021

A new re-implementation for KnowledgeReview #13

Open zjykzj opened 2 years ago

zjykzj commented 2 years ago

@akuxcw @littletomatodonkey Nice work!!!

Based on this repo, I tried a new implementation, ZJCV/KnowledgeReview. Judging from the CIFAR-100 training results, KR works remarkably well; with ResNet50 as the student, the distilled results even exceed the teacher network:

| arch_s | Student Top-1 | Student Top-5 | arch_t | Teacher Top-1 | Teacher Top-5 | Dataset | lambda | KD Top-1 | KD Top-5 |
|--------|---------------|---------------|--------|---------------|---------------|---------|--------|----------|----------|
| MobileNetv2 | 80.620 | 95.820 | ResNet50 | 83.540 | 96.820 | CIFAR100 | 7.0 | 83.370 | 96.810 |
| MobileNetv2 | 80.620 | 95.820 | ResNet152 | 85.490 | 97.590 | CIFAR100 | 8.0 | 84.530 | 97.470 |
| MobileNetv2 | 80.620 | 95.820 | ResNeXt_32x8d | 85.720 | 97.650 | CIFAR100 | 6.0 | 84.520 | 97.470 |
| ResNet18 | 80.540 | 96.040 | ResNet50 | 83.540 | 96.820 | CIFAR100 | 10.0 | 83.130 | 96.350 |
| ResNet50 | 83.540 | 96.820 | ResNet152 | 85.490 | 97.590 | CIFAR100 | 6.0 | 86.240 | 97.610 |
| ResNet50 | 83.540 | 96.820 | ResNeXt_32x8d | 85.720 | 97.650 | CIFAR100 | 6.0 | 86.220 | 97.490 |
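
For context, the `lambda` column above weights the distillation term against the usual cross-entropy loss. A minimal sketch of that combination follows; plain per-stage MSE stands in for the paper's HCL loss, and the function and argument names are illustrative, not the exact API of either repo:

```python
import torch.nn.functional as F

def distill_loss(logits, targets, student_feats, teacher_feats, lam):
    """Cross-entropy on labels plus a lambda-weighted feature-review term."""
    ce = F.cross_entropy(logits, targets)
    # Match each (fused) student feature map to the corresponding teacher
    # feature map; MSE here is a stand-in for the paper's HCL loss.
    kd = sum(F.mse_loss(fs, ft) for fs, ft in zip(student_feats, teacher_feats))
    return ce + lam * kd
```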
zjykzj commented 2 years ago

By the way, the accuracy reported for CIFAR-100 in this paper's implementation seems too low. Did you refer to other repositories' implementations for comparison? :relaxed:

akuxcw commented 2 years ago

Thanks for your reimplementation!

> By the way, the accuracy reported for CIFAR-100 in this paper's implementation seems too low. Did you refer to other repositories' implementations for comparison? ☺️

Do you mean the baseline results are low? I know that if we changed the CIFAR-100 training policies we could achieve higher results, but many previous works on KD use the same policies as this implementation, and we follow them for a fair comparison.
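
For reference, that shared policy is roughly a long SGD run with late step decays; a sketch under assumed hyperparameters (the values below follow the recipe common in prior KD repos and may not match this implementation exactly):

```python
import torch
import torch.nn as nn

model = nn.Linear(3072, 100)  # placeholder for the student network

# SGD with momentum plus a step-decayed learning rate over 240 epochs;
# every value here is an assumption, not this repo's verified config.
optimizer = torch.optim.SGD(model.parameters(), lr=0.05,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 180, 210], gamma=0.1)

for epoch in range(240):
    # ... one training epoch over CIFAR-100 goes here ...
    scheduler.step()
```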

zjykzj commented 2 years ago

> Thanks for your reimplementation!
>
> Do you mean the baseline results are low? I know that if we changed the CIFAR-100 training policies we could achieve higher results, but many previous works on KD use the same policies as this implementation, and we follow them for a fair comparison.

Got it.