Open zjykzj opened 2 years ago
Thanks for your reimplementation!
By the way, the CIFAR-100 accuracy of this implementation seems too low. Did you compare it against the implementations in other repositories? ☺️
Do you mean the baseline results are low? I know that changing the CIFAR-100 training policy can give higher results. But many previous works on KD use the same policy as this implementation, and we follow them for a fair comparison.
got it
@akuxcw @littletomatodonkey Nice work !!!
Based on this repo, I tried a new implementation, ZJCV/KnowledgeReview. Judging from the CIFAR-100 training results, KR does perform very well. For ResNet-50, the distillation results even exceed those of the teacher network.