WangYZ1608 / Knowledge-Distillation-via-ND

The official implementation for paper: Improving Knowledge Distillation via Regularizing Feature Norm and Direction

Could you kindly provide the training log? #7

Closed. HanGuangXin closed this issue 1 year ago.

HanGuangXin commented 1 year ago

I am using KD to transfer knowledge from the ViT-Base model to ResNet18 on ImageNet, following the ImageNet/ViT/train.sh script. However, after several epochs of training, the top-1 accuracy has only reached 46.2% at epoch 8, and it is increasing slowly. This makes me concerned about the final accuracy of the model.
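For context, the KD objective referred to here is presumably the standard Hinton-style distillation loss: a KL divergence between the temperature-softened teacher and student distributions. Below is a minimal PyTorch sketch of that baseline loss; the function name and the temperature value are illustrative assumptions, not the repo's actual API.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """Vanilla KD loss (Hinton et al.): KL divergence between the
    softened teacher and student distributions, scaled by T^2 so the
    gradient magnitude is roughly temperature-independent."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```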

[Screenshot: training log showing per-epoch top-1 accuracy]

WangYZ1608 commented 1 year ago

This is the top-1 accuracy on the training set for DKD++, and your results look fine.

[Plot: DKD top-1 training accuracy curve]
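For readers landing on this thread: the repo's methods (KD++, DKD++, etc.) augment a base distiller with the paper's norm-and-direction regularization of student features. A rough, hypothetical sketch of that idea is below; this is not the repo's actual implementation, and the exact formulation, shapes, and weighting in the paper may differ.

```python
import torch
import torch.nn.functional as F

def nd_regularizer(student_feat, teacher_class_means, labels):
    """Hypothetical sketch of a norm-and-direction regularizer.

    student_feat: (B, D) penultimate-layer student features.
    teacher_class_means: (num_classes, D) per-class mean features
        precomputed from the frozen teacher (an assumption here).
    labels: (B,) ground-truth class indices.
    """
    target_means = teacher_class_means[labels]  # (B, D)
    # Direction: pull each student feature toward the direction of the
    # teacher's class-mean feature for its ground-truth class.
    direction_loss = 1.0 - F.cosine_similarity(student_feat, target_means, dim=1).mean()
    # Norm: encourage larger student feature norms (in practice this
    # term would be bounded or weighted; left raw here for brevity).
    norm_loss = -student_feat.norm(p=2, dim=1).mean()
    return direction_loss, norm_loss
```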