Closed liguopeng0923 closed 5 months ago
Similarly, when the teacher is convnext-tiny (89.96) and the student is swin-p with KD, my acc@1 is 74, which is lower than the 76.44 you reported.
Thank you very much!
Hi @Hao840,
I aligned my training settings and GPUs with yours, but the results are still off for DeiT-T. For example, my training loss is 6.x in the early stages, while yours is 4.x. Could you please check again?
When the teacher is convnext-tiny and the student is DeiT-T, my KD result on CIFAR-100 is 71.74, but yours is 72.99.
I think the main cause was the lr and batch size. After switching to a batch size of 64, I get normal results. This is in fact the same setting DeiT uses.
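For anyone hitting the same gap, here is a rough sketch of why batch size and lr interact. It assumes the training script scales the learning rate linearly with the total batch size, as the DeiT reference code does (`lr * batch_size * world_size / 512`). Whether this repo applies the same rule is an assumption, and the function name and the numbers below are only illustrative.

```python
def scaled_lr(base_lr: float, batch_size_per_gpu: int, num_gpus: int,
              reference_batch: int = 512) -> float:
    """Linear lr scaling: the effective lr grows with the total batch size.

    Assumption: mirrors the rule in the DeiT reference training script;
    this repo may or may not do the same.
    """
    total_batch = batch_size_per_gpu * num_gpus
    return base_lr * total_batch / reference_batch


if __name__ == "__main__":
    base_lr = 5e-4  # illustrative base lr, not taken from this thread
    for per_gpu in (64, 128, 256):
        lr = scaled_lr(base_lr, per_gpu, num_gpus=8)
        print(f"batch/gpu={per_gpu:4d}  effective lr={lr:.2e}")
```

Under this rule, changing the per-GPU batch size without adjusting the base lr changes the effective lr, which could explain a different starting loss and final accuracy.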
Hi @Hao840,
I tried to reproduce the KD results on CIFAR-100 (teacher convnext-tiny, 89.96), but they do not match your paper. For example, with deit-tiny as the student, my acc@1 is 70.78, which is lower than the reported 72.99. I suspect my training settings are incorrect for ViT-based students. Could you share the "args.yaml" and the log file?