megvii-research / mdistiller

The official implementation of [CVPR2022] Decoupled Knowledge Distillation https://arxiv.org/abs/2203.08679 and [ICCV2023] DOT: A Distillation-Oriented Trainer https://openaccess.thecvf.com/content/ICCV2023/papers/Zhao_DOT_A_Distillation-Oriented_Trainer_ICCV_2023_paper.pdf

On the issue of code reproduction #42

Open xiaohe725 opened 1 year ago

xiaohe725 commented 1 year ago

Hello, my results are around 0.2-0.7 percentage points lower than those reported in the paper on all models. What could be the reason? I trained on a single GPU. Thank you for your answer.

Zzzzz1 commented 1 year ago

See issue #7.

xiaohe725 commented 1 year ago

I have seen it, but it still does not solve my problem. Where else might the problem arise?

Zzzzz1 commented 1 year ago

> I have seen it, but it still does not solve my problem. Where else might the problem arise?

It seems the results on CIFAR-100 are unstable. Did you check the results on ImageNet (which are more stable)?
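
For reference, a minimal way to launch that check from Python; the config path follows the repository README and may need adjusting to your checkout:

```python
# Launch the ImageNet DKD run (teacher ResNet-34, student ResNet-18).
# Equivalent shell command from the README:
#   python3 tools/train.py --cfg configs/imagenet/r34_r18/dkd.yaml
import subprocess

subprocess.run(
    ["python3", "tools/train.py", "--cfg", "configs/imagenet/r34_r18/dkd.yaml"],
    check=True,  # raise if training exits with a non-zero status
)
```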

xiaohe725 commented 1 year ago

>> I have seen it, but it still does not solve my problem. Where else might the problem arise?
>
> It seems the results on CIFAR-100 are unstable. Did you check the results on ImageNet (which are more stable)?

At present I have not run that experiment; I mainly experiment on CIFAR. How many runs on CIFAR did you need to obtain the results in the paper?

Zzzzz1 commented 1 year ago

>>> I have seen it, but it still does not solve my problem. Where else might the problem arise?
>>
>> It seems the results on CIFAR-100 are unstable. Did you check the results on ImageNet (which are more stable)?
>
> At present I have not run that experiment; I mainly experiment on CIFAR. How many runs on CIFAR did you need to obtain the results in the paper?

We ran each experiment 5 times and report the average accuracy.
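
A minimal sketch of that protocol, assuming you record each run's best top-1 accuracy by hand (the values below are placeholders, not results from this repo):

```python
# Run the same CIFAR-100 config several times, e.g.
#   python3 tools/train.py --cfg configs/cifar100/dkd/res32x4_res8x4.yaml
# (config path per the repository README), record each run's best top-1
# accuracy, then compare the mean against the paper's reported number.
from statistics import mean, stdev

runs = [76.12, 76.45, 75.98, 76.30, 76.20]  # placeholder accuracies from 5 runs

print(f"mean: {mean(runs):.2f}, std: {stdev(runs):.2f}")
# A single run can easily land a few tenths of a point below the mean,
# which is consistent with the 0.2-0.7 gap described in this issue.
```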