Closed roymiles closed 6 months ago
Thanks @roymiles for the updates!
Did you use 3 GPUs for distributed training? If not, I will remove that section from the README later, when I make minor changes. https://github.com/yoshitomo-matsubara/torchdistill/pull/446/files#diff-1ecd33e0a6aeb10ddebfcdc6ed245a3e8ea60e38a09ed8974047a3101ec638aeR41-R53
Ah oops, I must have overlooked that. Yeah, I only used 1 GPU.
No problem, I will merge this PR and make some modifications. The next version of torchdistill will be released in a few days, and I will upload the checkpoint and log as part of the release note for backup.
Great job! Thanks for your contribution!
I have reproduced the results from the original paper: it reports an accuracy of 71.63%, while this config achieves 71.65%.
The log and checkpoint for this run can be found here: https://drive.google.com/drive/folders/18xl0CDZ6CioP4Sbjdpj1Pndp4biSLpnV?usp=sharing
Trained on a single GPU.