Reproducing Imagenet Knowledge Transfer Top-1 Accuracy

NVlabs / DeepInversion

Official PyTorch implementation of Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion (CVPR 2020)

Reproducing Imagenet Knowledge Transfer Top-1 Accuracy #11

Open amiller195 opened 3 years ago

amiller195 commented 3 years ago

Hi, very interesting work! According to Table 6 in the paper, training for 90 epochs on the 140k generated dataset should reach a top-1 accuracy of 68.0%. I'm trying to train ResNet-50 v1.5 on the 140k dataset, following the protocol at https://github.com/NVIDIA/DeepLearningExamples, but I can't get past 10% top-1 accuracy.

Can you please elaborate on the training process using the generated 140k images? What protocol or additional work was required to reach the mentioned accuracy?

Thanks!

hongxuyin commented 3 years ago

Use KL divergence instead of cross-entropy (CE), and rescale the KL divergence into a normal loss range. The distillation setup details are in Sec. 4.4.
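
For readers following along, here is a minimal, framework-free sketch of what a KL-based distillation loss can look like. It is not the authors' code: the function names are hypothetical, the temperature default of 3 is borrowed from a later comment in this thread, and multiplying by T^2 is only one common way to bring the softened KL term back to a typical loss magnitude. The exact rescaling used in the paper is not stated in this thread.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def kd_kl_loss(student_logits, teacher_logits, temperature=3.0):
    """KL(teacher || student) on softened distributions for one sample.

    The T^2 factor is an assumed rescaling (a common distillation
    convention), not something confirmed by the thread.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student (prediction)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


# Usage: the loss is zero when student and teacher logits agree,
# and positive otherwise.
matched = kd_kl_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = kd_kl_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In a real training loop this would be averaged over the batch and computed on tensors; only the teacher logits are needed as targets, so no ground-truth labels enter the loss.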

tronguyen commented 3 years ago

Hi, thank you for the great work!

Sorry, I have the same question as above and wonder whether it has been resolved.

I couldn't reproduce the ImageNet accuracy with the 140k images provided. Following Sec. 4.4 of the paper, I can only reach a little over 30% top-1 accuracy. My training setup: batch size 256, temperature 3, KL loss only (relying only on the teacher logits), 250 epochs, learning rate 1.0, and SGD with a learning-rate decay step every 80 epochs.
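
To make the schedule in this setup concrete, here is a small sketch of a step-decay learning rate using the values quoted above (base learning rate 1.0, a decay step every 80 epochs, 250 epochs). The decay factor `gamma=0.1` is an assumption, since the comment does not state it, and `step_decay_lr` is a hypothetical helper name.

```python
def step_decay_lr(epoch, base_lr=1.0, step_size=80, gamma=0.1):
    """Learning rate at a given epoch under step decay.

    base_lr and step_size come from the comment above;
    gamma is an assumed decay factor, not stated in the thread.
    """
    return base_lr * gamma ** (epoch // step_size)


# Usage: list the epochs at which the learning rate changes over 250 epochs.
total_epochs = 250
change_points = [
    (epoch, step_decay_lr(epoch))
    for epoch in range(total_epochs)
    if epoch % 80 == 0
]
```

With these assumed values the rate drops three times during the run, at epochs 80, 160, and 240.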

Many thanks!

CHENBIN99 commented 1 year ago

Same question.