juntang-zhuang / Adabelief-Optimizer

Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"
BSD 2-Clause "Simplified" License

Problem reproducing the ImageNet result #65

Open · KaltsitI opened this issue 1 year ago

KaltsitI commented 1 year ago

Recently I have been trying to reproduce the results in the paper. I succeeded on CIFAR-10 and the GAN experiment, but my test accuracy on ImageNet is about 69.5%, whereas the paper reports 70.08%. I wonder whether I used the wrong version of AdaBelief or whether the parameters in run.sh have been changed. Could you tell me which version of AdaBelief and which parameters I should use to reproduce the ImageNet result (PyTorch)? The version of AdaBelief I used is 0.2.0.

juntang-zhuang commented 1 year ago

Hi, the original code I used for ImageNet is in the example folder, but it should be the same as 0.2.0, except that 0.2.0 uses a default of eps=1e-16 while the example used 1e-8. I think eps is overridden to 1e-8 explicitly in the script (rather than relying on the default).
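For concreteness, a minimal sketch of what passing eps explicitly looks like with the adabelief-pytorch package; the lr, betas, and weight_decay values below are placeholders, and the authoritative hyperparameters are the ones in run.sh:

```python
import torchvision.models as models
from adabelief_pytorch import AdaBeliefOptimizer

model = models.resnet18()

# Pass eps explicitly instead of relying on the package default
# (1e-16 in adabelief-pytorch 0.2.0). lr/betas/weight_decay here are
# illustrative placeholders; use the values from run.sh when reproducing.
optimizer = AdaBeliefOptimizer(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    eps=1e-8,           # explicit override, as in the ImageNet example
    weight_decay=1e-2,  # placeholder; check run.sh
)
```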

KaltsitI commented 1 year ago

> Hi, the original code I used for ImageNet is in the example folder, but it should be the same as 0.2.0, except that 0.2.0 uses a default of eps=1e-16 while the example used 1e-8. I think eps is overridden to 1e-8 explicitly in the script (rather than relying on the default).

Thank you for your reply. I also think it is overridden to 1e-8. Do you think the PyTorch version could make a difference to the result? Mine is 1.11.0.
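As a quick sanity check that the override took effect (assuming the optimizer was built as in the sketch above), the standard PyTorch optimizer attributes expose the value actually in use:

```python
# Confirm which eps the optimizer is actually using: both the constructor
# defaults and the per-parameter-group settings should read 1e-8.
print(optimizer.defaults['eps'])
print(optimizer.param_groups[0]['eps'])
```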

juntang-zhuang commented 1 year ago

I'm not quite sure; the PyTorch version might cause a slight difference. BTW, I remember there are different versions of ImageNet: after 2018 or so, some of the data was re-labeled or replaced. Did you use the latest version with the corrections?

KaltsitI commented 1 year ago

I used the ILSVRC2012 edition of ImageNet.