htqin / IR-Net

[CVPR 2020] This project is the PyTorch implementation of our accepted CVPR 2020 paper: Forward and Backward Information Retention for Accurate Binary Neural Networks.

Training settings for VGG-Small differ from LQ-Nets, and cannot reach 90.4%? #8

Closed licj15 closed 4 years ago

licj15 commented 4 years ago

Hello everyone, and thanks to @htqin for the great work!

I noticed that the training setting for VGG-Small uses cosine decay with 300 max epochs, while the reference cited in the paper (LQ-Nets) uses step decay with 400 max epochs.
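For concreteness, here is a minimal PyTorch sketch of the two schedules under discussion; the base learning rate, momentum, and step milestones below are illustrative, not taken from either paper:

```python
# Minimal sketch of the two LR schedules; hyperparameter values are
# illustrative, not the settings from this repo or from LQ-Nets.
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 10)  # placeholder standing in for VGG-Small

# This repo's setting: cosine decay over 300 epochs.
opt_cos = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
sched_cos = optim.lr_scheduler.CosineAnnealingLR(opt_cos, T_max=300)

# LQ-Nets' setting: step decay over 400 epochs (milestones are hypothetical).
opt_step = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
sched_step = optim.lr_scheduler.MultiStepLR(
    opt_step, milestones=[200, 300, 350], gamma=0.1)

for epoch in range(300):
    # ... run one training epoch with opt_cos ...
    sched_cos.step()  # advance the cosine schedule once per epoch
```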

By running main.py, I can only get 87.80%, while the number reported in the paper is 90.40%. However, the full-precision version I reproduced with this training setting reaches 91.79%, versus the 91.70% reported in the paper. In other words, I can reproduce the reported full-precision baseline for VGG-Small using the training setting in this repo, but not the reported binarized IR-Net result. This is confusing to me, so I was wondering whether there is a mismatch in the training setting for VGG-Small on CIFAR-10. I would also be very glad to discuss the training settings you used when reproducing VGG-Small on CIFAR-10!

Thank you! Best regards,

licj15 commented 4 years ago

I was wondering if anyone could share the training hyperparameters they used to reproduce the numbers. I tried some other hyperparameter combinations and got the following results: tuning the max epochs to 400, 600, and 800, instead of the 300 in the paper, gives a best accuracy of 89.75% (at 800 epochs). Still not as good as the number reported in the paper. I would be glad to see more suggestions. Thank you!

htqin commented 4 years ago

(1) The training settings of existing methods are not identical, so we use the original results reported in their papers. All of our training settings are in the source code, which is linked in our paper. (2) We reported the best performance of our models in the paper. Adjusting the weight decay and learning rate can be effective for training models to higher accuracy.
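As an illustration of point (2), a small grid search over those two knobs might look like the sketch below; the candidate values and the placeholder model are hypothetical, not the settings behind the reported 90.4%:

```python
# Hypothetical sketch of a grid over learning rate and weight decay,
# the two knobs mentioned above; values are illustrative only.
import itertools
import torch.nn as nn
import torch.optim as optim

for lr, wd in itertools.product([0.1, 0.05, 0.01], [0.0, 1e-5, 1e-4]):
    model = nn.Linear(10, 10)  # placeholder standing in for IR-Net VGG-Small
    optimizer = optim.SGD(model.parameters(), lr=lr,
                          momentum=0.9, weight_decay=wd)
    # ... train with this (lr, wd) pair and keep the best validation accuracy ...
```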