Having trouble reproducing the reported accuracy

allenai / XNOR-Net

ImageNet classification using binary Convolutional Neural Networks

https://xnor.ai/

Having trouble reproducing the reported accuracy #2

Closed. wishforgood closed this issue 7 years ago

wishforgood commented 7 years ago

I ran the code with the command th main.lua -data ./images -nGPU 2 -batchSize 512 -netType alexnet -binaryWeight -dropout 0.1 after changing the learning rate policy to:

1, 4, 1e-1, 5e-4,
5, 8, 1e-3, 5e-4,
9, 12, 1e-5, 0,
13, 16, 1e-7, 0,

I used this to try to reproduce the BWN (AlexNet) result reported in the paper. However, the top-1 training accuracy after the first epoch is 7.82%, far from the reported number, and the top-5 training accuracy is 19.72%. Is there anything I missed?
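To make the schedule above easier to reason about, here is a minimal Python sketch of how such a table could be looked up per epoch. It assumes each row reads as (first epoch, last epoch, learning rate, weight decay); the post does not label the columns, so that reading is an assumption. The names REGIME and params_for_epoch are hypothetical, and this is not the repository's Lua code.

```python
# Assumed row layout: (first_epoch, last_epoch, learning_rate, weight_decay).
# The values are copied from the post; the column meaning is an assumption.
REGIME = [
    (1, 4, 1e-1, 5e-4),
    (5, 8, 1e-3, 5e-4),
    (9, 12, 1e-5, 0.0),
    (13, 16, 1e-7, 0.0),
]


def params_for_epoch(epoch, regime=REGIME):
    """Return (learning_rate, weight_decay) for a 1-based epoch.

    Returns None when no row covers the epoch.
    """
    for first, last, lr, wd in regime:
        if first <= epoch <= last:
            return lr, wd
    return None


if __name__ == "__main__":
    # Print the hyperparameters at the start of each stage and one epoch past the end.
    for epoch in (1, 5, 9, 13, 17):
        print(epoch, params_for_epoch(epoch))
```

Under this reading, the learning rate drops by a factor of 100 every four epochs and the table stops covering training after epoch 16.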

mrastegari commented 7 years ago

Have you tried loading the pretrained model to check whether you get the same accuracy? Try 1 GPU with batchSize 128 and the default LR regime. epochNumber 10000. I have noticed that multi-GPU training sometimes has problems aggregating the gradients in the backward pass on different machines.
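As a rough illustration of what correct gradient aggregation should produce, the following Python sketch compares a full-batch gradient with the average of per-shard gradients on a toy one-parameter model. It is only a sanity-check idea under stated assumptions (equal shard sizes, a mean-reduced loss); the functions mse_grad and aggregated_grad and the data are hypothetical and unrelated to the Torch multi-GPU code.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to the scalar weight w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def aggregated_grad(w, xs, ys, n_shards):
    """Split the batch into equal shards and average the per-shard gradients.

    Each shard stands in for one hypothetical GPU. The batch size must be
    divisible by n_shards for the average to match the full-batch gradient.
    """
    size = len(xs) // n_shards
    grads = [
        mse_grad(w, xs[i * size:(i + 1) * size], ys[i * size:(i + 1) * size])
        for i in range(n_shards)
    ]
    return sum(grads) / n_shards


# Toy data, made up for illustration only.
xs = [0.5, -1.0, 2.0, 1.5, -0.5, 3.0, 0.25, -2.0]
ys = [1.0, -2.0, 4.5, 2.5, -1.0, 6.5, 0.0, -4.0]
w = 0.3

if __name__ == "__main__":
    full = mse_grad(w, xs, ys)
    split = aggregated_grad(w, xs, ys, n_shards=2)
    print("full batch:", full, "two shards averaged:", split)
```

If an aggregation step summed the shard gradients instead of averaging them, or dropped one shard, the two numbers would disagree, which is the kind of mismatch a single-GPU run rules out.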

wishforgood commented 7 years ago

Thanks for your answer! Yes, I've checked the pretrained model and it works; I got the same accuracy. I've also tried 1 GPU with batchSize 128 and the default LR regime, and the accuracy is increasing. I will train it further to get the accuracy curve, but is there a reference curve I can compare against? It would help to be able to check whether training is working during the early epochs.
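For readers who want to track the same top-1 and top-5 numbers in the early epochs, here is a minimal Python sketch of how top-k accuracy is commonly computed from per-class scores. The function name topk_accuracy and the example scores are hypothetical; the repository's own logging may compute these values differently.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest-scoring classes.

    scores: list of per-sample lists of class scores.
    labels: list of true class indices, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    # Three samples, three classes; values made up for illustration.
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    for k in (1, 2, 3):
        print("top-%d accuracy: %.2f%%" % (k, topk_accuracy(scores, labels, k)))
```

Assuming the standard 1000-class ImageNet setup, random guessing would give about 0.1% top-1 and 0.5% top-5, so comparing early-epoch numbers against chance is one quick way to see whether the network is learning at all.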

mrastegari commented 7 years ago

I will try to create the learning curve and share it.

wishforgood commented 7 years ago

OK, please let me know when it is available.

mrastegari commented 7 years ago

Please see the discussion here: https://github.com/mrastegari/XNOR-Net/issues/3

wishforgood commented 7 years ago

OK, I will take a look. Thanks very much!