mrastegari opened this issue 8 years ago
I trained one some time ago, the model is here https://gist.github.com/szagoruyko/dd032c529048492630fc, achieves 56.7% top1.
This model is different from the one in this repository (alexnetowtbn does not have nn.Concat, and the number of filters in the convolutional layers differs from your model). Do you think we should expect this gap?
@mrastegari no, that shouldn't be the issue. My bet would be the recent bugs in DPT; you might want to update everything and try again. Btw, you can increase the learning rate and halve the number of epochs.
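The "increase the learning rate and halve the epochs" suggestion can be sketched as a transform over an epoch-indexed learning-rate regime. This is a hypothetical illustration, not code from the repo; the `scale_regime` helper and the regime values are assumptions for the example:

```python
def scale_regime(regime, lr_mult=2.0, epoch_div=2):
    """Double the learning rate and compress the epoch windows of a
    regime given as (start_epoch, end_epoch, lr) tuples."""
    scaled = []
    for start, end, lr in regime:
        scaled.append((max(1, start // epoch_div),
                       max(1, end // epoch_div),
                       lr * lr_mult))
    return scaled

# Hypothetical base regime in the style of the Torch imagenet training scripts.
base = [(1, 18, 1e-2), (19, 29, 5e-3), (30, 43, 1e-3)]
print(scale_regime(base))
```

The idea is the usual trade-off: a larger step size can reach comparable accuracy in fewer epochs, provided training stays stable.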
I updated all the libraries (cunn, nn, cudnn, cutorch), but I still cannot get top-1 accuracy above 45%.
Yes, I'm also getting a similar issue: alexnetowtbn is giving me low accuracy. Trying to train with -netType alexnet to see if at least plain alexnet gives good performance...
I remember that around two months ago I could get top-1 (val) accuracy of around 52%, so maybe something changed in one of the library updates.
Hmm, tried it again and now alexnetowtbn converges fine; at the 38th epoch the top-1 validation accuracy is 53.93%.
Have you followed the learningRate regime exactly as in the code? I noticed some instability in training: for example, if I stop after one epoch and then call the retrain option, it gives better accuracy than just letting the code go on to the next epoch. Have you reinstalled any of the libraries?
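For anyone comparing runs, the regime in question is a piecewise-constant schedule: each epoch range maps to a fixed learning rate. A minimal sketch of that lookup, assuming (start, end, lr) tuples; the specific epoch boundaries and rates below are illustrative, not the repo's exact values:

```python
def lr_for_epoch(epoch, regime):
    """Return the learning rate whose (start, end) epoch window
    contains `epoch`; fall back to the final rate past the last window."""
    for start, end, lr in regime:
        if start <= epoch <= end:
            return lr
    return regime[-1][2]

# Illustrative regime; check Models/alexnetowtbn in the repo for the real one.
regime = [(1, 18, 1e-2), (19, 29, 5e-3), (30, 43, 1e-3), (44, 52, 5e-4)]
```

If a retrain after one epoch behaves differently from a continuous run, one thing to check is that the retrain picks up the rate for the *current* epoch from this table rather than restarting at the initial rate.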
Hmm, I've updated torch, nn, cutorch, and cudnn, but my version of imagenet-multiGPU was from a few months ago. I've just cloned the new version and started training alexnetowtbn to see if I can duplicate the results.
alexnetowtbn trained and converged fine, btw.
Thanks for the effort!!!
Hey guys, do you have any updated results? I trained AlexNet (without batch normalization) and get top-1 accuracy of 54.93% on the val set.
I trained the AlexNet model with batch normalization (alexnetowtbn) on 4 GPUs with batchSize 256. After 50 epochs my top-1 accuracy is 45%. I couldn't find any published result for AlexNet trained with batch normalization. Is this number OK? It seems much lower than the 57% reported in Caffe.