soumith / imagenet-multiGPU.torch

an imagenet example in torch.
BSD 2-Clause "Simplified" License

Newbie question: Is my model making any progress #70

Closed: livenletdie closed this issue 8 years ago

livenletdie commented 8 years ago

Sorry if this is a stupid question; this is my first time training on ImageNet. In the past, I only worked on smaller datasets such as CIFAR-10, where a few passes through the dataset got to 60% accuracy.

After 10 epochs, the Top-1 accuracy (the Top1-% column) is still close to 0% for most batches (see log below). The error (the Err column) is still at about 6.9 and not really going down. Is my model actually learning, just slowly because ImageNet is so large, or is something wrong with my setup? Any help is appreciated.

Epoch: [11][1383/10000] Time 0.675 Err 6.9076 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.015
Epoch: [11][1384/10000] Time 0.675 Err 6.9091 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.014
Epoch: [11][1385/10000] Time 0.672 Err 6.9048 Top1-%: 1.56 LR 1e-02 DataLoadingTime 0.016
Epoch: [11][1386/10000] Time 0.678 Err 6.9069 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.014
Epoch: [11][1387/10000] Time 0.674 Err 6.9074 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.016
Epoch: [11][1388/10000] Time 0.672 Err 6.9080 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.014
Epoch: [11][1389/10000] Time 0.674 Err 6.9067 Top1-%: 1.56 LR 1e-02 DataLoadingTime 0.015
Epoch: [11][1390/10000] Time 0.675 Err 6.9071 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.015
Epoch: [11][1391/10000] Time 0.674 Err 6.9045 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.015
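A quick sanity check on the log above, as a minimal sketch rather than anything from the repository: if Err is a cross-entropy loss measured in nats and the dataset has 1000 classes (both are assumptions here), a model that guesses uniformly at random would score ln(1000) ≈ 6.9078. The Python below parses the Err and Top1-% columns and compares the mean loss against that chance level. The names `parse_log`, `chance_loss`, and `NUM_CLASSES` are made up for illustration.

```python
import math
import re

# Log lines copied verbatim from the post above.
LOG = """\
Epoch: [11][1383/10000] Time 0.675 Err 6.9076 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.015
Epoch: [11][1384/10000] Time 0.675 Err 6.9091 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.014
Epoch: [11][1385/10000] Time 0.672 Err 6.9048 Top1-%: 1.56 LR 1e-02 DataLoadingTime 0.016
Epoch: [11][1386/10000] Time 0.678 Err 6.9069 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.014
Epoch: [11][1387/10000] Time 0.674 Err 6.9074 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.016
Epoch: [11][1388/10000] Time 0.672 Err 6.9080 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.014
Epoch: [11][1389/10000] Time 0.674 Err 6.9067 Top1-%: 1.56 LR 1e-02 DataLoadingTime 0.015
Epoch: [11][1390/10000] Time 0.675 Err 6.9071 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.015
Epoch: [11][1391/10000] Time 0.674 Err 6.9045 Top1-%: 0.00 LR 1e-02 DataLoadingTime 0.015
"""

# Assumption: the standard 1000-class ImageNet (ILSVRC) label set.
NUM_CLASSES = 1000


def parse_log(text):
    """Extract the Err and Top1-% columns from the training log text."""
    pattern = re.compile(r"Err\s+([0-9.]+)\s+Top1-%:\s+([0-9.]+)")
    pairs = pattern.findall(text)
    losses = [float(err) for err, _ in pairs]
    top1 = [float(acc) for _, acc in pairs]
    return losses, top1


def chance_loss(num_classes):
    """Cross-entropy (in nats) of a uniform guess over num_classes classes."""
    return math.log(num_classes)


losses, top1 = parse_log(LOG)
mean_loss = sum(losses) / len(losses)
mean_top1 = sum(top1) / len(top1)

print(f"mean Err over {len(losses)} batches: {mean_loss:.4f}")
print(f"chance-level loss ln({NUM_CLASSES}): {chance_loss(NUM_CLASSES):.4f}")
print(f"mean Top1-%: {mean_top1:.2f} (uniform guessing: {100.0 / NUM_CLASSES:.2f})")

# A loss within a small margin of ln(num_classes) suggests the network is
# still predicting a near-uniform distribution, i.e. it has not started learning.
if abs(mean_loss - chance_loss(NUM_CLASSES)) < 0.01:
    print("Loss is at chance level.")
```

Under those assumptions, a mean Err that sits this close to ln(1000) after 10 epochs points to a model that is not learning, rather than one that is merely slow.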

foelin commented 8 years ago

No, something is wrong. In my training, after 5 or 6 epochs, the top-1 error on the testing set goes to 3.1%, and it goes to about 30% on the training set.

livenletdie commented 8 years ago

Thanks for the reply. It is so weird: AlexNet and its variants converge, but GoogLeNet does not. I am not sure what is going wrong. Anyway, thanks for the response.