facebookarchive / fb.resnet.torch

Torch implementation of ResNet from http://arxiv.org/abs/1512.03385 and training scripts

Accuracy is not reached #164

Open ilichev-andrey opened 7 years ago

ilichev-andrey commented 7 years ago

Hello.

Reported single-crop (224x224) validation error rate:

Network     Top-1 error     Top-5 error
ResNet-18   30.43           10.76

My validation error:

epoch   Top1    Top5
60  54.493  29.503
61  48.537  24.111
62  47.801  23.614
63  47.553  23.461
64  47.553  23.461
65  49.006  25.029
66  48.260  24.312
67  49.350  25.392
68  46.969  23.069
69  46.759  22.772
70  48.088  24.120
71  46.721  22.878
72  49.771  25.554
73  49.598  25.373
74  49.895  25.889
75  46.491  22.935
76  46.491  22.935
77  48.040  24.264
78  46.902  23.059
79  46.902  23.059
80  46.520  23.155
81  46.520  23.155
82  46.520  23.155
83  46.520  23.155
84  46.520  23.155
85  46.520  23.155
86  46.520  23.155
87  46.520  23.155
88  46.520  23.155
89  46.520  23.155
90  48.929  24.321

I used ImageNet with 1046 classes: the 1000 default ImageNet 2012 classes plus 46 classes of my own.

dataset:

train:
    class1 - 150 images
    class2 - 150 images
    class3 - 150 images
    ...
    class1046 - 150 images

val:
    class1 - 10 images
    class2 - 10 images
    class3 - 10 images
    ...
    class1046 - 10 images

Could you tell me why I was not able to reach the reported accuracy?

aabobakr commented 7 years ago

From the size of your dataset, I assume that you are fine-tuning and changing only the last layer. If so, try a lower learning rate, because the default of 0.1 would be too high in this case. Also try setting a learning rate decay.
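To make "learning rate decay" concrete, here is a minimal sketch of a step schedule in Python. The 30-epoch interval, the 10x factor, and the name `step_decay_lr` are assumptions chosen for illustration. They are not taken from this thread, so check the training script for the schedule it actually applies.

```python
def step_decay_lr(base_lr, epoch, step=30, factor=0.1):
    """Return the learning rate for a 1-indexed epoch under step decay.

    Hypothetical helper: `step` and `factor` are illustrative defaults.
    """
    n_drops = (epoch - 1) // step  # number of decay steps applied so far
    return base_lr * factor ** n_drops


# Compare the default starting rate (0.1) with a lower one (0.001).
for epoch in (1, 30, 31, 61, 90):
    print(epoch, step_decay_lr(0.1, epoch), step_decay_lr(0.001, epoch))
```

With these assumed values the rate stays constant for epochs 1 to 30, then drops by 10x at epochs 31 and 61.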

ilichev-andrey commented 7 years ago

I ran:

    th main.lua -depth 18 -batchSize 24 -data [path to dataset]

and changed model:add(nn.Linear(nFeatures, 1000)) to model:add(nn.Linear(nFeatures, 1046)) in https://github.com/facebook/fb.resnet.torch/blob/master/models/resnet.lua#L128

I did not run the fine-tuning command:

    th main.lua -retrain resnet-18.t7 -data [path-to-directory-with-train-and-val] -resetClassifier true -nClasses 1046

Should I start with a learning rate of 0.01? And should I increase or decrease the weight decay (the default is '-weightDecay', 1e-4, 'weight decay')? How much weight decay should I choose?

Thanks.

aabobakr commented 7 years ago

Running that command trains the model from scratch on a dataset much smaller than ImageNet, so you will not reach the same performance. At this size your model will most likely overfit the training set.
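As a rough size comparison, the arithmetic can be sketched in Python. The per-class counts come from the dataset description earlier in the thread. The ImageNet 2012 training-set size (1,281,167 images) is a figure added here for comparison and does not appear in the thread.

```python
n_classes = 1046
train_per_class = 150
val_per_class = 10
imagenet_train_images = 1_281_167  # ILSVRC 2012 training set (added for comparison)

train_total = n_classes * train_per_class  # images available for training
val_total = n_classes * val_per_class      # images available for validation
fraction = train_total / imagenet_train_images

print("train images:", train_total)
print("val images:", val_total)
print("fraction of ImageNet train set: {:.1%}".format(fraction))
```

The training set here has 156,900 images, roughly 12% of the ImageNet 2012 training set, which is consistent with the overfitting concern above.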

I recommend that you fine-tune instead:

1- Change model:add(nn.Linear(nFeatures, 1000)) to model:add(nn.Linear(nFeatures, 1046))
2- Run th main.lua -retrain resnet-18.t7 -data [path-to-directory-with-train-and-val] -resetClassifier true -nClasses 1046 -LR 0.001

ilichev-andrey commented 7 years ago

Thanks, I will try.

ajeetksingh commented 7 years ago

How did the training go? @ilichev-andrey Were you able to solve your problem?

ilichev-andrey commented 7 years ago

Yes, it really works. Within 3 epochs I reached a lower error than I had after 90 epochs of training from scratch. At epoch 10: top-1 37.635, top-5 15.321.
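For anyone comparing the numbers in this thread, a small Python sketch of the differences, using only figures quoted above. Note that the reference error is for the standard 1000-class task, so the remaining gap is only a loose comparison with this 1046-class dataset.

```python
reference_top1 = 30.43      # ResNet-18 top-1 error quoted at the top of the thread
scratch_best_top1 = 46.491  # lowest top-1 in the from-scratch table (epochs 75 and 76)
finetuned_top1 = 37.635     # top-1 at epoch 10 of fine-tuning

# Improvement from fine-tuning, and distance to the reference, in percentage points.
gain = round(scratch_best_top1 - finetuned_top1, 3)
remaining_gap = round(finetuned_top1 - reference_top1, 3)

print("gain from fine-tuning:", gain)
print("gap to reference:", remaining_gap)
```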