meijieru / crnn.pytorch

Convolutional recurrent network in pytorch
MIT License

Problems when fine-tuning the model. RuntimeError: inconsistent tensor size #23

Open ahmedmazari-dhatim opened 7 years ago

ahmedmazari-dhatim commented 7 years ago

Hello,

I'm stuck with fine-tuning.

1) First of all, to fine-tune the model you have to set --nh=256, otherwise it will not work and you'll get this error:

loading pretrained model from /home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth
Traceback (most recent call last):
  File "crnn_main.py", line 98, in <module>
    crnn.load_state_dict(torch.load(opt.crnn))
  File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51

because the pretrained model was trained with --nh=256, not 100 as set by default. But when fine-tuning we should obviously be able to change this parameter, so I find it strange that it doesn't work (see the shape-comparison sketch at the end of this comment).

2) I tried different configurations of the alphabet length while fine-tuning; by default nb_classes = 37 with alphabet '0123456789abcdefghijklmnopqrstuvwxyz'.

I tried the following: A) add one letter, let's say Z or another char such as , . or /, giving '0123456789abcdefghijklmnopqrstuvwxyzZ'. I got the same error:

RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51

B) I removed one char and added another (remove z, add /), giving '0123456789abcdefghijklmnopqrstuvwxy/'.

I get the same error:

RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51

C) I set the alphabet to digits only, '0123456789'.

The same error:

RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51

3) Training a new model from scratch with a variable-length alphabet and any value of --nh works perfectly.

Do you have any idea how to solve the fine-tuning problem so that the alphabet length and the architecture can vary? Thanks a lot
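A quick way to see where the strict load fails (a sketch, not code from crnn_main.py; it assumes this repo's import path and a CRNN(imgH, nc, nclass, nh) constructor) is to compare tensor shapes between the checkpoint and a freshly built model:

    import torch
    import models.crnn as crnn_module  # assumed import path within this repo

    checkpoint = torch.load('data/crnn.pth')    # the pretrained state_dict (an OrderedDict)
    model = crnn_module.CRNN(32, 1, 37, 100)    # nh=100 to reproduce the mismatch
    for name, param in model.state_dict().items():
        ck = checkpoint.get(name)
        if ck is None or ck.size() != param.size():
            # print every parameter whose shape differs from the checkpoint
            print(name, tuple(param.size()), 'vs', None if ck is None else tuple(ck.size()))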

meijieru commented 7 years ago

Fine-tuning should keep the architecture or part of the architecture. Otherwise, you should train the network from scratch.

For a different alphabet, you should remove the last classification layer and add a new one.
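A minimal sketch of what this could look like for this repo's CRNN, assuming the last classification layer is the rnn.1.embedding Linear module (the key names rnn.1.embedding.weight / rnn.1.embedding.bias used later in this thread suggest this) and keeping nh=256 to match the released crnn.pth:

    import torch
    import torch.nn as nn
    import models.crnn as crnn_module  # assumed import path within this repo

    nh = 256                                    # must match the pretrained model
    new_alphabet = '0123456789abcdefghijklmnopqrstuvwxyz,;:!'  # example alphabet only
    nclass_new = len(new_alphabet) + 1          # +1 for the CTC blank class

    crnn = crnn_module.CRNN(32, 1, 37, nh)      # build with the old 37 classes
    crnn.load_state_dict(torch.load('data/crnn.pth'))
    # swap the final classifier for one sized to the new alphabet;
    # the bidirectional LSTM feeds 2*nh features into it
    crnn.rnn[1].embedding = nn.Linear(nh * 2, nclass_new)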

ahmedmazari-dhatim commented 7 years ago

@meijieru, thank you for your response. So for a different alphabet, I have to remove the last classification layer and add a new one according to the new alphabet.

Let's say that the new alphabet is --alphabet="01234567889abcdefghijklmnopqrstuvwxyz,;:!$@%'°"

To remove the last classification layer and replace it with a new one matching the new alphabet, I do the following:

crnn.pop() # to remove the last classify layer
crnn.append(torch.nn.Linear(len(alphabet), 1))
my_new_classifier = torch.nn.Sequential(*crnn)

Is this the way to do fine-tuning for this model?

Thank you

stephenrawls commented 7 years ago

Looks like your problem is you are not accounting for the extra output class required by the CTC blank. i.e. change this: crnn.append(torch.nn.Linear(len(alphabet), 1)) to crnn.append(torch.nn.Linear(len(alphabet)+1, 1))
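As a side note, torch.nn.Linear takes (in_features, out_features), so the replacement classifier would presumably map the bidirectional LSTM's 2*nh output features to len(alphabet)+1 class scores, roughly (a sketch; nh is the LSTM hidden size and alphabet the new character set):

    import torch.nn as nn

    # hypothetical replacement classifier: 2*nh recurrent features in,
    # len(alphabet)+1 scores out (the extra class is the CTC blank)
    new_classifier = nn.Linear(nh * 2, len(alphabet) + 1)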

YoungMiao commented 7 years ago

@ahmedmazari-dhatim Have you solved this problem? I tried it, but I did not solve the classification problem. Thanks a lot

ahmedmazari-dhatim commented 7 years ago

@wulivicte, unfortunately not. I'm stuck on it. I'm still trying to find the trick to make this transfer learning work. I'll let you know once I solve it. Please do the same.

YoungMiao commented 7 years ago

@ahmedmazari-dhatim Thanks

ahmedmazari-dhatim commented 7 years ago

@wulivicte, anything new?

YoungMiao commented 7 years ago

@ahmedmazari-dhatim Yeah, but I need to do some testing.

ahmedmazari-dhatim commented 7 years ago

@wulivicte, please share it.

Thanks a lot

YoungMiao commented 7 years ago

@ahmedmazari-dhatim You can see it here: https://github.com/tongpi/basicOCR/tree/master/contrib/crnn

ahmedmazari-dhatim commented 7 years ago

@wulivicte, thank you a lot for your contribution. I just looked through your code and noticed that you adapted the CRNN for Chinese (keys.py and utils.py). But I don't see what you modified to make transfer learning work; which part of the code did you modify or add? (Question 1)

In transfer learning we have the following scenario:

A) Load the pre-trained model.
B) Freeze the weights, remove the last classification layer, and add a new one according to the new alphabet. From my knowledge we have to do something like this:

crnn.pop() # to remove the last classify layer
crnn.append(torch.nn.Linear(len(alphabet)+1, 1))
my_new_classifier = torch.nn.Sequential(*crnn)

Is this what you've done? (Question 2)

Or did you just add support for the Chinese alphabet to the CRNN and then simply run the following:

python crnn_main.py --trainroot="train_data/" --valroot="valid_data" --cuda --adadelta --experiment="save_model/" --crnn="load the pretrained model to make the transfer learning" --alphabet="万下依口哺摄次状璐癌草血运重"

Is that what you have done? (Question 3)

If yes, then we have to keep exactly the same architecture as the pre-trained CRNN, otherwise we get the following error:

Traceback (most recent call last):
  File "crnn_disk_dur.py", line 104, in <module>
    crnn.load_state_dict(torch.load(opt.crnn))
  File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51

For instance, when we change the size of the LSTM hidden state, parser.add_argument('--nh', type=int, default=256, help='size of the lstm hidden state'), it works only for --nh=256; we can't change this parameter because the pre-trained model was trained with 256 LSTM hidden units. Furthermore, we also can't touch the architecture of the CNN (number of hidden neurons, etc.).

Thus I'm wondering how transfer learning works in CRNN: which part of the CRNN stays fixed, and which part (layer) is removed and changed? (Question 4) @meijieru @wulivicte, when we do the following (transfer learning):

python crnn_main.py --trainroot="train_data/" --valroot="valid_data" --cuda --adadelta --experiment="save_model/" --crnn="load the pretrained model to make the transfer learning" --alphabet="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,;:!ù^"

@meijieru, when we add the parameter --crnn="load the pretrained model to make the transfer learning", which part of the code is called to handle fine-tuning, i.e. to freeze the weights of the lower layers and change the last layer?
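Judging from the traceback above, the --crnn flag appears to trigger nothing more than a plain strict load in crnn_main.py, roughly:

    if opt.crnn != '':
        print('loading pretrained model from %s' % opt.crnn)
        crnn.load_state_dict(torch.load(opt.crnn))

so freezing and classifier replacement do not seem to be built in; freezing would have to be added by hand, e.g. (a hypothetical sketch, assuming the classifier parameters are the rnn.1.embedding.* keys):

    for name, param in crnn.named_parameters():
        if not name.startswith('rnn.1.embedding'):
            param.requires_grad = False   # freeze everything except the new classifier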

Thank you a lot for your comments and responses.

Sincerely

YoungMiao commented 7 years ago

@ahmedmazari-dhatim Sorry, the *.pth file is an OrderedDict; if you do something like this:

crnn.pop() # to remove the last classify layer
crnn.append(torch.nn.Linear(len(alphabet)+1, 1))
my_new_classifier = torch.nn.Sequential(*crnn)

the OrderedDict's keys look like this:

[image: list of the state_dict keys]
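A small sketch of how to inspect those keys (the *.pth here is just a saved state_dict, i.e. an OrderedDict mapping parameter names to tensors):

    import torch

    state = torch.load('data/crnn.pth')   # OrderedDict of parameter tensors
    for k, v in state.items():
        print(k, tuple(v.size()))
    # judging from this thread, the last two entries should be
    # rnn.1.embedding.weight and rnn.1.embedding.bias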

I have done this:

    for k, v in pre_trainmodel.items():
        if k in model_dict:
            model_dict[k] = v

but for now this does not solve the problem.

YoungMiao commented 7 years ago

@ahmedmazari-dhatim You can replace the last two parameters; I have done it and it solves the problem.

ahmedmazari-dhatim commented 7 years ago

@wulivicte, thank you for your answer.

1) What do you mean by the *.pth OrderedDict file? Is it what print(model_dict) shows? 2) If I understand what you're saying, with this code:

    pre_trainmodel = torch.load(opt.crnn)
    model_dict = crnn.state_dict()
    # replace the classify layer parameters: keep them at their new initialisation
    # and load everything else from the pre-trained model
    for k, v in model_dict.items():
        if not (k == 'rnn.1.embedding.weight' or k == 'rnn.1.embedding.bias'):
            model_dict[k] = pre_trainmodel[k]

    crnn.load_state_dict(model_dict)
    print(crnn)

we are able to apply transfer learning by retraining the two LSTM layers, let's say with --nh=170:

python crnn_main.py --trainroot="train_data/" --valroot="valid_data" --cuda --adadelta --experiment="save_model/" --crnn="load the pretrained model to make the transfer learning" --alphabet="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,;:!ù^" --nh="170"

Isn't it?

Thank you a lot @wulivicte

YoungMiao commented 7 years ago

@ahmedmazari-dhatim The training run and the pre-trained model must use the same --nh; I use --nh=256 and use the author's model as the pre-trained model.
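One defensive variant (a sketch, not code from the repo) is to copy only the parameters whose names and shapes both match, so the new classifier, and in principle any reshaped layer, simply keeps its fresh initialisation; it assumes crnn has already been built with the new alphabet size:

    pretrained = torch.load(opt.crnn)        # pretrained state_dict
    model_dict = crnn.state_dict()           # state_dict of the newly built model
    for k, v in pretrained.items():
        if k in model_dict and model_dict[k].size() == v.size():
            model_dict[k] = v                # copy only compatible tensors
    crnn.load_state_dict(model_dict)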

ahmedmazari-dhatim commented 7 years ago

Hi, @wulivicte @meijieru

What is the difference between transfer learning with the code of @meijieru as follows:

python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" --cuda --adadelta --experiment="sotr_model/" --crnn="data/crnn.pth"

and the code of @wulivicte when you add:

    pre_trainmodel = torch.load(opt.crnn)
    model_dict = crnn.state_dict()
    # replace the classify layer parameters: keep them at their new initialisation
    # and load everything else from the pre-trained model
    for k, v in model_dict.items():
        if not (k == 'rnn.1.embedding.weight' or k == 'rnn.1.embedding.bias'):
            model_dict[k] = pre_trainmodel[k]

    crnn.load_state_dict(model_dict)
    print(crnn)

Then:

python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" --cuda --adadelta --experiment="sotr_model/" --crnn="data/crnn.pth"

?

Thank you