Open ahmedmazari-dhatim opened 7 years ago
Fine-tuning should keep the architecture or part of the architecture. Otherwise, you should train the network from scratch.
For a different alphabet, you should remove the last classification layer and add a new one.
@meijieru, thank you for your response. So for a different alphabet, I have to remove the last classification layer and add a new one that matches the new alphabet.
Let's say that the new alphabet is --alphabet="01234567889abcdefghijklmnopqrstuvwxyz,;:!$@%'°"
To remove the last classification layer and replace it with a new one for the new alphabet, I do the following:
crnn.pop() # to remove the last classify layer
crnn.append(torch.nn.Linear(len(alphabet), 1))
my_new_classifier = torch.nn.Sequential(*crnn)
Is this the right way to fine-tune this model?
Thank you
Looks like your problem is you are not accounting for the extra output class required by the CTC blank. i.e. change this:
crnn.append(torch.nn.Linear(len(alphabet), 1))
to
crnn.append(torch.nn.Linear(len(alphabet)+1, 1))
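A minimal sketch of the point about the CTC blank, under stated assumptions: `TinyHead` is a hypothetical stand-in for the last recurrent block of a CRNN (it is not the repository's class), and the example alphabet is illustrative. Note that `torch.nn.Linear` takes `(in_features, out_features)`, so in this sketch the class count `len(alphabet) + 1` is the second argument.

```python
import torch
import torch.nn as nn

alphabet = "0123456789abcdefghijklmnopqrstuvwxyz,;:!$@%'°"  # example alphabet
nclass = len(alphabet) + 1  # +1 output class for the CTC blank


class TinyHead(nn.Module):
    """Hypothetical stand-in: a bidirectional LSTM followed by a linear classifier."""

    def __init__(self, n_in, n_hidden, n_out):
        super().__init__()
        self.rnn = nn.LSTM(n_in, n_hidden, bidirectional=True)
        self.embedding = nn.Linear(n_hidden * 2, n_out)

    def forward(self, x):
        out, _ = self.rnn(x)                      # (T, B, 2 * n_hidden)
        t, b, h = out.size()
        out = self.embedding(out.reshape(t * b, h))
        return out.reshape(t, b, -1)              # (T, B, n_out)


head = TinyHead(n_in=32, n_hidden=16, n_out=37)   # sized for 36 characters + blank

# Swap only the classifier: in_features is unchanged, out_features follows the new alphabet.
head.embedding = nn.Linear(head.embedding.in_features, nclass)

scores = head(torch.randn(5, 2, 32))              # 5 time steps, batch of 2
print(scores.shape)
```

The new layer starts from random weights, so it still has to be trained on data for the new alphabet.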
@ahmedmazari-dhatim Have you solved this problem? I tried it, but it did not solve the classification problem. Thanks a lot.
@wulivicte, unfortunately not. I am stuck on it and still trying to find the trick to make this transfer learning work. I'll let you know once I solve it. Please do the same.
@ahmedmazari-dhatim Thanks
@wulivicte, anything new?
@ahmedmazari-dhatim Yes, but I need to do some testing.
@wulivicte, please share it.
thanks a lot
@ahmedmazari-dhatim you can see it https://github.com/tongpi/basicOCR/tree/master/contrib/crnn
@wulivicte, thank you very much for your contribution. I just looked at your code and noticed that you adapted the CRNN for Chinese in keys.py and utils.py. But I don't see what you modified to make transfer learning work: which part of the code did you modify or add? (Question 1)
In transfer learning we have the following scenario:
A- Load the pre-trained model.
B- Freeze the weights. Remove the last classification layer and add a new one that matches the new alphabet.
As far as I know, we have to do something like this:
crnn.pop() # to remove the last classify layer
crnn.append(torch.nn.Linear(len(alphabet)+1, 1))
my_new_classifier = torch.nn.Sequential(*crnn)
Is this what you've done? (Question 2)
Or did you just add Chinese support to crnn and then run the following:
python crnn_main.py --trainroot="train_data/" --valroot="valid_data" --cuda --adadelta --experiment="save_model/" --crnn="load the pretrained model to make the transfer learning" --alphabet="万下依口哺摄次状璐癌草血运重"
Is that what you have done? (Question 3)
If so, we must keep exactly the same architecture as the pre-trained CRNN; otherwise we get the following error:
Traceback (most recent call last):
File "crnn_disk_dur.py", line 104, in <module>
crnn.load_state_dict(torch.load(opt.crnn))
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
This happens, for instance, when we change the size of the LSTM:
parser.add_argument('--nh', type=int, default=256, help='size of the lstm hidden state')
It works only for --nh="256".
We can't change this parameter because the pre-trained model was trained with 256 LSTM hidden units. Likewise, we can't touch the architecture of the CNN (number of hidden neurons, etc.).
So I'm wondering how transfer learning works in CRNN: which part of the CRNN stays fixed, and which part (layer) is removed and replaced? (Question 4)
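One way to see where a size error like the one above comes from is to compare tensor shapes in the checkpoint against the freshly built model before calling `load_state_dict`. The sketch below is only an illustration: `make_model` is a hypothetical two-layer stand-in, not the repository's CRNN, and `shape_mismatches` is a helper name invented here.

```python
import torch.nn as nn


def make_model(nh, nclass):
    # Hypothetical stand-in: a hidden layer of size nh and a classifier of size nclass.
    return nn.Sequential(nn.Linear(512, nh), nn.Linear(nh, nclass))


def shape_mismatches(checkpoint, model):
    """Return {key: (checkpoint_shape, model_shape)} for tensors whose shapes differ."""
    own = model.state_dict()
    return {
        k: (tuple(v.shape), tuple(own[k].shape))
        for k, v in checkpoint.items()
        if k in own and v.shape != own[k].shape
    }


# Plays the role of the pre-trained weights (nh=256, 37 output classes).
checkpoint = make_model(nh=256, nclass=37).state_dict()

same = shape_mismatches(checkpoint, make_model(nh=256, nclass=37))
new_alphabet = shape_mismatches(checkpoint, make_model(nh=256, nclass=11))
new_hidden = shape_mismatches(checkpoint, make_model(nh=100, nclass=37))

print(same)          # identical architecture: nothing differs
print(new_alphabet)  # only the classifier tensors differ
print(new_hidden)    # tensors in both layers differ
```

In this toy setting, changing only the number of classes affects just the last layer, while changing the hidden size affects every layer that touches it, which is consistent with the observation that `--nh` has to match the pre-trained model.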
@meijieru @wulivicte
When we run the following (transfer learning):
python crnn_main.py --trainroot="train_data/" --valroot="valid_data" --cuda --adadelta --experiment="save_model/" --crnn="load the pretrained model to make the transfer learning" --alphabet="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,;:!ù^"
@meijieru, when we add the parameter --crnn="load the pretrained model to make the transfer learning",
which part of the code handles fine-tuning, that is, freezing the weights of the lower layers and replacing the last layer?
Thank you a lot for your comments and responses.
Sincerely
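Judging from the tracebacks in this thread, `--crnn` leads to `crnn.load_state_dict(torch.load(opt.crnn))`; whether the script also freezes any layers is not shown here. If freezing is wanted, one common PyTorch pattern is to turn off `requires_grad` for the pre-trained parameters and give the optimizer only the remaining ones. The sketch below uses a hypothetical stand-in model whose submodule names `features` and `classifier` are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: "features" plays the pre-trained part, "classifier" the new layer.
model = nn.Sequential()
model.add_module("features", nn.Linear(64, 32))
model.add_module("classifier", nn.Linear(32, 46))

# Freeze everything except the new classifier.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("classifier")

# Only trainable parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adadelta(trainable)

# One dummy training step: gradients reach the classifier only.
loss = model(torch.randn(4, 64)).sum()
loss.backward()
optimizer.step()

print(model.features.weight.grad is None)        # frozen layer received no gradient
print(model.classifier.weight.grad is not None)  # new layer did
```

Freezing is optional; fine-tuning all layers with the loaded weights as initialization is also a valid choice.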
@ahmedmazari-dhatim Sorry, the *.pth file is an OrderedDict. If you do something like this:
crnn.pop() # to remove the last classify layer
crnn.append(torch.nn.Linear(len(alphabet)+1, 1))
my_new_classifier = torch.nn.Sequential(*crnn)
the OrderedDict's keys look like this:
I have done this:
for k, v in pre_trainmodel.items():
    if k in model_dict:
        model_dict[k] = v
but that does not solve the problem yet.
@ahmedmazari-dhatim You can replace the last two parameters. I have done it and it solved the problem.
@wulivicte, thank you for your answer.
1) What do you mean by the *.pth OrderedDict file? Is it what print(model_dict) shows?
2) If I understand you correctly, with this code:
pre_trainmodel = torch.load(opt.crnn)
model_dict = crnn.state_dict()
# copy every pre-trained parameter except those of the classification layer
for k, v in model_dict.items():
    if not (k == 'rnn.1.embedding.weight' or k == 'rnn.1.embedding.bias'):
        model_dict[k] = pre_trainmodel[k]
crnn.load_state_dict(model_dict)
print(crnn)
we are able to apply transfer learning by retraining the two LSTM layers with, let's say, --nh=170:
python crnn_main.py --trainroot="train_data/" --valroot="valid_data" --cuda --adadelta --experiment="save_model/" --crnn="load the pretrained model to make the transfer learning" --alphabet="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,;:!ù^" --nh="170"
Is that right?
Thank you a lot @wulivicte
@ahmedmazari-dhatim The model being trained and the pre-trained model must use the same --nh. I use --nh=256
and the author's model as the pre-trained model.
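The key-based copy shown earlier in the thread can also be written as a shape check, so that any tensor whose size changed (for example the classifier after an alphabet change) is skipped automatically. This is a sketch under assumptions: `make_model` is a hypothetical stand-in rather than the repository's CRNN, and its layer names are chosen only for illustration.

```python
import torch
import torch.nn as nn


def make_model(nh, nclass):
    # Hypothetical stand-in with a final layer named "embedding".
    m = nn.Sequential()
    m.add_module("hidden", nn.Linear(512, nh))
    m.add_module("embedding", nn.Linear(nh, nclass))
    return m


pretrained = make_model(nh=256, nclass=37).state_dict()  # plays the role of the .pth file
model = make_model(nh=256, nclass=46)                     # new alphabet, same nh

model_dict = model.state_dict()
# Keep only pre-trained tensors that exist in the new model with the same shape.
compatible = {
    k: v for k, v in pretrained.items()
    if k in model_dict and v.shape == model_dict[k].shape
}
model_dict.update(compatible)
model.load_state_dict(model_dict)

# Parameters that kept their fresh initialization and still need training.
skipped = sorted(set(model_dict) - set(compatible))
print(skipped)
```

With the same `nh`, only the two classifier tensors are skipped here. With a different `nh`, far more tensors would be skipped, so little of the pre-trained model would actually be reused.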
Hi, @wulivicte @meijieru
What is the difference between transfer learning with @meijieru's code, as follows:
python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" --cuda --adadelta --experiment="sotr_model/" --crnn="data/crnn.pth"
and @wulivicte's code, where you add
pre_trainmodel = torch.load(opt.crnn)
model_dict = crnn.state_dict()
# copy every pre-trained parameter except those of the classification layer
for k, v in model_dict.items():
    if not (k == 'rnn.1.embedding.weight' or k == 'rnn.1.embedding.bias'):
        model_dict[k] = pre_trainmodel[k]
crnn.load_state_dict(model_dict)
print(crnn)
and then run:
python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" --cuda --adadelta --experiment="sotr_model/" --crnn="data/crnn.pth"
?
Thank you
Hello,
I am stuck with fine-tuning.
1) First of all, to fine-tune the model you have to set --nh="256"; otherwise it will not work and you get this error:
loading pretrained model from /home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth
Traceback (most recent call last):
File "crnn_main.py", line 98, in <module>
crnn.load_state_dict(torch.load(opt.crnn))
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
This is because the pre-trained model uses --nh="256" and not 100, as it is set in the default model. But when fine-tuning we should obviously be able to change this parameter, so I find it strange that it doesn't work.
2) I tried different alphabet lengths while fine-tuning. By default the alphabet is '0123456789abcdefghijklmnopqrstuvwxyz', with nb_classes = 37.
I tried the following:
A) Adding one character, say Z (or another character such as , . /): '0123456789abcdefghijklmnopqrstuvwxyzZ'. I got the same error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
B) Removing one character and adding another (removing z and adding /): '0123456789abcdefghijklmnopqrstuvwxy/'.
I get the same error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
C) Setting the alphabet to digits only: '0123456789'.
The same error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
3) Training a new model from scratch with a different alphabet length and a different --nh works perfectly.
Do you have any idea how to make fine-tuning work with a different alphabet length and a different architecture? Thanks a lot.
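As a quick sanity check on the alphabets listed above, the output layer size is `len(alphabet) + 1`, the extra class being the CTC blank mentioned earlier in the thread. The sketch below only does that arithmetic and loads no model; the labels and the constant name `PRETRAINED_NCLASS` are made up for this illustration.

```python
PRETRAINED_NCLASS = 37  # 36 characters + 1 CTC blank, as stated in the thread

alphabets = {
    "default": "0123456789abcdefghijklmnopqrstuvwxyz",
    "A (one character added)": "0123456789abcdefghijklmnopqrstuvwxyzZ",
    "B (z swapped for /)": "0123456789abcdefghijklmnopqrstuvwxy/",
    "C (digits only)": "0123456789",
}

# Number of output classes each alphabet would need.
nclass = {label: len(chars) + 1 for label, chars in alphabets.items()}

for label, n in nclass.items():
    status = "matches" if n == PRETRAINED_NCLASS else "differs from"
    print(f"{label}: {n} classes, {status} the pre-trained classifier size")
```

Alphabet B still gives 37 classes, so if the same error appeared there, alphabet length alone would not explain it; the thread does not say which `--nh` was used for that run.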