clovaai / deep-text-recognition-benchmark

Text recognition (optical character recognition) with deep learning methods, ICCV 2019
Apache License 2.0

Saved model names may be wrong? issue while fine tuning #292

Open ezzaimsoufiane opened 3 years ago

ezzaimsoufiane commented 3 years ago

Hello,

As many people have reported before me (#73, #275, #145), there is an issue when fine-tuning the saved model on your own dataset. Multiple solutions were suggested in those threads, but the problem persists:

loading pretrained model from saved_models/TPS-ResNet-BiLSTM-Attn.pth
Traceback (most recent call last):
  File "/home/soufiane/neovision/deep-text-org/train.py", line 321, in <module>
    train(opt)
  File "/home/soufiane/neovision/deep-text-org/train.py", line 84, in train
    model.load_state_dict(torch.load(opt.saved_model), strict=False)
  File "/home/soufiane/anaconda3/envs/clova/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
    size mismatch for module.Prediction.attention_cell.rnn.weight_ih: copying a param with shape torch.Size([1024, 294]) from checkpoint, the shape in current model is torch.Size([1024, 280]).
    size mismatch for module.Prediction.generator.weight: copying a param with shape torch.Size([38, 256]) from checkpoint, the shape in current model is torch.Size([24, 256]).
    size mismatch for module.Prediction.generator.bias: copying a param with shape torch.Size([38]) from checkpoint, the shape in current model is torch.Size([24]).

Process finished with exit code 1
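For reference, strict=False only tolerates missing or unexpected keys; a key that exists on both sides with a different shape still raises, which is what the traceback shows. Both mismatches differ by the same amount (38 vs 24 and 294 vs 280). Below is a minimal sketch of how one could list the mismatched entries before loading. The helper name split_by_shape is hypothetical and not part of this repository; the shapes are copied from the traceback above.

```python
def split_by_shape(checkpoint_shapes, model_shapes):
    """Partition checkpoint keys into loadable and shape-mismatched ones.

    Both arguments map parameter name -> shape tuple.
    (Hypothetical helper, not part of the repository.)
    """
    loadable, mismatched = [], []
    for name, shape in checkpoint_shapes.items():
        if name not in model_shapes:
            continue  # unexpected key: strict=False already ignores these
        if tuple(shape) == tuple(model_shapes[name]):
            loadable.append(name)
        else:
            mismatched.append((name, tuple(shape), tuple(model_shapes[name])))
    return loadable, mismatched


# Shapes copied from the traceback above.
checkpoint_shapes = {
    "module.Prediction.attention_cell.rnn.weight_ih": (1024, 294),
    "module.Prediction.generator.weight": (38, 256),
    "module.Prediction.generator.bias": (38,),
}
model_shapes = {
    "module.Prediction.attention_cell.rnn.weight_ih": (1024, 280),
    "module.Prediction.generator.weight": (24, 256),
    "module.Prediction.generator.bias": (24,),
}

loadable, mismatched = split_by_shape(checkpoint_shapes, model_shapes)
for name, ckpt_shape, cur_shape in mismatched:
    print(f"skip {name}: checkpoint {ckpt_shape} vs model {cur_shape}")

# With real PyTorch objects (not run here), the same idea would be roughly:
#   state = torch.load(opt.saved_model)
#   current = model.state_dict()
#   ckpt_shapes = {k: tuple(v.shape) for k, v in state.items()}
#   cur_shapes = {k: tuple(v.shape) for k, v in current.items()}
#   keep, _ = split_by_shape(ckpt_shapes, cur_shapes)
#   model.load_state_dict({k: state[k] for k in keep}, strict=False)
```

Dropping the mismatched entries this way leaves those layers at their fresh initialization, so they would have to be learned during fine-tuning.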

Now here is the curious part. The issue only comes up when the model options match the checkpoint name: when you retrain a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn model with TPS-ResNet-BiLSTM-Attn.pth, and also when you train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction CTC model with TPS-ResNet-BiLSTM-CTC.pth.

BUT: when you swap CTC and Attn, that is, train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn model with TPS-ResNet-BiLSTM-CTC.pth, or train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction CTC model with TPS-ResNet-BiLSTM-Attn.pth, the issue disappears and the training actually starts!

My best guess is that the names of TPS-ResNet-BiLSTM-Attn.pth and TPS-ResNet-BiLSTM-CTC.pth were mixed up, and that TPS-ResNet-BiLSTM-Attn.pth is actually the model for TPS-ResNet-BiLSTM-CTC.
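One way to check this guess would be to look at the parameter names stored in each .pth file. The sketch below is only an illustration: guess_prediction_head is a hypothetical helper. The attention key names come from the traceback above, while the CTC key name (module.Prediction.weight) is an assumption that the CTC head is a single linear layer.

```python
def guess_prediction_head(keys):
    """Guess the prediction head type from state_dict key names.

    Hypothetical helper; the CTC rule is an assumption, see above.
    """
    keys = list(keys)
    # Keys like module.Prediction.attention_cell.rnn.weight_ih appear
    # in the traceback for the attention-based head.
    if any(".Prediction.attention_cell." in k for k in keys):
        return "Attn"
    # Assumption: a CTC head stores its weights directly under Prediction.
    if any(k.endswith("Prediction.weight") for k in keys):
        return "CTC"
    return "unknown"


# Key names taken from the traceback above.
attn_keys = [
    "module.Prediction.attention_cell.rnn.weight_ih",
    "module.Prediction.generator.weight",
    "module.Prediction.generator.bias",
]
# Assumed key names for a CTC checkpoint (not shown in this thread).
ctc_keys = ["module.Prediction.weight", "module.Prediction.bias"]

print(guess_prediction_head(attn_keys))
print(guess_prediction_head(ctc_keys))

# With a real checkpoint (not run here), the keys would come from:
#   state = torch.load("saved_models/TPS-ResNet-BiLSTM-Attn.pth")
#   print(guess_prediction_head(state.keys()))
```

If the file named TPS-ResNet-BiLSTM-Attn.pth reports "Attn" here, the names are probably not swapped and the size mismatch would have another cause.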

cc: @ku21fan

Treeboy2762 commented 3 years ago

https://github.com/clovaai/deep-text-recognition-benchmark/issues/210#issuecomment-883112527

Can you refer to this issue and see if the problem still persists?