As many people have reported before me (#73, #275, #145), there is an issue when fine-tuning the saved model on your own dataset.
There were multiple solutions suggested:
@ku21fan suggested commenting out the lines for fine-tuning in transformation.py
using the --sensitive flag
But the problem persists:
loading pretrained model from saved_models/TPS-ResNet-BiLSTM-Attn.pth
Traceback (most recent call last):
  File "/home/soufiane/neovision/deep-text-org/train.py", line 321, in <module>
    train(opt)
  File "/home/soufiane/neovision/deep-text-org/train.py", line 84, in train
    model.load_state_dict(torch.load(opt.saved_model), strict=False)
  File "/home/soufiane/anaconda3/envs/clova/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
size mismatch for module.Prediction.attention_cell.rnn.weight_ih: copying a param with shape torch.Size([1024, 294]) from checkpoint, the shape in current model is torch.Size([1024, 280]).
size mismatch for module.Prediction.generator.weight: copying a param with shape torch.Size([38, 256]) from checkpoint, the shape in current model is torch.Size([24, 256]).
size mismatch for module.Prediction.generator.bias: copying a param with shape torch.Size([38]) from checkpoint, the shape in current model is torch.Size([24]).
Process finished with exit code 1
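The three size mismatches can be sanity-checked with a little arithmetic. The sketch below assumes (this is not confirmed by the traceback itself) that the attention cell is an LSTM cell whose input is a hidden-size context vector concatenated with a one-hot character vector, so that weight_ih has shape [4 * hidden, hidden + num_class]. The helper name infer_sizes is made up for illustration.

```python
# Shapes copied from the traceback above: (checkpoint, current model).
ckpt_rnn_w, model_rnn_w = (1024, 294), (1024, 280)   # attention_cell.rnn.weight_ih
ckpt_gen_w, model_gen_w = (38, 256), (24, 256)       # generator.weight


def infer_sizes(rnn_w, gen_w):
    """Return (hidden_size, num_class) implied by the two weight shapes.

    Assumption: rnn is an LSTM cell fed with [context ; one-hot char],
    so weight_ih is [4 * hidden, hidden + num_class].
    """
    num_class, hidden = gen_w
    gates, rnn_input = rnn_w
    assert gates == 4 * hidden, "weight_ih rows should be 4 * hidden_size"
    assert rnn_input == hidden + num_class, "weight_ih cols should be hidden + num_class"
    return hidden, num_class


ckpt_hidden, ckpt_classes = infer_sizes(ckpt_rnn_w, ckpt_gen_w)
model_hidden, model_classes = infer_sizes(model_rnn_w, model_gen_w)

print("checkpoint:", ckpt_hidden, "hidden,", ckpt_classes, "classes")
print("model:     ", model_hidden, "hidden,", model_classes, "classes")
print("class count difference:", ckpt_classes - model_classes)
```

Under that assumption both sides are internally consistent and differ only in the number of output classes (38 in the checkpoint, 24 in the current model), which would point at the character set rather than at the rest of the network.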
Now here is the curious part.
The issue only comes up when the checkpoint matches the architecture, that is, when you train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn model with TPS-ResNet-BiLSTM-Attn.pth
and also when you train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction CTC model with TPS-ResNet-BiLSTM-CTC.pth
BUT:
When you swap CTC and Attn and train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn model with TPS-ResNet-BiLSTM-CTC.pth
or train a --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction CTC model with TPS-ResNet-BiLSTM-Attn.pth
the issue disappears and the training actually starts!!
My best guess is that the names TPS-ResNet-BiLSTM-Attn.pth and TPS-ResNet-BiLSTM-CTC.pth were swapped,
and that TPS-ResNet-BiLSTM-Attn.pth is actually the model for TPS-ResNet-BiLSTM-CTC
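A possible workaround, independent of whether the guess above is right: as far as I know, load_state_dict(..., strict=False) skips keys that exist on only one side, but still raises when a key exists on both sides with different shapes. That may also be why the swapped combinations start training (the Prediction keys simply do not match by name and get skipped). The sketch below drops shape-mismatched entries from the checkpoint before loading. The helper name drop_mismatched is made up, and it only relies on the values having a .shape attribute, as tensors do.

```python
from collections import OrderedDict


def drop_mismatched(checkpoint_state, model_state):
    """Split a checkpoint into loadable entries and skipped key names.

    An entry is kept only if the model has a parameter with the same
    name and the same shape; everything else is reported as skipped.
    """
    kept, skipped = OrderedDict(), []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(tensor.shape):
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped
```

In train.py this could be used roughly as `state, skipped = drop_mismatched(torch.load(opt.saved_model), model.state_dict())` followed by `model.load_state_dict(state, strict=False)`. The skipped parameters (here the Prediction layers) keep their fresh initialization and are trained from scratch, while the rest of the network is fine-tuned from the checkpoint.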
cc: @ku21fan