roatienza / deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Apache License 2.0

model state loading issue #11

Closed · rouarouatbi closed this issue 2 years ago

rouarouatbi commented 2 years ago

I tried to run the model with the ViTSTR-Tiny weights, but I got Missing key(s) and Unexpected key(s) in state_dict errors while loading the model state.

roatienza commented 2 years ago

Pls post the exact error. Thanks.

rouarouatbi commented 2 years ago

RuntimeError: Error(s) in loading state_dict for Model:
Missing key(s) in state_dict: "Transformation.LocalizationNetwork.conv.0.weight", "Transformation.LocalizationNetwork.conv.1.weight", "Transformation.LocalizationNetwork.conv.1.bias", "Transformation.LocalizationNetwork.conv.1.running_mean", "Transformation.LocalizationNetwork.conv.1.running_var", "Transformation.LocalizationNetwork.conv.4.weight", "Transformation.LocalizationNetwork.conv.5.weight", "Transformation.LocalizationNetwork.conv.5.bias", "Transformation.LocalizationNetwork.conv.5.running_mean", "Transformation.LocalizationNetwork.conv.5.running_var", "Transformation.LocalizationNetwork.conv.8.weight", "Transformation.LocalizationNetwork.conv.9.weight", "Transformation.LocalizationNetwork.conv.9.bias", "Transformation.LocalizationNetwork.conv.9.running_mean", "Transformation.LocalizationNetwork.conv.9.running_var", "Transformation.LocalizationNetwork.conv.12.weight", "Transformation.LocalizationNetwork.conv.13.weight", "Transformation.LocalizationNetwork.conv.13.bias", "Transformation.LocalizationNetwork.conv.13.running_mean", "Transformation.LocalizationNetwork.conv.13.running_var", "Transformation.LocalizationNetwork.localization_fc1.0.weight", "Transformation.LocalizationNetwork.localization_fc1.0.bias", "Transformation.LocalizationNetwork.localization_fc2.weight", "Transformation.LocalizationNetwork.localization_fc2.bias", "Transformation.GridGenerator.inv_delta_C", "Transformation.GridGenerator.P_hat", "FeatureExtraction.ConvNet.0.weight", "FeatureExtractio...
Unexpected key(s) in state_dict: "module.vitstr.cls_token", "module.vitstr.pos_embed", "module.vitstr.patch_embed.proj.weight", "module.vitstr.patch_embed.proj.bias", "module.vitstr.blocks.0.norm1.weight", "module.vitstr.blocks.0.norm1.bias", "module.vitstr.blocks.0.attn.qkv.weight", "module.vitstr.blocks.0.attn.qkv.bias", "module.vitstr.blocks.0.attn.proj.weight", "module.vitstr.blocks.0.attn.proj.bias", "module.vitstr.blocks.0.norm2.weight", "module.vitstr.blocks.0.norm2.bias", "module.vitstr.blocks.0.mlp.fc1.weight", "module.vitstr.blocks.0.mlp.fc1.bias", "module.vitstr.blocks.0.mlp.fc2.weight", "module.vitstr.blocks.0.mlp.fc2.bias", "module.vitstr.blocks.1.norm1.weight", "module.vitstr.blocks.1.norm1.bias", "module.vitstr.blocks.1.attn.qkv.weight", "module.vitstr.blocks.1.attn.qkv.bias", "module.vitstr.blocks.1.attn.proj.weight", "module.vitstr.blocks.1.attn.proj.bias", "module.vitstr.blocks.1.norm2.weight", "module.vitstr.blocks.1.norm2.bias", "module.vitstr.blocks.1.mlp.fc1.weight", "module.vitstr.blocks.1.mlp.fc1.bias", "module.vitstr.blocks.1.mlp.fc2.weight", "module.vitstr.blocks.1.mlp.fc2.bias", "module.vitstr.blocks.2.norm1.weight", "module.vitstr.blocks.2.norm1.bias", "module.vitstr.blocks.2.attn.qkv.weight", "module.vitstr.blocks.2.attn.qkv.bias", "module.vitstr.blocks.2.attn.proj.weight", "module.vitstr.blocks.2.attn.proj.bias", "module.vitstr.blocks.2.norm2.weight", "module.vitstr.blocks.2.norm2.bias", "module.vitstr.blocks.2.mlp.fc1.weight", "module.vitstr.blocks.2.mlp.fc1.bias", "module.vitstr.blocks.2.mlp.fc2.weight", "module.vitstr...
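The two key lists in the traceback already point at the mismatch: the checkpoint contains "module.vitstr.*" parameters (a ViTSTR backbone saved from a torch.nn.DataParallel wrapper), while the instantiated model expects "Transformation.*" and "FeatureExtraction.*" parameters, i.e. a TPS+CNN architecture selected by the command-line options. A minimal sketch for inspecting a checkpoint before building the model; the local path is an assumption, and the file is assumed to be a plain state_dict, as the "loading state_dict for Model" traceback suggests:

import torch

# Assumed local path to the downloaded ViTSTR-Tiny weights.
ckpt_path = "vitstr_tiny_patch16_224.pth"

# Assuming the file is a plain state_dict (a dict of parameter name -> tensor).
state_dict = torch.load(ckpt_path, map_location="cpu")

# Print a few keys to see which architecture the weights belong to.
for key in list(state_dict)[:5]:
    print(key)

# Keys like "module.vitstr.cls_token" indicate a ViTSTR model trained under
# DataParallel (hence the "module." prefix). Keys like
# "Transformation.LocalizationNetwork.conv.0.weight" would instead indicate a
# TPS/CNN model, which is what mismatched command-line options would build.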

roatienza commented 2 years ago

Pls check that you are using the right command-line options and pth file. I just checked; all ok.

CUDA_VISIBLE_DEVICES=0 python3 test.py --eval_data data_lmdb_release/evaluation  --benchmark_all_eval --Transformation None --FeatureExtraction None  --SequenceModeling None --Prediction None --Transformer --sensitive --data_filtering_off  --imgH 224 --imgW 224 --TransformerModel=vitstr_tiny_patch16_224 --saved_model  /home/rowel/Downloads/vitstr_tiny_patch16_224.pth
....
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/IC03_860   dataset: /
sub-directory:  /.       num samples: 860
Acc 93.023       normalized_ED 0.973
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/IC03_867   dataset: /
sub-directory:  /.       num samples: 867
Acc 92.618       normalized_ED 0.972
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/IC13_857   dataset: /
sub-directory:  /.       num samples: 857
Acc 90.432       normalized_ED 0.972
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/IC13_1015  dataset: /
sub-directory:  /.       num samples: 1015
Acc 89.163       normalized_ED 0.952
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/IC15_1811  dataset: /
sub-directory:  /.       num samples: 1811
Acc 72.501       normalized_ED 0.898
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/IC15_2077  dataset: /
sub-directory:  /.       num samples: 2077
Acc 67.068       normalized_ED 0.856
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/SVTP       dataset: /
sub-directory:  /.       num samples: 645
Acc 74.574       normalized_ED 0.896
--------------------------------------------------------------------------------
dataset_root:    data_lmdb_release/evaluation/CUTE80     dataset: /
sub-directory:  /.       num samples: 288
Acc 66.667       normalized_ED 0.837
--------------------------------------------------------------------------------
accuracy: IIIT5k_3000: 83.833   SVT: 82.689     IC03_860: 93.023        IC03_867: 92.618        IC13_857: 90.432        IC13_1015: 89.163       IC15_1811: 72.501       IC15_2077: 67.068 SVTP: 74.574    CUTE80: 66.667  total_accuracy: 80.484  averaged_infer_time: 0.099      # parameters: 5.445
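For completeness, a minimal sketch of loading these weights outside test.py. It assumes `model` is the repo's Model built with the same ViTSTR options as in the command above, and that the only difference from the saved checkpoint is the "module." prefix that torch.nn.DataParallel adds to every key (test.py avoids the issue by wrapping the model in DataParallel before calling load_state_dict):

import torch

def load_vitstr_weights(model, ckpt_path):
    """Hypothetical helper: load a ViTSTR checkpoint into `model`.

    `model` is assumed to be built with the same options used for training
    (--Transformer --TransformerModel=vitstr_tiny_patch16_224, imgH/imgW 224).
    """
    state_dict = torch.load(ckpt_path, map_location="cpu")
    # Checkpoints saved from a DataParallel-wrapped model prefix every key
    # with "module."; strip it when the target model is not wrapped.
    if not isinstance(model, torch.nn.DataParallel):
        state_dict = {
            (k[len("module."):] if k.startswith("module.") else k): v
            for k, v in state_dict.items()
        }
    # strict=True surfaces any remaining key mismatch as an explicit error.
    model.load_state_dict(state_dict, strict=True)
    return model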