Thank you for providing this repository and the pre-trained models. I have been trying to reproduce the results from the paper "What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis" using the provided best model, but I'm not getting the same results.
Steps Taken:
I followed the instructions provided in the README.
I used the TPS-ResNet-BiLSTM-Attn model mentioned in the paper. This is my configuration for testing the model:
sys.argv = [
    'colab_kernel_launcher.py',
    '--eval_data', '/content/drive/MyDrive/dtr_bechmark/evaluation',
    '--benchmark_all_eval',
    '--workers', '4',
    '--batch_size', '192',
    '--saved_model', '/content/drive/MyDrive/dtr_bechmark/model/TPS-ResNet-BiLSTM-Attn.pth',
    '--batch_max_length', '25',
    '--imgH', '32',
    '--imgW', '100',
    '--character', '0123456789abcdefghijklmnopqrstuvwxyz',
    '--PAD',
    '--Transformation', 'TPS',
    '--FeatureExtraction', 'ResNet',
    '--SequenceModeling', 'BiLSTM',
    '--Prediction', 'Attn',
    '--num_fiducial', '20',
    '--input_channel', '1',
    '--output_channel', '512',
    '--hidden_size', '256',
    '--sensitive',
]
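One thing I am unsure about in this configuration is how '--sensitive' interacts with '--character'. The sketch below is only my assumption about the behavior, not code copied from the repository: I assume the test script replaces the character set with string.printable[:-6] when '--sensitive' is set, and that the attention label converter adds two special tokens ([GO] and [s]). The helper names effective_charset and attn_num_classes are hypothetical and exist only for this illustration.

```python
import string


def effective_charset(character: str, sensitive: bool) -> str:
    # Assumption: with --sensitive, the configured --character value is
    # ignored and replaced by the printable ASCII set minus whitespace.
    return string.printable[:-6] if sensitive else character


def attn_num_classes(charset: str) -> int:
    # Assumption: the attention converter prepends two special tokens,
    # [GO] and [s], to the character list.
    return len(charset) + 2


configured = "0123456789abcdefghijklmnopqrstuvwxyz"

for sensitive in (False, True):
    charset = effective_charset(configured, sensitive)
    print(
        f"sensitive={sensitive}: "
        f"{len(charset)} characters, {attn_num_classes(charset)} output classes"
    )
```

If these assumptions hold, the number of output classes differs depending on whether '--sensitive' is passed, so the flag may not match the checkpoint I am loading. I have not confirmed this against the repository code.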
Expected Results:
I expected to see results similar to those reported in the paper.
Actual Results:
However, the results I obtained were significantly different. I have attached a screenshot for reference.
Additional Information:
I am using a Google Colab environment with no GPU to test the best model. The Python version is 3.10.12 and the torch version is 2.3.4+cu121.
My Colab notebook: https://colab.research.google.com/drive/1TrczF8gUHfWxsL1_-dwmzR9OY11npA7o?usp=sharing
Could you please provide some guidance or suggestions on what might be going wrong? Is there a possibility that I'm missing something critical?
Thank you for your help and for your contributions to the community!
Best regards, Fahmy Nadhif