roatienza / deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Apache License 2.0

Available Model weights. #36

Open schreiterjp opened 1 year ago

schreiterjp commented 1 year ago

Hi, thanks for the nice work. I'm trying to get the available model weights for vitstr_base_patch16_224_aug to work with the infer.py script. So far it is not working, because the model is not built properly. Could you please advise me on how to load the pretrained model from the given checkpoint? Thanks.

roatienza commented 1 year ago

I think I only made the tiny models available as JIT models. For other models, please use the test script, for example:

CUDA_VISIBLE_DEVICES=0 python3 test.py --eval_data data_lmdb_release/evaluation \
--benchmark_all_eval --Transformation None --FeatureExtraction None \
--SequenceModeling None --Prediction None --Transformer \
--TransformerModel=vitstr_tiny_patch16_224 \
--sensitive --data_filtering_off --imgH 224 --imgW 224 \
--saved_model <path_to/best_accuracy.pth>
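
For anyone who wants to run the same evaluation for a different checkpoint, here is a minimal sketch that assembles the command above from Python. It only mirrors the flags shown in the reply. The helper names `build_test_command` and `as_shell_line` are hypothetical, as is the checkpoint path in the usage lines. The model name `vitstr_base_patch16_224` is an assumption inferred from the weight name in the question, so check which values test.py accepts for --TransformerModel before relying on it.

```python
import shlex


def build_test_command(model_name, checkpoint_path,
                       eval_data="data_lmdb_release/evaluation", gpu="0"):
    """Return (env, argv) for test.py, mirroring the flags in the reply above.

    model_name and checkpoint_path are supplied by the caller; nothing here
    checks that test.py actually accepts the given model name.
    """
    argv = [
        "python3", "test.py",
        "--eval_data", eval_data,
        "--benchmark_all_eval",
        "--Transformation", "None",
        "--FeatureExtraction", "None",
        "--SequenceModeling", "None",
        "--Prediction", "None",
        "--Transformer",
        f"--TransformerModel={model_name}",
        "--sensitive",
        "--data_filtering_off",
        "--imgH", "224",
        "--imgW", "224",
        "--saved_model", checkpoint_path,
    ]
    env = {"CUDA_VISIBLE_DEVICES": gpu}
    return env, argv


def as_shell_line(env, argv):
    """Render the command as a single shell line for copy and paste."""
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"{prefix} {shlex.join(argv)}"


if __name__ == "__main__":
    # Assumed model name and hypothetical checkpoint path; adjust both.
    env, argv = build_test_command(
        "vitstr_base_patch16_224",
        "weights/vitstr_base_patch16_224_aug.pth",
    )
    print(as_shell_line(env, argv))
```

To execute the command instead of printing it, the same `argv` can be passed to `subprocess.run(argv, env={**os.environ, **env})` from the repository root.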