clovaai / deep-text-recognition-benchmark

Text recognition (optical character recognition) with deep learning methods, ICCV 2019
Apache License 2.0

demo.py ignores characters the weights were trained on #294

Open HSILA opened 3 years ago

HSILA commented 3 years ago

Hi. First, I fine-tuned the None-VGG-BiLSTM-CTC model from the pre-trained weights with these characters: +-/0123456789ABCDEFGHJKLMNPQRSTUVWYZ

CUDA_VISIBLE_DEVICES=0 python3 train.py --train_data result/train --valid_data result/val \
--Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC \
--data_filtering_off --num_iter 20000 --valInterval 200 --FT --saved_model None-VGG-BiLSTM-CTC.pth \
--imgH 224 --imgW 500 --PAD

I have a test set (separate from the validation set) and I want to measure the model's performance on it. For inference, I hard-coded the above characters into demo.py and ran:

CUDA_VISIBLE_DEVICES=0 python3 demo.py --Transformation None --FeatureExtraction VGG \
--SequenceModeling BiLSTM --Prediction CTC --image_folder mydata/TEST-SET \
--saved_model saved_models/None-VGG-BiLSTM-CTC-20K/best_accuracy.pth

The log is saved to log_demo_result.txt. There's an odd observation:

I fine-tuned on uppercase characters only and changed the --character value in both demo.py and train.py, but most of the predicted labels contain lowercase characters. Can anybody explain why?
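To illustrate how lowercase output could appear at all, here is a minimal sketch of greedy CTC-style decoding. It is not the repository's code: `build_index_to_char` and `greedy_decode` are hypothetical helpers, index 0 as the blank token is an assumption, and `DEFAULT` is only an assumed built-in character list. It shows that the same predicted indices decode to different text depending on which character list is active at inference time, so lowercase predictions may mean the hard-coded list was not the one actually used.

```python
# Hypothetical, simplified CTC-style greedy decoder (not the repository's code).

def build_index_to_char(character):
    # Assumption of this sketch: index 0 is reserved for the CTC blank token.
    return ['[blank]'] + list(character)


def greedy_decode(indices, character):
    table = build_index_to_char(character)
    chars = []
    prev = 0
    for idx in indices:
        # Skip blanks and collapse repeated indices, as in greedy CTC decoding.
        if idx != 0 and idx != prev:
            chars.append(table[idx])
        prev = idx
    return ''.join(chars)


# Character list from the post above (36 symbols).
TRAINED = '+-/0123456789ABCDEFGHJKLMNPQRSTUVWYZ'
# Assumed default list (also 36 symbols), used here only for comparison.
DEFAULT = '0123456789abcdefghijklmnopqrstuvwxyz'

# Made-up per-frame argmax indices, for illustration only.
indices = [14, 14, 0, 4, 5, 0, 36]

print(greedy_decode(indices, TRAINED))  # A01Z
print(greedy_decode(indices, DEFAULT))  # d34z
```

Both lists contain 36 symbols, so a model with the same number of output classes would accept either list without a shape error, and a mismatch would only show up in the decoded text.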

HSILA commented 3 years ago

I also tried training from scratch, without fine-tuning (same characters). Again, when I run inference with demo.py, it produces only numerical predictions, although my character list contains letters too. Is this a bug, or am I doing something wrong?
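One possible explanation for digit-only predictions, offered as an assumption and not confirmed in this thread: if the data pipeline lowercases labels before filtering them against the character list (unless a case-sensitive mode is enabled), an uppercase-only list would strip every letter from the training labels. The sketch below is a hypothetical stand-in, not the repository's code; `preprocess_label` and its `sensitive` parameter are names made up for this example.

```python
import re

# Character list from the post above.
CHARACTER = '+-/0123456789ABCDEFGHJKLMNPQRSTUVWYZ'


def preprocess_label(label, character, sensitive=False):
    # Assumed behaviour: lowercase the label unless case-sensitive mode is on.
    if not sensitive:
        label = label.lower()
    # Then drop every character that is not in the configured list.
    # re.escape keeps '+', '-' and '/' literal inside the character class.
    out_of_char = f'[^{re.escape(character)}]'
    return re.sub(out_of_char, '', label)


print(preprocess_label('AB-123/C', CHARACTER))                  # -123/
print(preprocess_label('AB-123/C', CHARACTER, sensitive=True))  # AB-123/C
```

If something like this happens during training, the model never sees a letter in its targets, which would be consistent with predictions that contain only digits and symbols. Printing a few labels after preprocessing is a quick way to check.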