PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
44.1k stars 7.81k forks

training acc is 1.0, but get wrong prediction when use tools/infer_rec.py with the same image #2757

Closed billy800413 closed 3 years ago

billy800413 commented 3 years ago

Hi, thanks for the code. When I fine-tune the recognition model (rec_icdar15_train.yml, modifying only label_file_list), the log file shows an acc of 1.0. But when I run tools/infer_rec.py on the same image, it gives the wrong answer.

training log

[2021/05/11 13:36:27] root INFO: Architecture :
[2021/05/11 13:36:27] root INFO: Backbone :
[2021/05/11 13:36:27] root INFO: model_name : large
[2021/05/11 13:36:27] root INFO: name : MobileNetV3
[2021/05/11 13:36:27] root INFO: scale : 0.5
[2021/05/11 13:36:27] root INFO: Head :
[2021/05/11 13:36:27] root INFO: fc_decay : 0
[2021/05/11 13:36:27] root INFO: name : CTCHead
[2021/05/11 13:36:27] root INFO: Neck :
[2021/05/11 13:36:27] root INFO: encoder_type : rnn
[2021/05/11 13:36:27] root INFO: hidden_size : 96
[2021/05/11 13:36:27] root INFO: name : SequenceEncoder
[2021/05/11 13:36:27] root INFO: Transform : None
[2021/05/11 13:36:27] root INFO: algorithm : CRNN
[2021/05/11 13:36:27] root INFO: model_type : rec
[2021/05/11 13:36:27] root INFO: Eval :
[2021/05/11 13:36:27] root INFO: dataset :
[2021/05/11 13:36:27] root INFO: data_dir : ./train_data/
[2021/05/11 13:36:27] root INFO: label_file_list : ['./train_data/val_list.txt']
[2021/05/11 13:36:27] root INFO: name : SimpleDataSet
[2021/05/11 13:36:27] root INFO: transforms :
[2021/05/11 13:36:27] root INFO: DecodeImage :
[2021/05/11 13:36:27] root INFO: channel_first : False
[2021/05/11 13:36:27] root INFO: img_mode : BGR
[2021/05/11 13:36:27] root INFO: CTCLabelEncode : None
[2021/05/11 13:36:27] root INFO: RecResizeImg :
[2021/05/11 13:36:27] root INFO: image_shape : [3, 32, 100]
[2021/05/11 13:36:27] root INFO: KeepKeys :
[2021/05/11 13:36:27] root INFO: keep_keys : ['image', 'label', 'length']
[2021/05/11 13:36:27] root INFO: loader :
[2021/05/11 13:36:27] root INFO: batch_size_per_card : 32
[2021/05/11 13:36:27] root INFO: drop_last : False
[2021/05/11 13:36:27] root INFO: num_workers : 4
[2021/05/11 13:36:27] root INFO: shuffle : False
[2021/05/11 13:36:27] root INFO: use_shared_memory : False
[2021/05/11 13:36:27] root INFO: Global :
[2021/05/11 13:36:27] root INFO: cal_metric_during_train : True
[2021/05/11 13:36:27] root INFO: character_dict_path : ppocr/utils/ic15_dict.txt
[2021/05/11 13:36:27] root INFO: character_type : ch
[2021/05/11 13:36:27] root INFO: checkpoints : None
[2021/05/11 13:36:27] root INFO: debug : False
[2021/05/11 13:36:27] root INFO: distributed : False
[2021/05/11 13:36:27] root INFO: epoch_num : 1000
[2021/05/11 13:36:27] root INFO: eval_batch_step : [0, 2000]
[2021/05/11 13:36:27] root INFO: infer_img : /home/ubuntu/Desktop/opencvOCR/PaddleOCR/train_data/train/31OCR_20201217-20201221T015245Z-001#Recipe_1_194800399_20201217 030159_0.jpg
[2021/05/11 13:36:27] root INFO: infer_mode : False
[2021/05/11 13:36:27] root INFO: log_smooth_window : 20
[2021/05/11 13:36:27] root INFO: max_text_length : 15
[2021/05/11 13:36:27] root INFO: pretrained_model : pretrain/rec_icdar15_train/rec_mv3_none_bilstm_ctc_v2.0_train/best_accuracy
[2021/05/11 13:36:27] root INFO: print_batch_step : 10
[2021/05/11 13:36:27] root INFO: save_epoch_step : 50
[2021/05/11 13:36:27] root INFO: save_inference_dir : /home/ubuntu/Desktop/opencvOCR/PaddleOCR/output/inference
[2021/05/11 13:36:27] root INFO: save_model_dir : ./output/rec/ic15/
[2021/05/11 13:36:27] root INFO: save_res_path : ./output/rec/predicts_ic15.txt
[2021/05/11 13:36:27] root INFO: use_gpu : True
[2021/05/11 13:36:27] root INFO: use_space_char : False
[2021/05/11 13:36:27] root INFO: use_visualdl : True
[2021/05/11 13:36:27] root INFO: Loss :
[2021/05/11 13:36:27] root INFO: name : CTCLoss
[2021/05/11 13:36:27] root INFO: Metric :
[2021/05/11 13:36:27] root INFO: main_indicator : acc
[2021/05/11 13:36:27] root INFO: name : RecMetric
[2021/05/11 13:36:27] root INFO: Optimizer :
[2021/05/11 13:36:27] root INFO: beta1 : 0.9
[2021/05/11 13:36:27] root INFO: beta2 : 0.999
[2021/05/11 13:36:27] root INFO: lr :
[2021/05/11 13:36:27] root INFO: learning_rate : 0.0005
[2021/05/11 13:36:27] root INFO: name : Adam
[2021/05/11 13:36:27] root INFO: regularizer :
[2021/05/11 13:36:27] root INFO: factor : 0
[2021/05/11 13:36:27] root INFO: name : L2
[2021/05/11 13:36:27] root INFO: PostProcess :
[2021/05/11 13:36:27] root INFO: name : CTCLabelDecode
[2021/05/11 13:36:27] root INFO: Train :
[2021/05/11 13:36:27] root INFO: dataset :
[2021/05/11 13:36:27] root INFO: data_dir : ./train_data/
[2021/05/11 13:36:27] root INFO: label_file_list : ['./train_data/train_list.txt']
[2021/05/11 13:36:27] root INFO: name : SimpleDataSet
[2021/05/11 13:36:27] root INFO: transforms :
[2021/05/11 13:36:27] root INFO: DecodeImage :
[2021/05/11 13:36:27] root INFO: channel_first : False
[2021/05/11 13:36:27] root INFO: img_mode : BGR
[2021/05/11 13:36:27] root INFO: CTCLabelEncode : None
[2021/05/11 13:36:27] root INFO: RecResizeImg :
[2021/05/11 13:36:27] root INFO: image_shape : [3, 32, 100]
[2021/05/11 13:36:27] root INFO: KeepKeys :
[2021/05/11 13:36:27] root INFO: keep_keys : ['image', 'label', 'length']
[2021/05/11 13:36:27] root INFO: loader :
[2021/05/11 13:36:27] root INFO: batch_size_per_card : 32
[2021/05/11 13:36:27] root INFO: drop_last : True
[2021/05/11 13:36:27] root INFO: num_workers : 8
[2021/05/11 13:36:27] root INFO: shuffle : True
[2021/05/11 13:36:27] root INFO: use_shared_memory : False
[2021/05/11 13:36:27] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2021/05/11 13:36:27] root INFO: Initialize indexs of datasets:['./train_data/train_list.txt']
[2021/05/11 13:36:27] root INFO: Initialize indexs of datasets:['./train_data/val_list.txt']
[2021/05/11 13:36:29] root INFO: load pretrained model from ['pretrain/rec_icdar15_train/rec_mv3_none_bilstm_ctc_v2.0_train/best_accuracy']
[2021/05/11 13:36:29] root INFO: train dataloader has 21 iters
[2021/05/11 13:36:29] root INFO: valid dataloader has 22 iters
[2021/05/11 13:36:29] root INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
[2021/05/11 13:36:29] root INFO: Initialize indexs of datasets:['./train_data/train_list.txt']
[2021/05/11 13:36:30] root INFO: epoch: [1/1000], iter: 10, lr: 0.000500, loss: 54.146976, acc: 0.031250, norm_edit_dis: 0.260417, reader_cost: 0.01755 s, batch_cost: 0.07407 s, samples: 352, ips: 475.21481
[2021/05/11 13:36:30] root INFO: epoch: [1/1000], iter: 20, lr: 0.000500, loss: 31.921005, acc: 0.093750, norm_edit_dis: 0.422112, reader_cost: 0.00011 s, batch_cost: 0.03816 s, samples: 320, ips: 838.53115
[2021/05/11 13:36:31] root INFO: save model in ./output/rec/ic15/latest
...
[2021/05/11 13:37:06] root INFO: epoch: [27/1000], iter: 550, lr: 0.000500, loss: 0.108742, acc: 1.000000, norm_edit_dis: 1.000000, reader_cost: 0.01660 s, batch_cost: 0.04084 s, samples: 160, ips: 391.80245
[2021/05/11 13:37:07] root INFO: epoch: [27/1000], iter: 560, lr: 0.000500, loss: 0.134176, acc: 1.000000, norm_edit_dis: 1.000000, reader_cost: 0.00010 s, batch_cost: 0.03855 s, samples: 320, ips: 830.07241
[2021/05/11 13:37:07] root INFO: epoch: [27/1000], iter: 566, lr: 0.000500, loss: 0.134176, acc: 1.000000, norm_edit_dis: 1.000000, reader_cost: 0.00005 s, batch_cost: 0.02288 s, samples: 192, ips: 839.26994
[2021/05/11 13:37:07] root INFO: save model in ./output/rec/ic15/latest

ddz-mark commented 3 years ago

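I ran into the same problem. I used only 7 different images for text recognition. The eval accuracy reached 1.0, but the infer accuracy was only 0.42. After printing the logs, I found that eval also produced results for 7 images, but 4 of them were repeats. In other words, only 3 distinct images were actually evaluated, and all 3 were predicted correctly, which gave 1.0. In infer, there were results for all 7 images with no repeats, and the other 4 were predicted wrong, so the overall result was 0.42.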
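The arithmetic in the comment above can be checked with a small sketch. The labels, predictions, and the exact pattern of repeated indices are made up for illustration; only the counts (7 images, 3 predicted correctly, 4 repeated results in eval) come from the comment, and the `accuracy` helper is hypothetical, not PaddleOCR's RecMetric.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match their labels."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical data: 7 distinct images, the model gets only the first 3 right.
labels = ["a", "b", "c", "d", "e", "f", "g"]
preds = ["a", "b", "c", "x", "x", "x", "x"]

# Infer sees every image exactly once.
infer_acc = accuracy(preds, labels)

# Assumed eval behaviour: still 7 results, but 4 of them repeat the
# 3 correctly predicted images, so the 4 wrong images are never scored.
eval_indices = [0, 1, 2, 0, 1, 2, 0]
eval_acc = accuracy([preds[i] for i in eval_indices],
                    [labels[i] for i in eval_indices])

print(f"infer acc: {infer_acc:.4f}")
print(f"eval acc:  {eval_acc:.4f}")
```

Under these assumptions the infer accuracy is 3/7, which is consistent with the reported 0.42, while the eval accuracy is 1.0 because the wrongly predicted images never enter the metric.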

ddz-mark commented 3 years ago

Could the developers please take a look at what the problem is?

hungnv21292 commented 3 years ago

Can you change character_type to something other than 'ch' when you run prediction?
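One way a character-set setting could matter, sketched generically: if the character list used at prediction time differs from the one used in training, the same model output decodes to different text. The decoder below is a plain greedy CTC decode written for illustration, not PaddleOCR's CTCLabelDecode; the two character lists, the blank index of 0, and the index sequence are all assumptions.

```python
def ctc_greedy_decode(indices, charset, blank=0):
    """Collapse consecutive repeats, drop blanks, map indices to characters."""
    chars = []
    prev = None
    for idx in indices:
        if idx != blank and idx != prev:
            # Index 0 is reserved for the blank, so character i sits at i - 1.
            chars.append(charset[idx - 1])
        prev = idx
    return "".join(chars)

# Hypothetical character lists: same symbols, different order.
train_charset = list("0123456789abcdefghijklmnopqrstuvwxyz")
infer_charset = list("abcdefghijklmnopqrstuvwxyz0123456789")

# Hypothetical per-timestep argmax output of a recognition model.
indices = [3, 3, 0, 11, 0, 11, 12]

print(ctc_greedy_decode(indices, train_charset))  # 2aab
print(ctc_greedy_decode(indices, infer_charset))  # ckkl
```

Whether this is what happened in this issue is not confirmed in the thread; the sketch only shows why checking that training and prediction use the same character settings is a reasonable first step.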

paddle-bot-old[bot] commented 3 years ago

Since you haven't replied for more than 3 months, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.