CRNN recognition model: after converting the training model to an inference model, some images give inconsistent predictions

gdwu1427 commented 2 years ago

Please provide the following information to quickly locate the problem

System Environment: Ubuntu 16.04
Version / Related components: paddleocr 2.0.1; paddlepaddle-gpu 2.2.2; PaddleOCR release 2.4
Command Code: python3 tools/infer_rec.py and python tools/infer/predict_rec.py
Complete Error Message:

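Hello! I have a question: after converting the CRNN recognition model from a training model to an inference model, some images are recognized incorrectly. What could be the cause? Thanks. I am using the latest version of PaddleOCR. infer_rec.py gives the correct result, but the predict_rec.py script gives the wrong one.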

python3 tools/infer_rec.py:

python3 tools/infer_rec.py -c /home/znlh/dnn/gdwu/PaddleOCR/configs/rec/rec_icdar15_train.yml -o Global.infer_img="/home/znlh/dnn/gdwu/PaddleOCR/train_data/ic15_data/test/qd_021758_2.jpg" Global.checkpoints="/home/znlh/dnn/gdwu/PaddleOCR/output/rec/ic15/best_accuracy"
[2022/02/18 10:56:01] root INFO: Architecture :
[2022/02/18 10:56:01] root INFO:     Backbone :
[2022/02/18 10:56:01] root INFO:         model_name : large
[2022/02/18 10:56:01] root INFO:         name : MobileNetV3
[2022/02/18 10:56:01] root INFO:         scale : 0.5
[2022/02/18 10:56:01] root INFO:     Head :
[2022/02/18 10:56:01] root INFO:         fc_decay : 0
[2022/02/18 10:56:01] root INFO:         name : CTCHead
[2022/02/18 10:56:01] root INFO:     Neck :
[2022/02/18 10:56:01] root INFO:         encoder_type : rnn
[2022/02/18 10:56:01] root INFO:         hidden_size : 96
[2022/02/18 10:56:01] root INFO:         name : SequenceEncoder
[2022/02/18 10:56:01] root INFO:     Transform : None
[2022/02/18 10:56:01] root INFO:     algorithm : CRNN
[2022/02/18 10:56:01] root INFO:     model_type : rec
[2022/02/18 10:56:01] root INFO: Eval :
[2022/02/18 10:56:01] root INFO:     dataset :
[2022/02/18 10:56:01] root INFO:         data_dir : ./train_data/ic15_data/
[2022/02/18 10:56:01] root INFO:         label_file_list : ['./train_data/ic15_data/rec_gt_test.txt']
[2022/02/18 10:56:01] root INFO:         name : SimpleDataSet
[2022/02/18 10:56:01] root INFO:         transforms :
[2022/02/18 10:56:01] root INFO:             DecodeImage :
[2022/02/18 10:56:01] root INFO:                 channel_first : False
[2022/02/18 10:56:01] root INFO:                 img_mode : BGR
[2022/02/18 10:56:01] root INFO:             CTCLabelEncode : None
[2022/02/18 10:56:01] root INFO:             RecResizeImg :
[2022/02/18 10:56:01] root INFO:                 image_shape : [3, 32, 100]
[2022/02/18 10:56:01] root INFO:             KeepKeys :
[2022/02/18 10:56:01] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/02/18 10:56:01] root INFO:     loader :
[2022/02/18 10:56:01] root INFO:         batch_size_per_card : 256
[2022/02/18 10:56:01] root INFO:         drop_last : False
[2022/02/18 10:56:01] root INFO:         num_workers : 4
[2022/02/18 10:56:01] root INFO:         shuffle : False
[2022/02/18 10:56:01] root INFO:         use_shared_memory : False
[2022/02/18 10:56:01] root INFO: Global :
[2022/02/18 10:56:01] root INFO:     cal_metric_during_train : True
[2022/02/18 10:56:01] root INFO:     character_dict_path : /home/znlh/dnn/gdwu/PaddleOCR/ppocr/utils/IC15_dict.txt
[2022/02/18 10:56:01] root INFO:     checkpoints : /home/znlh/dnn/gdwu/PaddleOCR/output/rec/ic15/best_accuracy
[2022/02/18 10:56:01] root INFO:     debug : False
[2022/02/18 10:56:01] root INFO:     distributed : False
[2022/02/18 10:56:01] root INFO:     epoch_num : 2000
[2022/02/18 10:56:01] root INFO:     eval_batch_step : [0, 2000]
[2022/02/18 10:56:01] root INFO:     infer_img : /home/znlh/dnn/gdwu/PaddleOCR/train_data/ic15_data/test/qd_021758_2.jpg
[2022/02/18 10:56:01] root INFO:     infer_mode : False
[2022/02/18 10:56:01] root INFO:     log_smooth_window : 20
[2022/02/18 10:56:01] root INFO:     max_text_length : 25
[2022/02/18 10:56:01] root INFO:     pretrained_model : /home/znlh/dnn/gdwu/PaddleOCR/pretrain_models/rec_mv3_none_bilstm_ctc_v2.0_train/best_accuracy
[2022/02/18 10:56:01] root INFO:     print_batch_step : 50
[2022/02/18 10:56:01] root INFO:     save_epoch_step : 20
[2022/02/18 10:56:01] root INFO:     save_inference_dir : ./
[2022/02/18 10:56:01] root INFO:     save_model_dir : ./output/rec/ic15/
[2022/02/18 10:56:01] root INFO:     save_res_path : ./output1/rec/predicts_ic15.txt
[2022/02/18 10:56:01] root INFO:     use_gpu : True
[2022/02/18 10:56:01] root INFO:     use_space_char : False
[2022/02/18 10:56:01] root INFO:     use_visualdl : False
[2022/02/18 10:56:01] root INFO: Loss :
[2022/02/18 10:56:01] root INFO:     name : CTCLoss
[2022/02/18 10:56:01] root INFO: Metric :
[2022/02/18 10:56:01] root INFO:     main_indicator : acc
[2022/02/18 10:56:01] root INFO:     name : RecMetric
[2022/02/18 10:56:01] root INFO: Optimizer :
[2022/02/18 10:56:01] root INFO:     beta1 : 0.9
[2022/02/18 10:56:01] root INFO:     beta2 : 0.999
[2022/02/18 10:56:01] root INFO:     lr :
[2022/02/18 10:56:01] root INFO:         learning_rate : 0.0005
[2022/02/18 10:56:01] root INFO:     name : Adam
[2022/02/18 10:56:01] root INFO:     regularizer :
[2022/02/18 10:56:01] root INFO:         factor : 0
[2022/02/18 10:56:01] root INFO:         name : L2
[2022/02/18 10:56:01] root INFO: PostProcess :
[2022/02/18 10:56:01] root INFO:     name : CTCLabelDecode
[2022/02/18 10:56:01] root INFO: Train :
[2022/02/18 10:56:01] root INFO:     dataset :
[2022/02/18 10:56:01] root INFO:         data_dir : ./train_data/ic15_data/
[2022/02/18 10:56:01] root INFO:         label_file_list : ['./train_data/ic15_data/rec_gt_train.txt']
[2022/02/18 10:56:01] root INFO:         name : SimpleDataSet
[2022/02/18 10:56:01] root INFO:         transforms :
[2022/02/18 10:56:01] root INFO:             DecodeImage :
[2022/02/18 10:56:01] root INFO:                 channel_first : False
[2022/02/18 10:56:01] root INFO:                 img_mode : BGR
[2022/02/18 10:56:01] root INFO:             CTCLabelEncode : None
[2022/02/18 10:56:01] root INFO:             RecResizeImg :
[2022/02/18 10:56:01] root INFO:                 image_shape : [3, 32, 100]
[2022/02/18 10:56:01] root INFO:             KeepKeys :
[2022/02/18 10:56:01] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/02/18 10:56:01] root INFO:     loader :
[2022/02/18 10:56:01] root INFO:         batch_size_per_card : 256
[2022/02/18 10:56:01] root INFO:         drop_last : True
[2022/02/18 10:56:01] root INFO:         num_workers : 8
[2022/02/18 10:56:01] root INFO:         shuffle : True
[2022/02/18 10:56:01] root INFO:         use_shared_memory : False
[2022/02/18 10:56:01] root INFO: profiler_options : None
[2022/02/18 10:56:01] root INFO: train with paddle 2.2.2 and device CUDAPlace(0)
W0218 10:56:01.829505 5040 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.2, Runtime API Version: 10.2
W0218 10:56:01.831923 5040 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022/02/18 10:56:03] root INFO: resume from /home/znlh/dnn/gdwu/PaddleOCR/output/rec/ic15/best_accuracy
[2022/02/18 10:56:03] root INFO: infer_img: /home/znlh/dnn/gdwu/PaddleOCR/train_data/ic15_data/test/qd_021758_2.jpg
[2022/02/18 10:56:03] root INFO: result: CAIU 1.0
[2022/02/18 10:56:03] root INFO: success!

python tools/infer/predict_rec.py:

python tools/infer/predict_rec.py --image_dir='/home/znlh/dnn/gdwu/PaddleOCR/train_data/ic15_data/test/qd_021758_2.jpg' --rec_model_dir="/home/znlh/dnn/gdwu/PaddleOCR/model/rec_crnn_ctn_gator_rotate"
Namespace(benchmark=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir=None, cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=384, det_limit_type='max', det_model_dir=None, det_pse_box_thresh=0.85, det_pse_box_type='box', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/IC15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, gpu_mem=500, image_dir='/home/znlh/dnn/gdwu/PaddleOCR/train_data/ic15_data/test/qd_021758_2.jpg', ir_optim=True, label_list=['0', '180'], max_batch_size=10, max_text_length=25, min_subgraph_size=15, precision='fp32', process_id=0, rec_algorithm='CRNN', rec_batch_num=6, rec_char_dict_path='./ppocr/utils/IC15_dict.txt', rec_image_shape='3, 32, 100', rec_model_dir='/home/znlh/dnn/gdwu/PaddleOCR/model/rec_crnn_ctn_gator_rotate', save_crop_res=False, save_log_path='./log_output/', show_log=True, total_process_num=1, use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=False, use_tensorrt=False, vis_font_path='./doc/fonts/simfang.ttf', warmup=False)
[2022/02/18 11:39:45] root INFO: Predicts of /home/znlh/dnn/gdwu/PaddleOCR/train_data/ic15_data/test/qd_021758_2.jpg:('0', 0.672191)

Wakp commented 2 years ago

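I am seeing the same problem: after converting the trained dynamic-graph model to a static-graph model, accuracy drops sharply. I am using PaddleOCR 2.4, and the model I trained is a detection model (DB).

Model export command: python3 tools/export_model.py -c configs/det/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=/home/output/ch_mv3_db_v2.0_pretrained/best_accuracy Global.save_inference_dir=./inference/det_db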

tink2123 commented 2 years ago

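This may be caused by a different resize at prediction time. If the training samples are all fairly short text, the model has learned the extra padding information, and removing the padding at prediction time can change the prediction accuracy.

You can try fixing the resize shape at prediction time to the training width by adding the following after line 110:

imgW = int(100)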

https://github.com/PaddlePaddle/PaddleOCR/blob/a8e0760f2700c64e7c5aa41f4f0a8e9d0ead6e1c/tools/infer/predict_rec.py#L110-L111
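
A minimal sketch of the width arithmetic behind this suggestion, not the actual PaddleOCR code. It assumes the inference script sizes the padded canvas from the image's own aspect ratio (32 * w / h), while training pads every sample to the fixed width in image_shape [3, 32, 100] shown in the log above. The helper canvas_and_resized_width and its parameters are hypothetical names used only for illustration.

```python
import math


def canvas_and_resized_width(img_h, img_w, target_h=32, train_w=100, fixed=False):
    """Return (resized_w, canvas_w) for one text-line image.

    Hypothetical helper: an assumed simplification of the preprocessing,
    not a copy of predict_rec.py.
    """
    ratio = img_w / float(img_h)
    if fixed:
        # Proposed fix: always pad to the training width.
        canvas_w = int(train_w)
    else:
        # Assumed inference behaviour: the canvas follows the aspect ratio.
        canvas_w = int(target_h * ratio)
    # The image is scaled to target_h and clipped to the canvas width.
    resized_w = min(canvas_w, int(math.ceil(target_h * ratio)))
    return resized_w, canvas_w


for fixed in (False, True):
    resized_w, canvas_w = canvas_and_resized_width(32, 48, fixed=fixed)
    print(f"fixed={fixed}: resized_w={resized_w}, canvas_w={canvas_w}, "
          f"zero-padded columns={canvas_w - resized_w}")
```

For a 32x48 crop, the aspect-ratio canvas leaves no padded columns, while the fixed canvas pads 52 columns on the right. Under the stated assumption, only the fixed canvas matches what a model trained on short text padded to width 100 has seen.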

GivanTsai commented 2 years ago

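> This may be caused by a different resize at prediction time. If the training samples are all fairly short text, the model has learned the extra padding information, and removing the padding at prediction time can change the prediction accuracy.
>
> You can try fixing the resize shape at prediction time to the training width by adding the following after line 110:
> imgW = int(100)
> https://github.com/PaddlePaddle/PaddleOCR/blob/a8e0760f2700c64e7c5aa41f4f0a8e9d0ead6e1c/tools/infer/predict_rec.py#L110-L111

Why does the width need to be fixed at 100?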

paddle-bot-old[bot] commented 2 years ago

Since you haven't replied for more than 3 months, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.

PaddlePaddle / PaddleOCR

CRNN recognition model: after converting the training model to an inference model, some images give inconsistent predictions #5511