PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

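Recognition: infer_rec.py (training engine) and predict_rec.py (inference engine) give different predictions for the same image with the same model. How should this be adjusted? #2362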

Closed JawerZ closed 3 years ago

JawerZ commented 3 years ago

aistudio@jupyter-363593-1563909:~/work/PaddleOCR-release-2.0$ python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=output/rec_chinese_lite_v2.0/best_accuracy Global.load_static_weights=false Global.infer_img=images/cc.jpg Global.use_gpu=false /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp [2021/03/29 16:45:49] root INFO: Architecture : [2021/03/29 16:45:49] root INFO: Backbone : [2021/03/29 16:45:49] root INFO: model_name : small [2021/03/29 16:45:49] root INFO: name : MobileNetV3 [2021/03/29 16:45:49] root INFO: scale : 0.5 [2021/03/29 16:45:49] root INFO: small_stride : [1, 2, 2, 2] [2021/03/29 16:45:49] root INFO: Head : [2021/03/29 16:45:49] root INFO: fc_decay : 1e-05 [2021/03/29 16:45:49] root INFO: name : CTCHead [2021/03/29 16:45:49] root INFO: Neck : [2021/03/29 16:45:49] root INFO: encoder_type : rnn [2021/03/29 16:45:49] root INFO: hidden_size : 48 [2021/03/29 16:45:49] root INFO: name : SequenceEncoder [2021/03/29 16:45:49] root INFO: Transform : None [2021/03/29 16:45:49] root INFO: algorithm : CRNN [2021/03/29 16:45:49] root INFO: model_type : rec [2021/03/29 16:45:49] root INFO: Eval : [2021/03/29 16:45:49] root INFO: dataset : [2021/03/29 16:45:49] root INFO: data_dir : ./data [2021/03/29 16:45:49] root INFO: label_file_list : ['./data/test_back_rec_label.txt'] [2021/03/29 16:45:49] root INFO: name : SimpleDataSet [2021/03/29 16:45:49] root INFO: transforms : [2021/03/29 16:45:49] root INFO: DecodeImage : [2021/03/29 16:45:49] root INFO: channel_first : False [2021/03/29 16:45:49] root INFO: img_mode : BGR [2021/03/29 16:45:49] root INFO: CTCLabelEncode : None [2021/03/29 16:45:49] root INFO: RecResizeImg : [2021/03/29 16:45:49] root INFO: image_shape : [3, 32, 320] [2021/03/29 16:45:49] root INFO: KeepKeys : 
[2021/03/29 16:45:49] root INFO: keep_keys : ['image', 'label', 'length'] [2021/03/29 16:45:49] root INFO: loader : [2021/03/29 16:45:49] root INFO: batch_size_per_card : 256 [2021/03/29 16:45:49] root INFO: drop_last : False [2021/03/29 16:45:49] root INFO: num_workers : 0 [2021/03/29 16:45:49] root INFO: shuffle : False [2021/03/29 16:45:49] root INFO: Global : [2021/03/29 16:45:49] root INFO: cal_metric_during_train : True [2021/03/29 16:45:49] root INFO: character_dict_path : ppocr/utils/ppocr_keys_v1.txt [2021/03/29 16:45:49] root INFO: character_type : ch [2021/03/29 16:45:49] root INFO: checkpoints : ./output/rec_chinese_lite_v2.0/best_accuracy [2021/03/29 16:45:49] root INFO: debug : False [2021/03/29 16:45:49] root INFO: distributed : False [2021/03/29 16:45:49] root INFO: epoch_num : 500 [2021/03/29 16:45:49] root INFO: eval_batch_step : [0, 200] [2021/03/29 16:45:49] root INFO: infer_img : images/cc.jpg [2021/03/29 16:45:49] root INFO: infer_mode : False [2021/03/29 16:45:49] root INFO: load_static_weights : False [2021/03/29 16:45:49] root INFO: log_smooth_window : 20 [2021/03/29 16:45:49] root INFO: max_text_length : 25 [2021/03/29 16:45:49] root INFO: pretrained_model : output/rec_chinese_lite_v2.0/best_accuracy [2021/03/29 16:45:49] root INFO: print_batch_step : 10 [2021/03/29 16:45:49] root INFO: save_epoch_step : 3 [2021/03/29 16:45:49] root INFO: save_inference_dir : None [2021/03/29 16:45:49] root INFO: save_model_dir : ./output/rec_chinese_lite_v2.0 [2021/03/29 16:45:49] root INFO: use_gpu : False [2021/03/29 16:45:49] root INFO: use_space_char : True [2021/03/29 16:45:49] root INFO: use_visualdl : False [2021/03/29 16:45:49] root INFO: Loss : [2021/03/29 16:45:49] root INFO: name : CTCLoss [2021/03/29 16:45:49] root INFO: Metric : [2021/03/29 16:45:49] root INFO: main_indicator : acc [2021/03/29 16:45:49] root INFO: name : RecMetric [2021/03/29 16:45:49] root INFO: Optimizer : [2021/03/29 16:45:49] root INFO: beta1 : 0.9 [2021/03/29 16:45:49] 
root INFO: beta2 : 0.999 [2021/03/29 16:45:49] root INFO: lr : [2021/03/29 16:45:49] root INFO: learning_rate : 0.001 [2021/03/29 16:45:49] root INFO: name : Cosine [2021/03/29 16:45:49] root INFO: name : Adam [2021/03/29 16:45:49] root INFO: regularizer : [2021/03/29 16:45:49] root INFO: factor : 1e-05 [2021/03/29 16:45:49] root INFO: name : L2 [2021/03/29 16:45:49] root INFO: PostProcess : [2021/03/29 16:45:49] root INFO: name : CTCLabelDecode [2021/03/29 16:45:49] root INFO: Train : [2021/03/29 16:45:49] root INFO: dataset : [2021/03/29 16:45:49] root INFO: data_dir : ./data/ [2021/03/29 16:45:49] root INFO: label_file_list : ['./data/train_back_rec_label.txt'] [2021/03/29 16:45:49] root INFO: name : SimpleDataSet [2021/03/29 16:45:49] root INFO: transforms : [2021/03/29 16:45:49] root INFO: DecodeImage : [2021/03/29 16:45:49] root INFO: channel_first : False [2021/03/29 16:45:49] root INFO: img_mode : BGR [2021/03/29 16:45:49] root INFO: RecAug : None [2021/03/29 16:45:49] root INFO: CTCLabelEncode : None [2021/03/29 16:45:49] root INFO: RecResizeImg : [2021/03/29 16:45:49] root INFO: image_shape : [3, 32, 320] [2021/03/29 16:45:49] root INFO: KeepKeys : [2021/03/29 16:45:49] root INFO: keep_keys : ['image', 'label', 'length'] [2021/03/29 16:45:49] root INFO: loader : [2021/03/29 16:45:49] root INFO: batch_size_per_card : 256 [2021/03/29 16:45:49] root INFO: drop_last : True [2021/03/29 16:45:49] root INFO: num_workers : 0 [2021/03/29 16:45:49] root INFO: shuffle : True [2021/03/29 16:45:49] root INFO: train with paddle 2.0.1 and device CPUPlace [2021/03/29 16:45:49] root INFO: resume from ./output/rec_chinese_lite_v2.0/best_accuracy [2021/03/29 16:45:49] root INFO: infer_img: images/cc.jpg [2021/03/29 16:45:49] root INFO: result: ('[5', 0.9997947) [2021/03/29 16:45:49] root INFO: success! 
aistudio@jupyter-363593-1563909:~/work/PaddleOCR-release-2.0$ python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.load_static_weights=False Global.save_inference_dir=./inference/rec_crnn/ /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp [2021/03/29 16:46:48] root INFO: resume from ./output/rec_chinese_lite_v2.0/best_accuracy [2021/03/29 16:46:49] root INFO: inference model is saved to ./inference/rec_crnn//inference aistudio@jupyter-363593-1563909:~/work/PaddleOCR-release-2.0$ python3 tools/infer/predict_rec.py --image_dir="./images/cc.jpg" --rec_model_dir="./inference/rec_crnn/" --rec_image_shape="3, 32, 100" --rec_char_type="ch" `--use_gpu=false /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp [2021/03/29 16:47:34] root INFO: Predicts of ./images/cc.jpg:('L5', 0.97822416) [2021/03/29 16:47:34] root INFO: Total predict time for 1 images, cost: 0.027

How should I adjust this?

WenmuZhou commented 3 years ago

python3 tools/infer/predict_rec.py --image_dir="./images/cc.jpg" --rec_model_dir="./inference/rec_crnn/" --rec_image_shape="3, 32, 320" --rec_char_type="ch" --use_gpu=false
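The suggestion above amounts to making the command-line shape agree with the `image_shape : [3, 32, 320]` value printed in the training config. A minimal sketch of that check follows; the helper names `parse_image_shape` and `shapes_match` are hypothetical and are not part of PaddleOCR.

```python
def parse_image_shape(text):
    """Turn a string such as "3, 32, 320" into [3, 32, 320]."""
    return [int(part) for part in text.split(",")]


def shapes_match(cli_shape_text, config_shape):
    """Return True when the command-line shape equals the config shape."""
    return parse_image_shape(cli_shape_text) == list(config_shape)


# RecResizeImg.image_shape as printed in the logged config.
config_shape = [3, 32, 320]

print(shapes_match("3, 32, 100", config_shape))  # first run in the thread: False
print(shapes_match("3, 32, 320", config_shape))  # suggested command: True
```

A matching shape is necessary but, as the rest of the thread shows, it may not be sufficient on its own.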

JawerZ commented 3 years ago

python3 tools/infer/predict_rec.py --image_dir="./images/cc.jpg" --rec_model_dir="./inference/rec_crnn/" --rec_image_shape="3, 32, 320" --rec_char_type="ch" --use_gpu=false

After changing the input image shape, the predictions are still inconsistent. Is there some other problem?
aistudio@jupyter-363593-1563909:~/work/PaddleOCR-release-2.0$ python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=output/rec_chinese_lite_v2.0/best_accuracy Global.load_static_weights=false Global.infer_img=images/cc.jpg Global.use_gpu=false /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp [2021/03/30 14:34:55] root INFO: Architecture : [2021/03/30 14:34:55] root INFO: Backbone : [2021/03/30 14:34:55] root INFO: model_name : small [2021/03/30 14:34:55] root INFO: name : MobileNetV3 [2021/03/30 14:34:55] root INFO: scale : 0.5 [2021/03/30 14:34:55] root INFO: small_stride : [1, 2, 2, 2] [2021/03/30 14:34:55] root INFO: Head : [2021/03/30 14:34:55] root INFO: fc_decay : 1e-05 [2021/03/30 14:34:55] root INFO: name : CTCHead [2021/03/30 14:34:55] root INFO: Neck : [2021/03/30 14:34:55] root INFO: encoder_type : rnn [2021/03/30 14:34:55] root INFO: hidden_size : 48 [2021/03/30 14:34:55] root INFO: name : SequenceEncoder [2021/03/30 14:34:55] root INFO: Transform : None [2021/03/30 14:34:55] root INFO: algorithm : CRNN [2021/03/30 14:34:55] root INFO: model_type : rec [2021/03/30 14:34:55] root INFO: Eval : [2021/03/30 14:34:55] root INFO: dataset : [2021/03/30 14:34:55] root INFO: data_dir : ./data/ [2021/03/30 14:34:55] root INFO: label_file_list : ['./data/test_back_rec_label.txt'] [2021/03/30 14:34:55] root INFO: name : SimpleDataSet [2021/03/30 14:34:55] root INFO: transforms : [2021/03/30 14:34:55] root INFO: DecodeImage : [2021/03/30 14:34:55] root INFO: channel_first : False [2021/03/30 14:34:55] root INFO: img_mode : BGR [2021/03/30 14:34:55] root INFO: CTCLabelEncode : None [2021/03/30 14:34:55] root INFO: RecResizeImg : [2021/03/30 14:34:55] root INFO: image_shape : [3, 32, 320] [2021/03/30 
14:34:55] root INFO: KeepKeys : [2021/03/30 14:34:55] root INFO: keep_keys : ['image', 'label', 'length'] [2021/03/30 14:34:55] root INFO: loader : [2021/03/30 14:34:55] root INFO: batch_size_per_card : 256 [2021/03/30 14:34:55] root INFO: drop_last : False [2021/03/30 14:34:55] root INFO: num_workers : 1 [2021/03/30 14:34:55] root INFO: shuffle : False [2021/03/30 14:34:55] root INFO: Global : [2021/03/30 14:34:55] root INFO: cal_metric_during_train : True [2021/03/30 14:34:55] root INFO: character_dict_path : ppocr/utils/ppocr_keys_v1.txt [2021/03/30 14:34:55] root INFO: character_type : ch [2021/03/30 14:34:55] root INFO: checkpoints : None [2021/03/30 14:34:55] root INFO: debug : False [2021/03/30 14:34:55] root INFO: distributed : False [2021/03/30 14:34:55] root INFO: epoch_num : 500 [2021/03/30 14:34:55] root INFO: eval_batch_step : [0, 200] [2021/03/30 14:34:55] root INFO: infer_img : images/cc.jpg [2021/03/30 14:34:55] root INFO: infer_mode : False [2021/03/30 14:34:55] root INFO: load_static_weights : False [2021/03/30 14:34:55] root INFO: log_smooth_window : 20 [2021/03/30 14:34:55] root INFO: max_text_length : 25 [2021/03/30 14:34:55] root INFO: pretrained_model : output/rec_chinese_lite_v2.0/best_accuracy [2021/03/30 14:34:55] root INFO: print_batch_step : 10 [2021/03/30 14:34:55] root INFO: save_epoch_step : 3 [2021/03/30 14:34:55] root INFO: save_inference_dir : ./inference/recc/ [2021/03/30 14:34:55] root INFO: save_model_dir : ./output/rec_chinese_lite [2021/03/30 14:34:55] root INFO: use_gpu : False [2021/03/30 14:34:55] root INFO: use_space_char : True [2021/03/30 14:34:55] root INFO: use_visualdl : False [2021/03/30 14:34:55] root INFO: Loss : [2021/03/30 14:34:55] root INFO: name : CTCLoss [2021/03/30 14:34:55] root INFO: Metric : [2021/03/30 14:34:55] root INFO: main_indicator : acc [2021/03/30 14:34:55] root INFO: name : RecMetric [2021/03/30 14:34:55] root INFO: Optimizer : [2021/03/30 14:34:55] root INFO: beta1 : 0.9 [2021/03/30 14:34:55] 
root INFO: beta2 : 0.999 [2021/03/30 14:34:55] root INFO: lr : [2021/03/30 14:34:55] root INFO: learning_rate : 0.001 [2021/03/30 14:34:55] root INFO: name : Cosine [2021/03/30 14:34:55] root INFO: name : Adam [2021/03/30 14:34:55] root INFO: regularizer : [2021/03/30 14:34:55] root INFO: factor : 1e-05 [2021/03/30 14:34:55] root INFO: name : L2 [2021/03/30 14:34:55] root INFO: PostProcess : [2021/03/30 14:34:55] root INFO: name : CTCLabelDecode [2021/03/30 14:34:55] root INFO: Train : [2021/03/30 14:34:55] root INFO: dataset : [2021/03/30 14:34:55] root INFO: data_dir : ./data/ [2021/03/30 14:34:55] root INFO: label_file_list : ['./data/train_back_rec_label.txt'] [2021/03/30 14:34:55] root INFO: name : SimpleDataSet [2021/03/30 14:34:55] root INFO: transforms : [2021/03/30 14:34:55] root INFO: DecodeImage : [2021/03/30 14:34:55] root INFO: channel_first : False [2021/03/30 14:34:55] root INFO: img_mode : BGR [2021/03/30 14:34:55] root INFO: RecAug : None [2021/03/30 14:34:55] root INFO: CTCLabelEncode : None [2021/03/30 14:34:55] root INFO: RecResizeImg : [2021/03/30 14:34:55] root INFO: image_shape : [3, 32, 320] [2021/03/30 14:34:55] root INFO: KeepKeys : [2021/03/30 14:34:55] root INFO: keep_keys : ['image', 'label', 'length'] [2021/03/30 14:34:55] root INFO: loader : [2021/03/30 14:34:55] root INFO: batch_size_per_card : 256 [2021/03/30 14:34:55] root INFO: drop_last : True [2021/03/30 14:34:55] root INFO: num_workers : 1 [2021/03/30 14:34:55] root INFO: shuffle : True [2021/03/30 14:34:55] root INFO: train with paddle 2.0.1 and device CPUPlace [2021/03/30 14:34:55] root INFO: load pretrained model from ['output/rec_chinese_lite_v2.0/best_accuracy'] [2021/03/30 14:34:55] root INFO: infer_img: images/cc.jpg [2021/03/30 14:34:55] root INFO: result: ('[5', 0.9997947) [2021/03/30 14:34:55] root INFO: success! 
aistudio@jupyter-363593-1563909:~/work/PaddleOCR-release-2.0$ python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.load_static_weights=False Global.pretrained_model=output/rec_chinese_lite_v2.0/best_accuracy Global.save_inference_dir=./inference/rec_crnn/ /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp W0330 14:37:04.993947 12875 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1 W0330 14:37:04.999480 12875 device_context.cc:372] device: 0, cuDNN Version: 7.6. [2021/03/30 14:37:10] root INFO: load pretrained model from ['output/rec_chinese_lite_v2.0/best_accuracy'] [2021/03/30 14:37:11] root INFO: inference model is saved to ./inference/rec_crnn//inference aistudio@jupyter-363593-1563909:~/work/PaddleOCR-release-2.0$ python3 tools/infer/predict_rec.py --image_dir="./images/cc.jpg" --rec_model_dir="./inference/rec_crnn/" --rec_image_shape="3, 32, 320" --rec_char_type="ch" --use_gpu=false /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp [2021/03/30 14:37:29] root INFO: Predicts of ./images/cc.jpg:('L5', 0.97822416) [2021/03/30 14:37:29] root INFO: Total predict time for 1 images, cost: 0.018

WenmuZhou commented 3 years ago

Could you compare the images fed to the model in the two inference runs and check whether they differ?

JawerZ commented 3 years ago

Could you compare the images fed to the model in the two inference runs and check whether they differ?

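It is the same image, and the code is the latest version with no modifications. However, I am not yet sure whether the two images are identical after resizing.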
Printing the image data after https://github.com/PaddlePaddle/PaddleOCR/blob/cfdefbe1bae14d4216db08e65253e6265b6927a5/tools/infer_rec.py#L97 gives:
[2021/03/30 16:09:03] root INFO: [[[[-0.12941176, 0.04313731, 0.10588241, ..., 0. , 0. , 0. ], [-0.18431371, -0.02745098, 0.02745104, ..., 0. , 0. , 0. ], [-0.19999999, -0.05098039, -0.00392157, ..., 0. , 0. , 0. ], ..., [-0.20784312, -0.05098039, 0.00392163, ..., 0. , 0. , 0. ], [-0.20784312, -0.05098039, -0.00392157, ..., 0. , 0. , 0. ], [-0.20784312, -0.05098039, -0.00392157, ..., 0. , 0. , 0. ]], [[ 0.51372552, 0.68627453, 0.74901962, ..., 0. , 0. , 0. ], [ 0.54509807, 0.70196080, 0.75686276, ..., 0. , 0. , 0. ], [ 0.56078434, 0.70980394, 0.75686276, ..., 0. , 0. , 0. ], ..., [ 0.54509807, 0.70196080, 0.75686276, ..., 0. , 0. , 0. ], [ 0.54509807, 0.70196080, 0.74901962, ..., 0. , 0. , 0. ], [ 0.55294120, 0.70980394, 0.75686276, ..., 0. , 0. , 0. ]], [[-0.04313725, 0.13725495, 0.20000005, ..., 0. , 0. , 0. ], [-0.08235294, 0.07450986, 0.12941182, ..., 0. , 0. , 0. ], [-0.09803921, 0.05882359, 0.10588241, ..., 0. , 0. , 0. ], ..., [-0.09803921, 0.05098045, 0.09803927, ..., 0. , 0. , 0. ], [-0.10588235, 0.05098045, 0.09803927, ..., 0. , 0. , 0. ], [-0.09803921, 0.05882359, 0.10588241, ..., 0. , 0. , 0. ]]]])

Printing the image data after https://github.com/PaddlePaddle/PaddleOCR/blob/cfdefbe1bae14d4216db08e65253e6265b6927a5/tools/infer/predict_rec.py#L201 gives:
[2021/03/30 16:09:03] root INFO: [[[-0.12941176 0.05098045 0.10588241 ... 0.10588241 0.20000005 0.5764706 ] [-0.18431371 -0.02745098 0.02745104 ... 0.02745104 0.13725495 0.56078434] [-0.19999999 -0.04313725 -0.00392157 ... 0.00392163 0.11372554 0.5529412 ] ... [-0.20784312 -0.04313725 0.00392163 ... -0.01176471 0.09803927 0.5294118 ] [-0.20784312 -0.05098039 -0.00392157 ... -0.01176471 0.10588241 0.54509807] [-0.20784312 -0.05098039 -0.00392157 ... -0.01176471 0.11372554 0.5529412 ]] [[ 0.5137255 0.69411767 0.7490196 ... 0.75686276 0.78039217 0.85882354] [ 0.54509807 0.70980394 0.75686276 ... 0.7647059 0.7882353 0.88235295] [ 0.56078434 0.7176471 0.75686276 ... 0.75686276 0.7882353 0.8901961 ] ... [ 0.54509807 0.70980394 0.75686276 ... 0.7411765 0.7647059 0.8666667 ] [ 0.54509807 0.70980394 0.7490196 ... 0.7411765 0.77254903 0.8745098 ] [ 0.5529412 0.70980394 0.75686276 ... 0.7490196 0.7882353 0.8901961 ]] [[-0.04313725 0.14509809 0.20000005 ... 0.20000005 0.28627455 0.6156863 ] [-0.08235294 0.082353 0.12941182 ... 0.12941182 0.2313726 0.60784316] [-0.09803921 0.05882359 0.10588241 ... 0.10588241 0.21568632 0.60784316] ... [-0.09803921 0.05882359 0.09803927 ... 0.09019613 0.19215691 0.5764706 ] [-0.10588235 0.05882359 0.09803927 ... 0.09019613 0.20000005 0.58431375] [-0.09803921 0.05882359 0.10588241 ... 0.09803927 0.20784318 0.6 ]]]
I am not sure whether this is the right way to obtain the data. If it is not, what is the correct way?
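One way to read the two dumps above is to compare them numerically rather than by eye. The sketch below is an illustration only: the helper names are hypothetical, and the sample rows contain just the edge values visible in the first printed row of each dump, with the elided middle left out.

```python
def trailing_zero_columns(plane):
    """Count all-zero columns at the right edge of a 2-D list of floats."""
    width = len(plane[0])
    count = 0
    for col in range(width - 1, -1, -1):
        if any(row[col] != 0.0 for row in plane):
            break
        count += 1
    return count


def max_abs_diff(a, b, cols):
    """Largest absolute difference over the first `cols` columns of two planes."""
    return max(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a[:cols], row_b[:cols])
    )


# Edge values copied from the first printed row of each dump (middle elided).
from_infer_rec = [[-0.12941176, 0.04313731, 0.10588241, 0.0, 0.0, 0.0]]
from_predict_rec = [[-0.12941176, 0.05098045, 0.10588241, 0.10588241, 0.20000005, 0.5764706]]

print(trailing_zero_columns(from_infer_rec))    # zero columns on the right edge
print(trailing_zero_columns(from_predict_rec))  # no zero columns on the right edge
print(max_abs_diff(from_infer_rec, from_predict_rec, cols=3))
```

On these sample rows the first array ends in zero columns while the second does not, and even the leading values differ slightly, which suggests the two scripts are not producing the same input tensor.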

WenmuZhou commented 3 years ago

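infer_rec.py pads the image when its width falls short of the target width, whereas predict_rec.py does not perform this step. Changing the line at https://github.com/PaddlePaddle/PaddleOCR/blob/cfdefbe1bae14d4216db08e65253e6265b6927a5/tools/infer/predict_rec.py#L78 to resized_w = imgW should make the two consistent.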
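The difference described above can be pictured as two ways of choosing the canvas width. The following is a rough sketch under stated assumptions, not the actual PaddleOCR code: `fixed_canvas` models "keep aspect ratio, then zero-pad to the configured width", and `dynamic_canvas` assumes the canvas width is derived from the image's own aspect ratio so that nothing is left to pad. Both function names and the example image size are hypothetical.

```python
import math


def fixed_canvas(h, w, img_h=32, img_w=320):
    """Resize keeping aspect ratio, then zero-pad up to a fixed canvas width.

    Returns (resized_width, canvas_width).
    """
    resized_w = min(img_w, math.ceil(img_h * w / h))
    return resized_w, img_w


def dynamic_canvas(h, w, img_h=32):
    """Assumed alternative: canvas width follows the image's aspect ratio.

    Returns (resized_width, canvas_width); the two are equal or nearly so,
    leaving little or no room for padding.
    """
    canvas_w = int(img_h * w / h)
    resized_w = min(canvas_w, math.ceil(img_h * w / h))
    return resized_w, canvas_w


# Hypothetical 48 x 150 crop (height x width).
print(fixed_canvas(48, 150))    # (100, 320): 220 zero-padded columns
print(dynamic_canvas(48, 150))  # (100, 100): no padding
```

Under these assumptions the same crop reaches the model with different widths and different right-hand content, which is consistent with the padded and unpadded arrays printed earlier in the thread.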

JawerZ commented 3 years ago

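infer_rec.py pads the image when its width falls short of the target width, whereas predict_rec.py does not perform this step.

https://github.com/PaddlePaddle/PaddleOCR/blob/cfdefbe1bae14d4216db08e65253e6265b6927a5/tools/infer/predict_rec.py#L78

Changing this line to resized_w = imgW should make the two consistent.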

Resolved. infer_rec.py processes the image with the method linked below, which differs from the processing in predict_rec.py. Thank you for your help. https://github.com/PaddlePaddle/PaddleOCR/blob/cfdefbe1bae14d4216db08e65253e6265b6927a5/ppocr/data/imaug/rec_img_aug.py#L109
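For readers who want to reproduce the padded values seen earlier, here is a minimal pure-Python sketch of "normalize, then zero-pad one row". It assumes the normalization is (pixel / 255 - 0.5) / 0.5, which is consistent with the printed values but is not confirmed in this thread. The pixel values 111, 133 and 141 are back-computed from the printed floats, and the function names are hypothetical.

```python
def normalize_pixel(value):
    """Map an 8-bit intensity to [-1, 1] (assumed normalization)."""
    return (value / 255.0 - 0.5) / 0.5


def normalize_and_pad_row(pixels, canvas_w):
    """Normalize a row of 8-bit pixels and zero-pad it on the right to canvas_w."""
    row = [normalize_pixel(p) for p in pixels[:canvas_w]]
    return row + [0.0] * (canvas_w - len(row))


# Pixel values inferred from the first printed row (-0.12941176, 0.04313731, 0.10588241).
row = normalize_and_pad_row([111, 133, 141], canvas_w=6)
print(row)
```

Note that the padding value is 0.0 in normalized space, which corresponds to mid-gray rather than black, so a padded canvas and an unpadded one can look quite different to the model.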