PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

CRNN model: inference results differ from predict results #5797

Closed (Hadesong closed this issue 2 years ago)

Hadesong commented 2 years ago

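Example inference result:

  root INFO: infer_img: /home/aistudio/work/1 (3).png
  root INFO: result: 15XBXD 0.99732286

Example predict result:

  root INFO: Predicts of /home/aistudio/work/1 (3).png:('15XBD', 0.94964397)

The inference result is correct, while the predict results are generally missing one character. Is there some preprocessing or postprocessing setting that is still not aligned between the two? Please tell me where to look. Thanks.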

Configuration file

  Global:
    use_gpu: False
    epoch_num: 6000
    log_smooth_window: 20
    print_batch_step: 10
    save_model_dir: ./output/rec_en_number_lite
    save_epoch_step: 20
    eval_batch_step: [0, 5000]
    cal_metric_during_train: True
    pretrained_model: /home/aistudio/PaddleOCR/en_number_mobile_v2.0_rec_train/best_accuracy.pdparams
    checkpoints:
    save_inference_dir:
    use_visualdl: False
    infer_img:
    character_dict_path: /home/aistudio/dict.txt
    max_text_length: 25
    infer_mode: False
    use_space_char: False

  Optimizer:
    name: Adam
    beta1: 0.9
    beta2: 0.999
    lr:
      name: Cosine
      learning_rate: 0.001
    regularizer:
      name: 'L2'
      factor: 0.00001

  Architecture:
    model_type: rec
    algorithm: CRNN
    Transform:
    Backbone:
      name: MobileNetV3
      scale: 0.5
      model_name: small
      small_stride: [1, 2, 2, 2]
    Neck:
      name: SequenceEncoder
      encoder_type: rnn
      hidden_size: 48
    Head:
      name: CTCHead
      fc_decay: 0.00001

  Loss:
    name: CTCLoss

  PostProcess:
    name: CTCLabelDecode

  Metric:
    name: RecMetric
    main_indicator: acc

  Train:
    dataset:
      name: SimpleDataSet
      data_dir: /home/aistudio/data/
      label_file_list: ["/home/aistudio/data/rec_train_gt.txt"]
      transforms:
        - DecodeImage: # load image
            img_mode: BGR
            channel_first: False
        - RecAug:
        - CTCLabelEncode: # Class handling label
        - RecResizeImg:
            image_shape: [3, 32, 320]
        - KeepKeys:
            keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
    loader:
      shuffle: True
      batch_size_per_card: 256
      drop_last: True
      num_workers: 8

  Eval:
    dataset:
      name: SimpleDataSet
      data_dir: /home/aistudio/data/
      label_file_list: ["/home/aistudio/data/rec_test_gt.txt"]
      transforms:
        - DecodeImage: # load image
            img_mode: BGR
            channel_first: False
        - CTCLabelEncode: # Class handling label
        - RecResizeImg:
            image_shape: [3, 32, 320]
        - KeepKeys:
            keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
    loader:
      shuffle: False
      drop_last: False
      batch_size_per_card: 256
      num_workers: 8

tools/infer/utility.py

  parser.add_argument("--rec_algorithm", type=str, default='CRNN')
  parser.add_argument("--rec_model_dir", type=str)
  parser.add_argument("--rec_image_shape", type=str, default="3, 32, 320")
  parser.add_argument("--rec_batch_num", type=int, default=256)
  parser.add_argument("--max_text_length", type=int, default=25)
  parser.add_argument("--rec_char_dict_path", type=str, default="/home/aistudio/dict.txt")
  parser.add_argument("--use_space_char", type=str2bool, default=False)
  parser.add_argument("--vis_font_path", type=str, default="./doc/fonts/simfang.ttf")
  parser.add_argument("--drop_score", type=float, default=0.5)
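
One way to check the alignment the question asks about is to compare the values in the training config with the defaults of the prediction script. The sketch below is only illustrative: it hard-codes the relevant values from the config and argument list above instead of reading the real files, the str2bool body is an assumed stand-in, and parse_shape and check_alignment are hypothetical helper names that are not part of PaddleOCR.

```python
import argparse


def str2bool(value):
    # Assumed stand-in for the str2bool helper referenced in the argument list.
    return str(value).lower() in ("true", "t", "1")


def build_parser():
    # Only the recognition arguments that affect pre/post-processing are reproduced.
    parser = argparse.ArgumentParser()
    parser.add_argument("--rec_image_shape", type=str, default="3, 32, 320")
    parser.add_argument("--max_text_length", type=int, default=25)
    parser.add_argument("--rec_char_dict_path", type=str, default="/home/aistudio/dict.txt")
    parser.add_argument("--use_space_char", type=str2bool, default=False)
    return parser


def parse_shape(text):
    # "3, 32, 320" -> [3, 32, 320]
    return [int(v) for v in text.split(",")]


# Values copied by hand from the training config shown above.
CONFIG = {
    "image_shape": [3, 32, 320],
    "max_text_length": 25,
    "character_dict_path": "/home/aistudio/dict.txt",
    "use_space_char": False,
}


def check_alignment(config, args):
    """Return the names of settings that differ between the config and the CLI args."""
    cli = {
        "image_shape": parse_shape(args.rec_image_shape),
        "max_text_length": args.max_text_length,
        "character_dict_path": args.rec_char_dict_path,
        "use_space_char": args.use_space_char,
    }
    return [key for key, value in config.items() if cli[key] != value]


if __name__ == "__main__":
    args = build_parser().parse_args([])
    print(check_alignment(CONFIG, args))
```

With the values posted in this issue the returned list is empty, so these four static settings already agree between the two code paths.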

WenmuZhou commented 2 years ago

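During inference the images are all resized to 320, whereas during predict they are not resized to a fixed length. You can look at the implementation for the details. We also answered this in an earlier issue: https://github.com/PaddlePaddle/PaddleOCR/issues/270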
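
The difference described here can be made concrete with a small width calculation. This is a minimal sketch, not PaddleOCR's actual code: the function names are hypothetical, the 120x40 crop size is invented for illustration, and it assumes the variable-width path derives the padded width from the largest width/height ratio in the batch, in the spirit of the imgW = int(32 * max_wh_ratio) line discussed in this thread.

```python
import math


def fixed_canvas_width(image_shape):
    """Fixed path: every image is padded to the configured width (e.g. 320)."""
    _, _, img_w = image_shape
    return img_w


def dynamic_canvas_width(image_shape, max_wh_ratio):
    """Variable path (assumed): padded width follows the widest width/height ratio."""
    _, img_h, _ = image_shape
    return int(img_h * max_wh_ratio)


def resized_text_width(width, height, image_shape, canvas_w):
    """Width after scaling the image to the target height, capped at the canvas width."""
    _, img_h, _ = image_shape
    return min(canvas_w, math.ceil(img_h * width / height))


if __name__ == "__main__":
    shape = [3, 32, 320]  # image_shape from the config in this issue
    w, h = 120, 40        # hypothetical crop size, not taken from the issue
    for name, canvas in (
        ("fixed", fixed_canvas_width(shape)),
        ("dynamic", dynamic_canvas_width(shape, w / h)),
    ):
        text_w = resized_text_width(w, h, shape, canvas)
        print(name, "canvas:", canvas, "text:", text_w, "padding:", canvas - text_w)
```

For this hypothetical crop the fixed path produces a 320-pixel-wide input with 224 columns of padding, while the variable path produces a 96-pixel-wide input with none. The same weights therefore see different tensors, which is one plausible reason the decoded strings differ.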

GivanTsai commented 2 years ago

Doesn't the config used for predict also resize to 320?

  - RecResizeImg:
      image_shape: [3, 32, 320]

Hadesong commented 2 years ago

Solved by commenting out the line imgW = int(32 * max_wh_ratio) (line 110 of predict.py).

Sweetwenty commented 2 years ago

After inference, do you get two files, params and model? The official documentation says you get three files with suffixes such as .pdparams.

Hadesong commented 2 years ago

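The .pdparams file is what you get after training. After inference it becomes inference.pdiparams, and if you want to deploy it later it has to be converted once more.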
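
To tell the two kinds of files apart in a directory listing, a small helper can label them by suffix. This is a sketch under stated assumptions: describe_model_files is a hypothetical name, and the mapping covers only the two suffixes named in this thread, so anything else is reported as unknown rather than guessed.

```python
from pathlib import Path

# Only the suffixes mentioned in this thread are mapped.
SUFFIX_ROLES = {
    ".pdparams": "training weights",
    ".pdiparams": "inference weights",
}


def describe_model_files(filenames):
    """Map each file name to a role based on its suffix."""
    return {
        name: SUFFIX_ROLES.get(Path(name).suffix, "unknown")
        for name in filenames
    }


if __name__ == "__main__":
    names = ["best_accuracy.pdparams", "inference.pdiparams"]
    for name, role in describe_model_files(names).items():
        print(f"{name}: {role}")
```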