PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
42.74k stars, 7.68k forks

Recognition inference model: C++ and Python results are inconsistent #3213

Closed shisenxd closed 2 years ago

shisenxd commented 3 years ago

The yml config file is as follows:

# rec_en_number_lite_train.yml
Global:
  use_gpu: True
  epoch_num: 3000
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_en_number_lite
  save_epoch_step: 10
  # evaluation is run every 100 iterations after the 4000th iteration
  eval_batch_step: [0, 100]
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  cal_metric_during_train: True
  pretrained_model: ./pretrain_models/rec_mv3_none_bilstm_ctc_v2.0_train
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img:
  # for data or label process
  character_dict_path: ppocr/utils/en_dict.txt
  character_type: ch
  max_text_length: 100
  infer_mode: False
  use_space_char: False

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00001

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: small
    small_stride: [1, 2, 2, 2]
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 48
  Head:
    name: CTCHead
    fc_decay: 0.00001

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    label_file_list: ["./train_data/train.txt"]
    transforms:

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    label_file_list: ["./train_data/test.txt"]
    transforms:

There are 20 training samples and 8 test samples.

Recognition training

python tools/train.py -c configs/rec/rec_en_number_lite_train.yml

The loss stabilizes at around 0.006 and acc is 1.00.

Prediction results

python tools/infer_rec.py -c configs/rec/rec_en_number_lite_train.yml -o Global.pretrained_model=./output/rec_en_number_lite/best_accuracy Global.load_static_weights=false Global.infer_img=train_data/test_images

The trained model's results are correct.

Converting the trained model to an inference model

python tools/export_model.py -c configs/rec/rec_en_number_lite_train.yml -o Global.checkpoints=./output/rec_en_number_lite/best_accuracy Global.save_inference_dir=./inference/rec_en_number_lite

Inference model test

python tools/infer/predict_rec.py --image_dir="./train_data/test_images/00200.png" --rec_model_dir="./inference/rec_en_number_lite/" --rec_char_dict_path="./ppocr/utils/en_dict.txt"

The inference model's results are correct, but only after I commented out the following two lines:

# if self.character_type == "ch":
#     imgW = int((32 * max_wh_ratio))

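After building the C++ code, I ran ocr_system.exe config.txt test_images. The confidence is around 0.98, but the second half of the characters is missing. If I leave the two lines above uncommented, the Python inference model also returns incomplete text, much like the C++ result. However, from reading the C++ code, it does not appear to perform this scaling.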
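To make the effect of those two lines concrete, here is a minimal sketch of how the recognition input width could be computed with and without them. It is not the PaddleOCR implementation: the function name `target_width`, the `dynamic_width` switch, and the default shape `(3, 32, 320)` are assumptions chosen for illustration, and the network height is used where the quoted line hard-codes 32.

```python
import math


def target_width(img_h, img_w, rec_image_shape=(3, 32, 320), dynamic_width=True):
    """Return the width a text crop would be resized to before recognition.

    Hypothetical helper: mirrors the idea behind the two quoted lines,
    not the actual PaddleOCR code.
    """
    _, net_h, net_w = rec_image_shape
    wh_ratio = img_w / float(img_h)
    if dynamic_width:
        # With the two lines active, the canvas grows with the aspect ratio
        # instead of staying at the fixed width from rec_image_shape.
        max_wh_ratio = max(net_w / float(net_h), wh_ratio)
        net_w = int(net_h * max_wh_ratio)
    # Keep the aspect ratio unless that would exceed the canvas width,
    # in which case the crop is squeezed horizontally to fit.
    return min(net_w, int(math.ceil(net_h * wh_ratio)))


# A 32x640 crop: squeezed to 320 with the fixed canvas, kept at 640 otherwise.
fixed = target_width(32, 640, dynamic_width=False)
dynamic = target_width(32, 640, dynamic_width=True)
```

Under these assumptions, a long text line reaches the network at a different horizontal scale depending on whether the two lines are active, so a model trained at one scale may see noticeably different inputs at the other.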

I'm not sure where the problem is.

C++ config.txt configuration file

# rec config
rec_model_dir ./inference/rec_en_number_lite/
char_list_file ./inference/utils/en_dict.txt

Everything else is left at the defaults.

for (int k = 0; k < boxes.size(); k++)
{
    std::vector<std::vector<int>> box = boxes[k];

    int x_collect[4] = { box[0][0], box[1][0], box[2][0], box[3][0] };
    int y_collect[4] = { box[0][1], box[1][1], box[2][1], box[3][1] };
    int left = int(*std::min_element(x_collect, x_collect + 4));
    int right = int(*std::max_element(x_collect, x_collect + 4));
    int top = int(*std::min_element(y_collect, y_collect + 4));
    int bottom = int(*std::max_element(y_collect, y_collect + 4));

    cv::Rect rect(left, top, right - left, bottom - top);
    cv::Mat dst = srcimg.clone();
    cv::rectangle(dst, rect, cv::Scalar(0), 2);
    cv::imwrite("dst.bmp", dst);
}

The detected coordinates are also correct.

I'm using the latest version of the dygraph branch from GitHub.

How should I troubleshoot this? I looked at other issues but didn't fully understand them.

LDOUBLEV commented 3 years ago

On the Python side you commented out: [image]

Disable it on the C++ side as well: https://github.com/PaddlePaddle/PaddleOCR/blob/e174e9eddf8f5ef359dbf27b94cedd6789aad612/deploy/cpp_infer/src/preprocess_op.cpp#L94

LDOUBLEV commented 3 years ago

Also, since you switched to the English dictionary, set this to en: https://github.com/PaddlePaddle/PaddleOCR/blob/e174e9eddf8f5ef359dbf27b94cedd6789aad612/tools/infer/utility.py#L72
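As a rough illustration of that suggestion, the sketch below assumes the linked line defines a `--rec_char_type` command-line argument that defaults to `ch`. That argument name and both defaults are assumptions here, not confirmed by this thread, so check them against the linked file before relying on them.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the argument parser in tools/infer/utility.py."""
    parser = argparse.ArgumentParser()
    # Assumed to mirror the linked line: the character set type, Chinese by default.
    parser.add_argument("--rec_char_type", type=str, default="ch")
    # Assumed default dictionary path, for illustration only.
    parser.add_argument("--rec_char_dict_path", type=str,
                        default="./ppocr/utils/ppocr_keys_v1.txt")
    return parser


# Overriding the default so the character type matches the English dictionary.
args = build_parser().parse_args([
    "--rec_char_type", "en",
    "--rec_char_dict_path", "./ppocr/utils/en_dict.txt",
])
```

If the argument does exist under that name, passing `--rec_char_type="en"` to the predict_rec.py command above would have the same effect as editing the default in place.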

paddle-bot-old[bot] commented 2 years ago

Since you haven't replied for more than 3 months, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.