PaddlePaddle / Paddle

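PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle 『飞桨』: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)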
http://www.paddlepaddle.org/
Apache License 2.0
21.79k stars 5.47k forks

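PaddleOCR: a text recognition model trained in a Python-based environment does not work once its model files are added to the C++ source deployment; an exception related to Paddle_inference is reported #61957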

Open zhaocanglong opened 4 months ago

zhaocanglong commented 4 months ago

Please ask your question

ch_PP-OCRv3_rec_infer.zip Screenshot 2024-02-22 100245

`// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <gflags/gflags.h>

// common args
DEFINE_bool(use_gpu, false, "Inferring with GPU or CPU.");
DEFINE_bool(use_tensorrt, false, "Whether use tensorrt.");
DEFINE_int32(gpu_id, 0, "Device id of GPU to execute.");
DEFINE_int32(gpu_mem, 4000, "GPU memory when inferring with GPU.");
DEFINE_int32(cpu_threads, 10, "Num of threads with CPU.");
DEFINE_bool(enable_mkldnn, true, "Whether use mkldnn with CPU.");
DEFINE_string(precision, "fp32", "Precision be one of fp32/fp16/int8");
DEFINE_bool(benchmark, false, "Whether use benchmark.");
DEFINE_string(output, "./output/", "Save benchmark log path.");
DEFINE_string(image_dir, "./11.jpg", "Dir of input image.");
DEFINE_string(
    type, "ocr",
    "Perform ocr or structure, the value is selected in ['ocr','structure'].");
// detection related
DEFINE_string(det_model_dir, "./models/ch_PP-OCRv3_det_infer",
              "Path of det inference model.");
DEFINE_string(limit_type, "max", "limit_type of input image.");
DEFINE_int32(limit_side_len, 960, "limit_side_len of input image.");
DEFINE_double(det_db_thresh, 0.3, "Threshold of det_db_thresh.");
DEFINE_double(det_db_box_thresh, 0.6, "Threshold of det_db_box_thresh.");
DEFINE_double(det_db_unclip_ratio, 1.5, "Threshold of det_db_unclip_ratio.");
DEFINE_bool(use_dilation, false, "Whether use the dilation on output map.");
DEFINE_string(det_db_score_mode, "slow", "Whether use polygon score.");
DEFINE_bool(visualize, true, "Whether show the detection results.");
// classification related
DEFINE_bool(use_angle_cls, false, "Whether use use_angle_cls.");
DEFINE_string(cls_model_dir, "", "Path of cls inference model.");
DEFINE_double(cls_thresh, 0.9, "Threshold of cls_thresh.");
DEFINE_int32(cls_batch_num, 1, "cls_batch_num.");
// recognition related
DEFINE_string(rec_model_dir, "./models/ch_PP-OCRv3_rec_infer",
              "Path of rec inference model.");
DEFINE_int32(rec_batch_num, 6, "rec_batch_num.");
DEFINE_string(rec_char_dict_path, "./ppocr_keys_v1.txt",
              "Path of dictionary.");
DEFINE_int32(rec_img_h, 32, "rec image height");
DEFINE_int32(rec_img_w, 320, "rec image width");

// layout model related
DEFINE_string(layout_model_dir, "", "Path of table layout inference model.");
DEFINE_string(layout_dict_path, "./layout_table_dict.txt",
              "Path of dictionary.");
DEFINE_double(layout_score_threshold, 0.5, "Threshold of score.");
DEFINE_double(layout_nms_threshold, 0.5, "Threshold of nms.");
// structure model related
DEFINE_string(table_model_dir, "", "Path of table structure inference model.");
DEFINE_int32(table_max_len, 488, "max len size of input image.");
DEFINE_int32(table_batch_num, 1, "table_batch_num.");
DEFINE_bool(merge_no_span_structure, true,
            "Whether merge <td> and </td> to <td></td>");
DEFINE_string(table_char_dict_path, "./table_structure_dict_ch.txt",
              "Path of dictionary.");

// ocr forward related
DEFINE_bool(det, true, "Whether use det in forward.");
DEFINE_bool(rec, true, "Whether use rec in forward.");
DEFINE_bool(cls, false, "Whether use cls in forward.");
DEFINE_bool(table, false, "Whether use table structure in forward.");
DEFINE_bool(layout, false, "Whether use layout analysis in forward.");

// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
#include <vector>

#include <include/args.h>
#include <include/paddleocr.h>
#include <include/paddlestructure.h>

using namespace PaddleOCR;

// Validate that every enabled pipeline stage has the paths it needs.
void check_params() {
  if (FLAGS_det) {
    if (FLAGS_det_model_dir.empty() || FLAGS_image_dir.empty()) {
      std::cout << "Usage[det]: ./ppocr "
                   "--det_model_dir=/PATH/TO/DET_INFERENCE_MODEL/ "
                << "--image_dir=/PATH/TO/INPUT/IMAGE/" << std::endl;
      exit(1);
    }
  }
  if (FLAGS_rec) {
    std::cout
        << "In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320',"
           "if you are using recognition model with PP-OCRv2 or an older "
           "version, "
           "please set --rec_image_shape='3,32,320'"
        << std::endl;
    if (FLAGS_rec_model_dir.empty() || FLAGS_image_dir.empty()) {
      std::cout << "Usage[rec]: ./ppocr "
                   "--rec_model_dir=/PATH/TO/REC_INFERENCE_MODEL/ "
                << "--image_dir=/PATH/TO/INPUT/IMAGE/" << std::endl;
      exit(1);
    }
  }
  if (FLAGS_cls && FLAGS_use_angle_cls) {
    if (FLAGS_cls_model_dir.empty() || FLAGS_image_dir.empty()) {
      std::cout << "Usage[cls]: ./ppocr "
                << "--cls_model_dir=/PATH/TO/REC_INFERENCE_MODEL/ "
                << "--image_dir=/PATH/TO/INPUT/IMAGE/" << std::endl;
      exit(1);
    }
  }
  if (FLAGS_table) {
    if (FLAGS_table_model_dir.empty() || FLAGS_det_model_dir.empty() ||
        FLAGS_rec_model_dir.empty() || FLAGS_image_dir.empty()) {
      std::cout << "Usage[table]: ./ppocr "
                << "--det_model_dir=/PATH/TO/DET_INFERENCE_MODEL/ "
                << "--rec_model_dir=/PATH/TO/REC_INFERENCE_MODEL/ "
                << "--table_model_dir=/PATH/TO/TABLE_INFERENCE_MODEL/ "
                << "--image_dir=/PATH/TO/INPUT/IMAGE/" << std::endl;
      exit(1);
    }
  }
  if (FLAGS_layout) {
    if (FLAGS_layout_model_dir.empty() || FLAGS_image_dir.empty()) {
      std::cout << "Usage[layout]: ./ppocr "
                << "--layout_model_dir=/PATH/TO/LAYOUT_INFERENCE_MODEL/ "
                << "--image_dir=/PATH/TO/INPUT/IMAGE/" << std::endl;
      exit(1);
    }
  }
  if (FLAGS_precision != "fp32" && FLAGS_precision != "fp16" &&
      FLAGS_precision != "int8") {
    std::cout << "precision should be 'fp32'(default), 'fp16' or 'int8'. "
              << std::endl;
    exit(1);
  }
}

// Run OCR on every readable image and return the accumulated result text.
std::string ocr(std::vector<cv::String> &cv_all_img_names) {
  PPOCR ocr;

  if (FLAGS_benchmark) {
    ocr.reset_timer();
  }

  std::vector<cv::Mat> img_list;
  std::vector<cv::String> img_names;
  for (int i = 0; i < cv_all_img_names.size(); ++i) {
    cv::Mat img = cv::imread(cv_all_img_names[i], cv::IMREAD_COLOR);
    if (!img.data) {
      std::cerr << "[ERROR] image read failed! image path: "
                << cv_all_img_names[i] << std::endl;
      continue;
    }
    img_list.push_back(img);
    img_names.push_back(cv_all_img_names[i]);
  }

  std::vector<std::vector<OCRPredictResult>> ocr_results =
      ocr.ocr(img_list, FLAGS_det, FLAGS_rec, FLAGS_cls);
  std::string res_str = "";
  for (int i = 0; i < img_names.size(); ++i) {
    std::cout << "predict img: " << cv_all_img_names[i] << std::endl;

    Utility::print_result(ocr_results[i], res_str);
    // Utility::print_result(ocr_results[i]);
    if (FLAGS_visualize && FLAGS_det) {
      std::string file_name = Utility::basename(img_names[i]);
      cv::Mat srcimg = img_list[i];
      Utility::VisualizeBboxes(srcimg, ocr_results[i],
                               FLAGS_output + "/" + file_name);
    }
  }
  if (FLAGS_benchmark) {
    ocr.benchmark_log(cv_all_img_names.size());
  }
  return res_str;
}

// Run layout/table structure analysis on every readable image.
void structure(std::vector<cv::String> &cv_all_img_names) {
  PaddleOCR::PaddleStructure engine;

  if (FLAGS_benchmark) {
    engine.reset_timer();
  }

  for (int i = 0; i < cv_all_img_names.size(); i++) {
    std::cout << "predict img: " << cv_all_img_names[i] << std::endl;
    cv::Mat img = cv::imread(cv_all_img_names[i], cv::IMREAD_COLOR);
    if (!img.data) {
      std::cerr << "[ERROR] image read failed! image path: "
                << cv_all_img_names[i] << std::endl;
      continue;
    }

    std::vector<StructurePredictResult> structure_results = engine.structure(
        img, FLAGS_layout, FLAGS_table, FLAGS_det && FLAGS_rec);

    for (int j = 0; j < structure_results.size(); j++) {
      std::cout << j << "\ttype: " << structure_results[j].type
                << ", region: [";
      std::cout << structure_results[j].box[0] << ","
                << structure_results[j].box[1] << ","
                << structure_results[j].box[2] << ","
                << structure_results[j].box[3] << "], score: ";
      std::cout << structure_results[j].confidence << ", res: ";

      if (structure_results[j].type == "table") {
        std::cout << structure_results[j].html << std::endl;
        if (structure_results[j].cell_box.size() > 0 && FLAGS_visualize) {
          std::string file_name = Utility::basename(cv_all_img_names[i]);

          Utility::VisualizeBboxes(img, structure_results[j],
                                   FLAGS_output + "/" + std::to_string(j) +
                                       "_" + file_name);
        }
      } else {
        std::cout << "count of ocr result is : "
                  << structure_results[j].text_res.size() << std::endl;
        if (structure_results[j].text_res.size() > 0) {
          std::cout << "********** print ocr result "
                    << "**********" << std::endl;
          Utility::print_result(structure_results[j].text_res);
          std::cout << "********** end print ocr result "
                    << "**********" << std::endl;
        }
      }
    }
  }
  if (FLAGS_benchmark) {
    engine.benchmark_log(cv_all_img_names.size());
  }
}

int main(int argc, char **argv) {
  // Parsing command-line
  /* google::ParseCommandLineFlags(&argc, &argv, true);*/
  check_params();

  if (!Utility::PathExists(FLAGS_image_dir)) {
    std::cerr << "[ERROR] image path not exist! image_dir: " << FLAGS_image_dir
              << std::endl;
    exit(1);
  }

  std::vector<cv::String> cv_all_img_names;
  cv::glob(FLAGS_image_dir, cv_all_img_names);
  std::cout << "total images num: " << cv_all_img_names.size() << std::endl;

  if (!Utility::PathExists(FLAGS_output)) {
    Utility::CreateDir(FLAGS_output);
  }
  static std::string res = "";
  if (FLAGS_type == "ocr") {
    res = ocr(cv_all_img_names);
  } else if (FLAGS_type == "structure") {
    structure(cv_all_img_names);
  } else {
    std::cout << "only value in ['ocr','structure'] is supported" << std::endl;
  }
  std::cout << res << std::endl;
}`

tink2123 commented 4 months ago

Does inference work correctly when you deploy with Python?

zhaocanglong commented 4 months ago

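Yes, it works with Python. I trained the model following the training procedure provided in PaddleOCR 2.6, and it worked when I validated it. That was on the Windows GPU version. The model is actually used through the C++ source deployment, on the Windows CPU version.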

raoyutian commented 4 months ago

I tested this model: C++ inference works correctly on Windows CPU with paddle_inference v2.6. My PaddleOCR build is modified from the official C++ source.

You can try testing with my modified version for reference: https://gitee.com/raoyutian/paddle-ocrsharp

zhaocanglong commented 4 months ago

In terms of what has to change, do I need to modify the source code, or only the configuration parameters?

raoyutian commented 4 months ago

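Judging from the error screenshot, symbol files are missing. Symbol files are generally only needed in debug mode. Is debug mode enabled somewhere? paddle_inference is generally built in release mode and has no symbol files. Check that all of your build options are set to release, and also try running the compiled exe directly instead of debugging it from VS.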
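One way to act on the "is everything built as release?" check is to have the program report its own build configuration. The following is a minimal sketch, not part of PaddleOCR or paddle_inference: the names `kNdebugDefined`, `kMsvcDebugRuntime`, `build_configuration`, and `asserts_enabled` are hypothetical helpers introduced here. It relies only on the standard `NDEBUG` macro and on `_DEBUG`, which MSVC defines when a debug runtime (`/MDd` or `/MTd`) is selected. It describes only the translation unit it is compiled into, so it cannot say how the prebuilt paddle_inference library itself was built.

```cpp
#include <cassert>
#include <string>

// True when the standard NDEBUG macro is defined (typical for release builds).
constexpr bool kNdebugDefined =
#ifdef NDEBUG
    true;
#else
    false;
#endif

// True when MSVC's _DEBUG macro is defined (debug runtime, /MDd or /MTd).
constexpr bool kMsvcDebugRuntime =
#ifdef _DEBUG
    true;
#else
    false;
#endif

// Hypothetical helper: short label for how this translation unit was built.
inline std::string build_configuration() {
  if (kMsvcDebugRuntime) return "debug (_DEBUG defined)";
  if (kNdebugDefined) return "release (NDEBUG defined)";
  return "unspecified (neither _DEBUG nor NDEBUG defined)";
}

// Hypothetical helper: true when assert() is active, i.e. NDEBUG is not set.
inline bool asserts_enabled() {
  bool evaluated = false;
  assert((evaluated = true));  // the whole expression is dropped under NDEBUG
  return evaluated;
}
```

Printing `build_configuration()` at the top of `main` (for example `std::cout << build_configuration() << std::endl;`) would show whether the executable linking against paddle_inference was itself compiled in a debug configuration. A "debug" label there would be consistent with the mismatch described above, though it does not prove that this is the cause of the exception.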