PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded, and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

Image layout detection: the leading part of a line of text is lost #11869

Closed (xuwinrar closed this issue 7 months ago)

xuwinrar commented 7 months ago

The recognition result is shown in the image below (the first half of the text inside the thick red box is lost):

1712114916588

The original image is as follows: 1712115073155

Versions used: paddleocr 2.7.0.3, paddlepaddle 2.5.1

Model configuration: default values for PP-StructureV2

    'layout': {
        'en': {
            'url': 'https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar',
            'dict_path': 'ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt'
        },
        'ch': {
            'url': 'https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar',
            'dict_path': 'ppocr/utils/dict/layout_dict/layout_cdla_dict.txt'
        }
    }

Parameter settings:

    table_engine = PPStructure(
        recovery=True,
        show_log=True,
        image_orientation=False,
        ocr_version='PP-OCRv4',
        det_db_thresh=0.1,
        det_db_box_thresh=0.4,
        det_db_unclip_ratio=2,
        det_db_score_mode='slow',
        det_limit_side_len=3330,
        use_dilation=True,
        use_mkldnn=True,
    )
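For anyone trying to reproduce this, here is a minimal sketch of how the settings above could be wrapped into a script. The helper names `run_layout` and `summarize` are hypothetical and not part of PaddleOCR. The keyword arguments are copied from the settings above. Calling the engine on an image loaded with `cv2.imread`, and the `type` / `bbox` keys on each returned region, follow the PP-Structure quickstart usage as I understand it for paddleocr 2.7.x, so treat them as assumptions.

```python
# Keyword arguments copied from the settings reported in this issue.
PPSTRUCTURE_KWARGS = {
    "recovery": True,
    "show_log": True,
    "image_orientation": False,
    "ocr_version": "PP-OCRv4",
    "det_db_thresh": 0.1,
    "det_db_box_thresh": 0.4,
    "det_db_unclip_ratio": 2,
    "det_db_score_mode": "slow",
    "det_limit_side_len": 3330,
    "use_dilation": True,
    "use_mkldnn": True,
}


def run_layout(image_path, **overrides):
    """Hypothetical helper: run PP-Structure on one image and return its regions."""
    # Imported lazily so the settings can be inspected without paddleocr installed.
    import cv2
    from paddleocr import PPStructure

    engine = PPStructure(**{**PPSTRUCTURE_KWARGS, **overrides})
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    return engine(img)


def summarize(regions):
    """Return (type, bbox) pairs, assuming each region is a dict with those keys."""
    return [(region.get("type"), region.get("bbox")) for region in regions]
```

Printing `summarize(run_layout("page.png"))` (with `page.png` standing in for the original image) would show which layout box the red-framed line falls into, and `overrides` makes it easy to retry with a single parameter changed.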

I have tried adjusting the parameters, but none of the changes helped. How can this problem be solved? Thanks!
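One way to narrow this down is to check whether the layout box is narrower than the text line it contains. This is only a guess at the cause, which the thread does not confirm. The sketch below assumes two inputs obtained separately: a layout region box, and text-line boxes from a full-page detection pass, both as `(x1, y1, x2, y2)` in page pixels. The function `clipped_lines` and its `tol` parameter are hypothetical and not part of PaddleOCR.

```python
def clipped_lines(region_bbox, line_boxes, tol=2):
    """Return line boxes that overlap the region but stick out on the left or right.

    All boxes are (x1, y1, x2, y2) in page pixel coordinates.
    `tol` is the number of pixels a line may exceed the region before it counts.
    """
    rx1, ry1, rx2, ry2 = region_bbox
    clipped = []
    for box in line_boxes:
        x1, y1, x2, y2 = box
        # Skip lines that do not intersect the region at all.
        overlap_w = min(x2, rx2) - max(x1, rx1)
        overlap_h = min(y2, ry2) - max(y1, ry1)
        if overlap_w <= 0 or overlap_h <= 0:
            continue
        # A line that starts before or ends after the region would be cut
        # if recognition only sees the cropped region.
        if x1 < rx1 - tol or x2 > rx2 + tol:
            clipped.append(box)
    return clipped


if __name__ == "__main__":
    region = (100, 50, 500, 300)
    lines = [(120, 60, 480, 80), (40, 100, 480, 120), (600, 100, 700, 120)]
    print(clipped_lines(region, lines))
```

If the red-framed line shows up in the returned list, the layout box is probably cutting it off before recognition, which would explain why changing the `det_db_*` parameters makes no difference.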

xuwinrar commented 7 months ago

@Sunting78, I need your help with this. Please respond, thanks!

RussellLuo commented 7 months ago

@xuwinrar There is a fix PR for this: #11916. You can give it a test.