opendatalab / PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction
https://pdf-extract-kit.readthedocs.io/zh-cn/latest/index.html
GNU Affero General Public License v3.0
5.27k stars 357 forks

Problem with PDF recognition results in the online demo #113

Closed: X17exe closed this issue 1 month ago

X17exe commented 1 month ago

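Attached are a sample PDF and a screenshot of the recognition result. Problem: regions detected as images take up too large a share of the recognition result, and the content inside them has no OCR output.

Attachments: image (screenshot), 889884检测报告.pdf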