A one-stop, open-source, high-quality data extraction tool that supports extraction from PDFs, webpages, and e-books in multiple formats.
GNU Affero General Public License v3.0
16.11k stars, 1.16k forks
magic_pdf.user_api:parse_pdf:97 - string index out of range #972
Open
yibie opened 9 hours ago
Description of the bug
While testing MinerU by converting a PDF, the following error occurred:
How to reproduce the bug
Command: magic-pdf -p /Volumes/Collect/archives/Indexing\ The\ Manual\ of\ Good\ Practice\ 2013.pdf -o ~/Documents/temp_convert/ -m auto
Operating system
macOS
Python version
3.10
Software version (magic-pdf --version)
0.9.x
Device mode
cpu