Closed CrabTY closed 1 month ago
@CrabTY PDF content extraction is faithful to what is on the page, so the line numbers are extracted along with the text; in this case, you will need to strip the line numbers yourself.
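For anyone handling the line numbers themselves, here is a minimal post-processing sketch. It is not part of magic-pdf: the function name `strip_margin_line_numbers` and the `max_gap` parameter are hypothetical, and it assumes the margin numbers show up in the extracted text either alone on a line or fused to the start of a text line, counting upward from 1 as in the NeurIPS template.

```python
import re

# An integer alone on a line, or an integer followed by whitespace and text.
# "3.1 Section" does not match, because "." is neither whitespace nor end of line.
_LEADING_NUM = re.compile(r"^\s*(\d{1,4})(?:\s+(.*))?$")


def strip_margin_line_numbers(text: str, max_gap: int = 3) -> str:
    """Drop leading integers that continue an increasing run (heuristic).

    Hypothetical helper for illustration only. A leading number is treated
    as a margin line number when it is greater than the last accepted number
    and at most `max_gap` above it; everything else is left untouched.
    """
    out = []
    last = 0  # last margin number accepted so far
    for line in text.splitlines():
        m = _LEADING_NUM.match(line)
        if m:
            n = int(m.group(1))
            if last < n <= last + max_gap:
                last = n
                rest = m.group(2)
                if rest:
                    # Number was fused to the start of a text line: keep the text.
                    out.append(rest)
                # A number alone on its line is dropped entirely.
                continue
        out.append(line)
    return "\n".join(out)
```

Calling `strip_margin_line_numbers(extracted_text)` on the extracted output returns the text with the running numbers removed. Requiring each number to continue the sequence is what keeps body text such as "2023 was a year" intact, but the heuristic can still misfire on numbered list items or headings that happen to continue the run, so the result is worth spot-checking.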
Description of the bug
The PDF has line numbers in its left margin, and in the processed output these line numbers are mixed into the body text. I would like the line numbers to be removed.
How to reproduce the bug
Use a PDF with line numbers, such as https://media.neurips.cc/Conferences/NeurIPS2023/Styles/neurips_2023.pdf
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.6.x
Device mode
cuda