ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.
https://pdf2docx.readthedocs.io
GNU Affero General Public License v3.0
2.55k stars 373 forks

pdf2docx 0.5.8: after converting the attached "深入浅出强化学习01.pdf" to docx, the first sentence of each paragraph is moved to the end of the paragraph #273

Closed ericshenjs closed 1 month ago

ericshenjs commented 7 months ago

Making the following change in the sort_in_line_order function of common\Collection.py fixes the problem:

Original:

if not self.is_vertical_text:

Modified:

if self.is_vertical_text:
ericshenjs commented 7 months ago

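Attachments: layout_修改前.json (before the change), layout_修改后.json (after the change).

With the original source, "if not self.is_vertical_text", debugging produced layout_修改前.json. After changing the source to "if self.is_vertical_text", debugging produced layout_修改后.json.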
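For readers trying to see how a single flipped condition could move the first sentence to the end, here is a toy sketch. It is not pdf2docx's code: the `Box` class, the `sort_boxes` function, the `negate_check` flag, and both sort keys are hypothetical, chosen only because they reproduce the symptom. The branch bodies of the real sort_in_line_order are not shown in this thread. The assumption is that one branch sorts left-to-right first, so an indented first line (larger x0) ends up after the unindented lines.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for a text line with a bounding-box origin."""
    text: str
    x0: float  # left edge
    y0: float  # top edge (toy model: y grows downward)


def sort_boxes(boxes, is_vertical_text, negate_check):
    """Toy direction-aware sort (an assumption, not the library's logic).

    negate_check=True  mimics `if not is_vertical_text:`
    negate_check=False mimics `if is_vertical_text:`
    """
    condition = (not is_vertical_text) if negate_check else is_vertical_text
    if condition:
        # Left-to-right first: reasonable inside a single row,
        # but wrong for stacked lines of a paragraph.
        return sorted(boxes, key=lambda b: (b.x0, b.y0))
    # Top-to-bottom first: keeps stacked horizontal lines in reading order.
    return sorted(boxes, key=lambda b: (b.y0, b.x0))


# A horizontal paragraph whose first line is indented.
paragraph = [
    Box("first line (indented)", x0=92, y0=100),
    Box("second line", x0=72, y0=115),
    Box("third line", x0=72, y0=130),
]

before = [b.text for b in sort_boxes(paragraph, is_vertical_text=False, negate_check=True)]
after = [b.text for b in sort_boxes(paragraph, is_vertical_text=False, negate_check=False)]

print(before)
print(after)
```

In this toy model the `negate_check=True` variant puts the indented first line last, and the flipped condition keeps the lines in top-to-bottom order. Whether this is the actual mechanism in 0.5.8 would need to be checked against the two layout JSON files above.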