opendatalab / PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction
https://pdf-extract-kit.readthedocs.io/zh-cn/latest/index.html
GNU Affero General Public License v3.0
5.27k stars 356 forks

How to output text in a human-readable order #87

Closed SidneyRey closed 2 months ago

SidneyRey commented 2 months ago

Is there a model or rule that can assemble the output results into continuous text in human-readable order?

myhloli commented 2 months ago

Refer to this project: https://github.com/opendatalab/MinerU
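
For readers who only need a simple rule rather than a full pipeline, the following is a minimal sketch of a rule-based reading order for a single-column page. It is not taken from PDF-Extract-Kit or MinerU: the `Block` type, the function names, and the `line_tol` parameter are all hypothetical, and the sketch assumes each detected text block comes with a bounding box whose origin is the top-left corner of the page (y grows downward). Multi-column or complex layouts would likely need real layout analysis instead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical text block: text plus a bounding box (top-left origin)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_lines(blocks: List[Block], line_tol: float = 5.0) -> List[List[Block]]:
    """Group blocks into lines, top to bottom, left to right within a line.

    Two blocks share a line when the top edge of the later block is within
    `line_tol` units of the top edge of the first block in that line.
    """
    lines: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: (b.y0, b.x0)):
        if lines and abs(block.y0 - lines[-1][0].y0) <= line_tol:
            lines[-1].append(block)
        else:
            lines.append([block])
    # Order each line left to right.
    return [sorted(line, key=lambda b: b.x0) for line in lines]


def to_text(blocks: List[Block], line_tol: float = 5.0) -> str:
    """Join blocks into continuous text: spaces within a line, newlines between."""
    return "\n".join(
        " ".join(b.text for b in line) for line in group_lines(blocks, line_tol)
    )
```

The tolerance is the main design choice here: too small and slightly misaligned blocks on the same line are split apart, too large and adjacent lines are merged. A sensible starting value would be a fraction of the typical text height.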