adithya-s-k / omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
https://docs.cognitivelab.in
GNU General Public License v3.0
5.13k stars 430 forks

Extract the content of the table in the PDF #47

Closed xgmeng closed 2 months ago

xgmeng commented 3 months ago

Tables in the PDF are not recognized correctly, which can produce incorrect rows and columns in the output.

【Raw data】

image

【Result】

image

adithya-s-k commented 2 months ago

Working on improving the underlying table detection algorithms.