ChrizH / pdfstructure

`pdfstructure` detects, splits, and organizes a document's text content into its natural structure, as envisioned by the author.

pdfminer.six alternative? #7

Open IdeaSense opened 2 years ago

IdeaSense commented 2 years ago

Would this work as an alternative? https://huggingface.co/spaces/deepdoctection/deepdoctection

P.S. Someone please create a Colab notebook or Gradio demo so this project can be used by a wider audience.

stefan6419846 commented 8 months ago

I would argue that these are simply different approaches to the same problem: heuristics, as in this project, or ML, as you propose.
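
To make the "heuristics" side of that comparison concrete, here is a minimal sketch of one common rule of thumb: treat a line as a heading when its font is noticeably larger than the body font. This is not pdfstructure's actual code or API. The `Line` class, the `body_font_size` and `label_headings` helpers, and the `ratio=1.15` threshold are all hypothetical, and the font sizes are assumed to have been extracted already by a PDF parser.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical container for one extracted text line."""
    text: str
    font_size: float  # assumed to be provided by a PDF parser


def body_font_size(lines):
    """Return the font size covering the most characters (assumed body text)."""
    weights = Counter()
    for line in lines:
        weights[line.font_size] += len(line.text)
    return weights.most_common(1)[0][0]


def label_headings(lines, ratio=1.15):
    """Pair each line's text with True if its font exceeds the body font by `ratio`."""
    body = body_font_size(lines)
    return [(line.text, line.font_size >= body * ratio) for line in lines]


# Illustrative input only; not taken from a real document.
sample = [
    Line("1 Introduction", 16.0),
    Line("PDF files store glyph positions, not logical structure.", 10.0),
    Line("1.1 Motivation", 13.0),
    Line("Readers still expect chapters and sections.", 10.0),
]
labels = label_headings(sample)
```

A rule like this needs no training data and is easy to inspect, but it depends on consistent typography. An ML-based approach trades that transparency for robustness on layouts the rule does not anticipate.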