pymupdf / PyMuPDF

PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0
4.52k stars, 446 forks

Add dotted gridline detection to table recognition #3539

Closed: JorjMcKie closed 1 week ago

JorjMcKie commented 1 month ago

Discussed in https://github.com/pymupdf/PyMuPDF/discussions/3535

Originally posted by **aborruso** May 31, 2024

Hi, first of all, thank you for this great tool. I have seen that via Python I can use `find_tables`. Is there a way to do something similar via the CLI and extract a table from a PDF? Thank you.
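
For reference, here is a minimal sketch of how `find_tables` could be wrapped in a small script to get CLI-like behavior. It is not a built-in PyMuPDF command. The function names `rows_to_csv`, `extract_tables`, and `main`, as well as the `--page` option, are hypothetical and chosen only for illustration. It assumes a PyMuPDF release that supports `import pymupdf` and `Page.find_tables()`.

```python
import argparse
import csv
import io
import sys


def rows_to_csv(rows):
    """Serialize table rows (lists of cell strings or None) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # Empty cells may be reported as None; write them as empty strings.
        writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


def extract_tables(pdf_path, page_number):
    """Return the rows of every table detected on one page (0-based)."""
    import pymupdf  # imported lazily; older releases use `import fitz`

    with pymupdf.open(pdf_path) as doc:
        page = doc[page_number]
        # Each detected table yields a list of rows, each a list of cells.
        return [table.extract() for table in page.find_tables().tables]


def main(argv=None):
    """Hypothetical entry point: print each detected table as CSV."""
    parser = argparse.ArgumentParser(
        description="Print tables found on a PDF page as CSV."
    )
    parser.add_argument("pdf", help="path to the PDF file")
    parser.add_argument("--page", type=int, default=0, help="0-based page number")
    args = parser.parse_args(argv)
    for rows in extract_tables(args.pdf, args.page):
        sys.stdout.write(rows_to_csv(rows))
        sys.stdout.write("\n")  # blank line between tables
```

Calling `main()` from the usual `if __name__ == "__main__":` guard would make this usable as a script, for example with a PDF path and `--page 2` as arguments. Keeping the CSV formatting in its own function means it can be checked without a PDF at hand.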

julian-smith-artifex-com commented 1 week ago

Fixed in 1.24.6.

aborruso commented 1 week ago

Thank you very much @JorjMcKie and @julian-smith-artifex-com