Closed sreeram1658 closed 1 week ago
This post cannot be accepted as an issue yet because a reproducing file has not been supplied.
Closed because of an extended period of time without a reaction from the user.
Description of the bug
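I am trying to extract a table from my PDF document using fitz:

    import fitz  # PyMuPDF

    doc = fitz.open("sample_table.pdf")
    page = doc[4]  # fifth page (0-based index)
    # detect the table from drawn ruling lines only
    tabs = page.find_tables(horizontal_strategy="lines", vertical_strategy="lines")
    tab = tabs[0]  # first table found on the page
    df = tab.to_pandas()
    df
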
My document:

![image](https://github.com/pymupdf/PyMuPDF/assets/46878288/7d9eecf8-f47f-49b1-9e4f-d12d3edb4741)

The output looks something like this:

![image](https://github.com/pymupdf/PyMuPDF/assets/46878288/d2424d4d-8fb8-4956-a88b-f882eb8a2270)
How to reproduce the bug
Already explained above
PyMuPDF version
1.24.5
Operating system
Windows
Python version
3.9