atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

While using "stream" mode, I tried passing table_regions = ['10,710,604,72'] (which I found using plot); it worked and gave me an output df. I also tried table_regions = ['170,370,560,270'] (which I found in the camelot documentation), and this gave me the same output. The coordinates are not even close, so how is camelot able to detect the table in the PDF? #501

Open Jashwanthreddy14 opened 6 months ago

Jashwanthreddy14 commented 6 months ago

I have also tried manually finding the coordinates and passing them in table_regions. In a few cases, camelot detects one table as two tables. The second table detected by camelot contains duplicate data that is already in the first table.

And the weirdest thing is that table_regions = ['170,370,560,270'] (which I found in the camelot documentation) works for almost all my PDFs.
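
For anyone comparing the two regions, here is a minimal sketch that checks how they relate geometrically. It assumes the convention described in the camelot documentation, where each region is a string "x1,y1,x2,y2" with (x1, y1) the left-top corner and (x2, y2) the right-bottom corner in PDF coordinate space (origin at the bottom-left, so y1 > y2). The helper names `parse_region` and `contains` are hypothetical and not part of camelot.

```python
def parse_region(region):
    """Split an 'x1,y1,x2,y2' string into four floats."""
    x1, y1, x2, y2 = (float(v) for v in region.split(","))
    return x1, y1, x2, y2


def contains(outer, inner):
    """Return True if `inner` lies entirely inside `outer`.

    Assumes (x1, y1) is left-top and (x2, y2) is right-bottom,
    with the y axis pointing up (PDF coordinate space).
    """
    ox1, oy1, ox2, oy2 = parse_region(outer)
    ix1, iy1, ix2, iy2 = parse_region(inner)
    return ox1 <= ix1 and ix2 <= ox2 and oy2 <= iy2 and iy1 <= oy1


plotted = "10,710,604,72"      # region found with plot
from_docs = "170,370,560,270"  # region copied from the documentation

print(contains(plotted, from_docs))
print(contains(from_docs, plotted))

# Either string can then be passed to camelot, for example
# (file name is a placeholder):
# tables = camelot.read_pdf("file.pdf", flavor="stream", table_regions=[plotted])
```

Under that convention, the region from the documentation sits entirely inside the plotted one, so the two are less unrelated than they look. That may be part of why both return the same table, though it does not by itself explain the duplicated second table.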