atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io
Other
3.65k stars 355 forks

Report A Bug in 'table_regions' #461

Open Yichen0975 opened 3 years ago

Yichen0975 commented 3 years ago

Dear Sir,

I think there is a bug in table_regions.

I am trying to use the table_regions parameter to extract parts of a table from a PDF. The table is complex, so extracting parts of it is easier than extracting the whole table and then modifying it.

However, when I change the related parameters (I'm pretty sure that I am using the right x1, y1, x2, y2), I get one of two results: (a) the whole table, or (b) "ZeroDivisionError: float division by zero".
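For readers trying to reproduce this, here is a minimal sketch of how a region is usually passed to Camelot. According to the Camelot documentation, `table_regions` takes a list of strings of the form "x1,y1,x2,y2", where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner in PDF coordinate space (origin at the bottom-left of the page). The helper names `format_region` and `extract_region`, the file path, and the coordinates are hypothetical. The sanity checks only rule out an inverted or zero-area region; whether that is what triggers the ZeroDivisionError reported here is not established in this thread.

```python
def format_region(x1, y1, x2, y2):
    """Build a Camelot region string 'x1,y1,x2,y2' after basic sanity checks.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner in
    PDF coordinate space, whose origin is the bottom-left of the page, so a
    valid region needs x1 < x2 and y1 > y2.
    """
    if not x1 < x2:
        raise ValueError(f"expected x1 < x2, got x1={x1}, x2={x2}")
    if not y1 > y2:
        raise ValueError(f"expected y1 > y2 (PDF y grows upward), got y1={y1}, y2={y2}")
    return f"{x1},{y1},{x2},{y2}"


def extract_region(pdf_path, region, page="1", flavor="lattice"):
    """Hypothetical wrapper: ask Camelot to look for tables inside one region."""
    import camelot  # imported lazily so format_region works without Camelot installed

    return camelot.read_pdf(
        pdf_path,
        pages=page,
        flavor=flavor,
        table_regions=[region],  # a list of "x1,y1,x2,y2" strings
    )


# Example with made-up coordinates; replace them with values measured on your page.
region = format_region(170, 370, 560, 270)
# tables = extract_region("report.pdf", region)  # "report.pdf" is a placeholder path
```

Note that `table_regions` describes a region that may contain a table, so Camelot still detects the table boundaries itself. If the exact bounds of the part you want are known, the `table_areas` parameter accepts the same string format and may be worth trying instead.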

After checking my code again and again, I'm sure that there is no mistake in it.

Best regards, Yichen Zhao

ckcr4lyf commented 3 years ago

We're facing something similar: unit tests that involve parsing the PDF work fine locally (Ubuntu-based environment with Python 3.8.3), but they fail in our Docker images.

Troubleshooting Docker right now..

EDIT: Turns out the problem was version 0.9.0, while local was still on 0.8.2. Pinning the version to 0.8.2 fixes this.
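A quick way to confirm which version an environment actually has is to ask the installed package metadata. The sketch below assumes Python 3.8+ and that Camelot was installed under the distribution name `camelot-py`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="camelot-py"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Run this both locally and inside the Docker image to compare environments.
print("camelot-py:", installed_version())
```

If the two environments differ, a line such as `camelot-py==0.8.2` in the requirements file keeps them on the same release.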

Yichen0975 commented 3 years ago

Dear,

Sorry, this is my first time using GitHub, so I submitted 2 bug reports. Thanks for your reply and your suggestion, but it does not work in my case: I have double-checked my Camelot version, and it is 0.8.2.

Best regards, Yichen