atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

Table_regions #312

Closed eldarvagapov closed 5 years ago

eldarvagapov commented 5 years ago

The table_regions kwarg throws a ValueError: too many values to unpack (expected 4)

tables = camelot.read_pdf('sberbank_statement.pdf', flavor='stream', pages='1-end', table_regions=['168,145,568,740'])

where 168 = Left, 145 = Top, 568 = Left + Width, and 740 = Top + Height.
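As a rough illustration of the arithmetic above, the region string can be assembled from a left/top/width/height box. The helper name `region_from_box` and the width and height values (400 and 595, chosen so the sums match the numbers in the call) are assumptions made for this sketch and are not part of Camelot.

```python
def region_from_box(left, top, width, height):
    """Build an 'x1,y1,x2,y2' string from a left/top/width/height box.

    Hypothetical helper for illustration only; not a Camelot function.
    """
    x1, y1 = left, top
    x2, y2 = left + width, top + height
    return ",".join(str(v) for v in (x1, y1, x2, y2))


# 168 + 400 = 568 and 145 + 595 = 740, matching the region in the report.
region = region_from_box(168, 145, 400, 595)
print(region)
```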

`/anaconda3/lib/python3.7/site-packages/camelot/parsers/stream.py in _generate_table_bbox(self)
    287         hor_text = []
    288         for region in self.table_regions:
--> 289             x1, y1, x2, y2 = region
    290             region_text = text_in_bbox((x1, y2, x2, y1), self.horizontal_text)
    291             hor_text.extend(region_text)

ValueError: too many values to unpack (expected 4)`

anakin87 commented 5 years ago

I get the same error. Don't know why, but with flavor='lattice', I don't get this error.

joeyjustcuz commented 5 years ago

I'm getting this error too, at:

camelot\parsers\stream.py line 289

my code:

tables = cm.read_pdf(file, pages='1', flavor='stream', table_regions=['9,8,7,6'])

ValueError: too many values to unpack (expected 4)

vinayak-mehta commented 5 years ago

I know why this is happening. From the traceback, the region string is unpacked directly instead of being split on commas the way an area string is (you can see that a little further down in the code). We just need to change it to region.split(','). I'll push a fix tonight.
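A minimal sketch of the problem and the fix described above, written outside Camelot so it runs on its own. `parse_region` is a hypothetical helper name, and the conversion to float is an assumption; the actual patch may look different.

```python
def parse_region(region):
    """Split an 'x1,y1,x2,y2' string into four floats.

    Hypothetical helper for illustration only; not a Camelot function.
    """
    parts = region.split(",")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 comma-separated values, got {len(parts)}: {region!r}"
        )
    x1, y1, x2, y2 = (float(p) for p in parts)
    return x1, y1, x2, y2


region = "168,145,568,740"

# Unpacking the raw string iterates over its characters, so Python finds
# far more than four items and raises ValueError, as in the traceback.
try:
    x1, y1, x2, y2 = region
except ValueError as exc:
    print("unpacking the raw string fails:", exc)

# Splitting on commas first yields exactly four values.
print(parse_region(region))
```

Validating the number of parts before unpacking keeps the error message tied to the user's input rather than to an internal line of the parser.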

eldarvagapov commented 5 years ago

@vinayak-mehta - did you manage to push that fix in the end?

vinayak-mehta commented 5 years ago

I remember creating a branch and adding that fix, but it clearly isn't in master or in the PRs. Let me push it today.

vinayak-mehta commented 5 years ago

Thanks for the bug report @eldarvagapov!