atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

camelot merges pdf's columns #267

Closed: asmiy closed this issue 5 years ago

asmiy commented 5 years ago

When I use read_pdf to extract tables from the PDF 8.pdf, some columns come out merged.

I use flavor='stream' and table_areas=['..']. I tried column_tol ... but it made no difference. Could you help me fix this issue?

anakin87 commented 5 years ago

You can specify the x-coordinates of the column separators with the columns argument, and use split_text to split text that spans a separator:
https://camelot-py.readthedocs.io/en/master/user/advanced.html#specify-column-separators
https://camelot-py.readthedocs.io/en/master/user/advanced.html#split-text-along-separators
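For illustration, here is a minimal sketch of what that call might look like with the stream flavor. The helper names (build_stream_kwargs, extract_tables) and every coordinate value are hypothetical placeholders, not values taken from 8.pdf; the real separator positions have to be read off the actual document.

```python
from typing import Optional, Sequence


def build_stream_kwargs(
    column_xs: Sequence[float],
    table_area: Optional[Sequence[float]] = None,
    split_text: bool = True,
) -> dict:
    """Build keyword arguments for camelot.read_pdf (hypothetical helper)."""
    kwargs = {
        "flavor": "stream",
        # Camelot expects one comma-separated string of x-coordinates per table area.
        "columns": [",".join(str(x) for x in column_xs)],
        # Split text that crosses a separator instead of assigning it to a single cell.
        "split_text": split_text,
    }
    if table_area is not None:
        # "x1,y1,x2,y2": top-left and bottom-right corners in PDF coordinate space.
        kwargs["table_areas"] = [",".join(str(v) for v in table_area)]
    return kwargs


def extract_tables(pdf_path, column_xs, table_area=None, pages="1"):
    """Run Camelot with explicit column separators and return the DataFrames."""
    import camelot  # third-party dependency: pip install camelot-py

    tables = camelot.read_pdf(
        pdf_path, pages=pages, **build_stream_kwargs(column_xs, table_area)
    )
    return [table.df for table in tables]


if __name__ == "__main__":
    # Placeholder coordinates: replace them with values measured on your own PDF.
    for df in extract_tables(
        "8.pdf", column_xs=[120, 250, 380], table_area=[50, 700, 550, 100]
    ):
        print(df)
```

When table_areas lists several areas, columns should contain one coordinate string per area. Camelot's visual debugging, for example camelot.plot(tables[0], kind='text'), can help locate suitable x-coordinates.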

vinayak-mehta commented 5 years ago

@asmiy Can you close this issue if the docs mentioned by @anakin87 solved your problem?

asmiy commented 5 years ago

Yes, specifying the column coordinates solved the problem. Thanks!