atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

Handling rows with missing boundaries #323

Closed: sharun-s closed this issue 5 years ago

sharun-s commented 5 years ago

This is more of a question than a bug. Is there a recommended way to handle rows where the boundary line between cells is sometimes absent? I'm attaching an example. The problem is that text from one cell gets merged with the cell in the row above. In this example, columns 5 and 6 have several row boundaries missing.

sharun-s commented 5 years ago

I found the shift_text=[''] option in the docs, which solved my problem, so I'm closing this issue.
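
For anyone landing here with the same problem, this is a rough sketch of how that option might be passed. It assumes the Lattice flavor, where shift_text defaults to ['l', 't'] (text in a spanning cell flows left, then up); passing [''] turns that shifting off. The helper names lattice_options and extract_tables and the file name example.pdf are made up for illustration and are not part of Camelot.

```python
def lattice_options(shift_text=("",)):
    """Build keyword arguments for camelot.read_pdf (hypothetical helper).

    shift_text=[''] asks the Lattice parser not to move text inside
    spanning cells; the library default is ['l', 't'].
    """
    return {"flavor": "lattice", "shift_text": list(shift_text)}


def extract_tables(pdf_path, pages="1"):
    """Read tables from pdf_path with text shifting disabled (hypothetical helper)."""
    # Imported lazily so lattice_options() can be used without Camelot installed.
    import camelot

    return camelot.read_pdf(pdf_path, pages=pages, **lattice_options())


if __name__ == "__main__":
    # "example.pdf" is a placeholder; substitute the PDF with missing row lines.
    tables = extract_tables("example.pdf")
    print(tables[0].df)
```

Whether this helps will depend on the PDF: it only changes where text lands inside cells that Camelot already treats as spanning, so the output is worth checking by eye.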