atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

split_text IndexError fix #475

Closed Yakov-Varnaev closed 1 year ago

Yakov-Varnaev commented 2 years ago

Some files throw an IndexError when read via read_pdf with split_text set to True. Example file: https://disk.yandex.ru/i/sMcNUwU4VoOENQ Related issue: https://github.com/atlanhq/camelot/issues/443

The fix is quite straightforward, but it works for my cases.
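
To check whether a particular file triggers the error described above, a small wrapper along these lines may help. It is only a sketch for reproducing the problem, not the change made in this PR: `read_with_split_text` is a hypothetical helper name, the reader is passed in as an argument (camelot's `read_pdf` in practice), and `sample.pdf` is a placeholder path.

```python
def read_with_split_text(path, reader, **kwargs):
    """Call reader(path, split_text=True, ...) and return (result, error).

    Hypothetical helper for reproduction only. An IndexError raised by the
    reader is caught and returned instead of propagating.
    """
    try:
        return reader(path, split_text=True, **kwargs), None
    except IndexError as exc:
        return None, exc


# Usage with camelot (not executed here; "sample.pdf" is a placeholder):
#   import camelot
#   tables, error = read_with_split_text("sample.pdf", camelot.read_pdf)
#   if error is not None:
#       print("split_text raised IndexError:", error)
```

Passing the reader in as a parameter keeps the helper independent of the installed camelot version, so the same check can be run before and after applying a patch.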