atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

Merging not working recursively #314

Closed CartierPierre closed 5 years ago

CartierPierre commented 5 years ago

I have an issue with text blocks that span more than two lines: they are not returned as a single block. See this example:

This is a text on two lines

it works

returns

[This is a text\non two lines , it works]

but

This text is on three lines

returns

[This text\nis on , three lines]

instead of the expected

[This text\nis on\nthree lines]

Looking at the code at https://github.com/socialcopsdev/camelot/blob/7cf409aa08f937edd24d6ac14d8daa56e614bb6d/camelot/parsers/stream.py#L120

The newly created group is not updated after a merge, so the next line is still compared against the old value. Would it be possible to update it? I can try to work on this, just let me know.
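To make the suspected cause concrete, here is a minimal sketch of the behaviour described above. It is not Camelot's actual code: `group_lines`, its `update_reference` flag, the coordinates, and the tolerance are all hypothetical and chosen only to reproduce the two-line versus three-line difference.

```python
def group_lines(lines, tol, update_reference):
    """Group (text, y) pairs whose vertical distance to a reference is within tol.

    Hypothetical helper, not part of Camelot. With update_reference=False the
    reference stays at the first line of the group, which mimics the bug.
    """
    groups = []
    ref_y = None
    for text, y in lines:
        if ref_y is not None and abs(ref_y - y) <= tol:
            # Close enough to the reference: merge into the current group.
            groups[-1].append(text)
            if update_reference:
                # The fix: compare the next line against the line just merged.
                ref_y = y
        else:
            # Too far away (or first line): start a new group.
            groups.append([text])
            ref_y = y
    return ["\n".join(group) for group in groups]


# Assumed layout: three lines spaced 10 units apart, tolerance of 12 units.
lines = [("This text", 100), ("is on", 90), ("three lines", 80)]

buggy = group_lines(lines, tol=12, update_reference=False)
fixed = group_lines(lines, tol=12, update_reference=True)

print(buggy)
print(fixed)
```

Without the update, the third line is measured against the first line (20 units away) and falls outside the tolerance, so it is split off. With the update, it is measured against the second line (10 units away) and joins the same block.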

CartierPierre commented 5 years ago

OK, I have opened a pull request. I tried the new code and it works for me.