atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

Multiple table from single page #260

Closed chiragob closed 5 years ago

chiragob commented 5 years ago

I am trying to extract tables from a PDF that has multiple tables on a single page, but Camelot only extracts one table. Is there any way to extract multiple tables from a single page? I have attached the PDF file.

2.pdf

import camelot
tables = camelot.read_pdf('2.pdf', flavor='lattice')
print(tables[0].parsing_report)
print(len(tables))

Please help.

anakin87 commented 5 years ago

Try this: tables=camelot.read_pdf('2.pdf',line_scale=80)

The larger the line_scale, the smaller the lines that get detected, so short table borders that were missed at the default value can be picked up.
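To see how the suggestion above plays out on a given file, one option is to parse the same page with a few line_scale values and compare how many tables come back. The following is only a minimal sketch: the helper name count_tables_by_line_scale, the candidate values (15 is Camelot's documented default for Lattice, 80 is the value suggested above, 40 is an arbitrary midpoint), and the injectable read_pdf argument are assumptions made for illustration, not part of Camelot.

```python
def count_tables_by_line_scale(pdf_path, scales=(15, 40, 80), read_pdf=None):
    """Return {line_scale: number of tables detected} using the lattice flavor.

    `read_pdf` is a hypothetical hook so the helper can be exercised without
    Camelot installed; by default it falls back to camelot.read_pdf.
    """
    if read_pdf is None:
        import camelot  # imported lazily, only when the real parser is needed
        read_pdf = camelot.read_pdf

    counts = {}
    for scale in scales:
        # line_scale is forwarded to the Lattice parser as a keyword argument
        tables = read_pdf(pdf_path, flavor="lattice", line_scale=scale)
        counts[scale] = len(tables)
    return counts


# Example usage (requires Camelot and the attached file):
# for scale, n in count_tables_by_line_scale("2.pdf").items():
#     print(f"line_scale={scale}: {n} table(s)")
```

Picking the smallest value that finds all the tables is probably safer than going as high as possible: the Camelot documentation notes that a very large line_scale can cause text to be detected as lines.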

chiragob commented 5 years ago

thanks @anakin87

vinayak-mehta commented 5 years ago

@anakin87 Thanks for pointing it out!