atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

Some Tables not extracted from a Multi Page Multi Table PDF #464

Open kavitasherla opened 2 years ago

kavitasherla commented 2 years ago

I tried to extract table data from a multi-page, multi-table PDF using the following code:

import camelot

tables = camelot.read_pdf('InputPDF.pdf', flavor='stream', multiple_tables=True, pages='all')
tables.export('foo1.csv', f='csv', compress=True)  # json, excel, html

But the 4th and 5th tables on page 2 were not extracted. Tables of the same type on other pages were extracted properly.
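
To pin down which pages are short of tables, it may help to count the extracted tables per page. The following is a minimal sketch, not a confirmed diagnosis: it assumes each camelot table exposes a `parsing_report` dict with a `page` key, and the helper names `tables_per_page` and `extract_reports` are made up for illustration.

```python
from collections import Counter


def tables_per_page(reports):
    """Count how many tables were extracted on each page.

    `reports` is a list of dicts that each contain a "page" key,
    such as the parsing_report of a camelot table.
    """
    counts = Counter(report["page"] for report in reports)
    return dict(sorted(counts.items()))


def extract_reports(pdf_path, flavor="stream"):
    """Hypothetical helper: run camelot and collect one report per table."""
    import camelot  # imported here so tables_per_page works without camelot

    tables = camelot.read_pdf(pdf_path, flavor=flavor, pages="all")
    return [table.parsing_report for table in tables]


# Example usage (needs camelot and the PDF from this issue):
# for page, count in tables_per_page(extract_reports("InputPDF.pdf")).items():
#     print(f"page {page}: {count} table(s)")
```

If page 2 reports fewer tables than expected, that page can be re-run on its own with `pages='2'` to experiment with settings without reprocessing the whole file.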

I have attached an image of the PDF file I tried, as an example.

No error is shown. TestPDF.pdf

I also tried "line_scale=125" and other values.
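
As far as I know, `line_scale` is a setting of the lattice flavor, so it may not be the right knob for `flavor='stream'`. A rough sketch of sweeping the stream tolerances `edge_tol` and `row_tol` on page 2 instead is below. The value grids and the helper names `stream_candidates` and `count_tables` are assumptions for illustration, not recommended settings.

```python
def stream_candidates(edge_tols=(50, 100, 250, 500), row_tols=(2, 5, 10)):
    """Build keyword-argument dicts to try with flavor='stream'.

    The default grids are arbitrary starting points, not tuned values.
    """
    return [
        {"edge_tol": edge_tol, "row_tol": row_tol}
        for edge_tol in edge_tols
        for row_tol in row_tols
    ]


def count_tables(pdf_path, page, **stream_kwargs):
    """Hypothetical helper: number of tables camelot finds on one page."""
    import camelot  # imported here so stream_candidates works without camelot

    tables = camelot.read_pdf(
        pdf_path, flavor="stream", pages=str(page), **stream_kwargs
    )
    return len(tables)


# Example usage (needs camelot and the PDF from this issue):
# for kwargs in stream_candidates():
#     print(kwargs, count_tables("InputPDF.pdf", 2, **kwargs))
```

Comparing the counts across settings should show whether any combination picks up the missing tables on page 2.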