camelot-dev / camelot

A Python library to extract tabular data from PDFs
https://camelot-py.readthedocs.io
MIT License
2.77k stars 446 forks

Not able to extract all columns #446

Open Amit49 opened 8 months ago

Amit49 commented 8 months ago

I am trying to extract a table using camelot, but the extracted table does not contain all the columns.

I am using this line of code to extract the table: camelot.read_pdf(pdf_file, flavor="stream", pages="all", column=["88,276,357,490"])

It extracts 4 columns instead of 5, merging the 2nd and 3rd columns together.

Is there any way to get a better result for this type of PDF? Sample PDF: .pdf
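For reference, here is a minimal sketch of how the call might be restated. It assumes two things that are not confirmed for this PDF: that the keyword Camelot's stream parser documents for separators is `columns` (plural) rather than `column`, and that `split_text=True` could help when text spanning a separator is being kept in one cell. The helper names `build_columns_arg`, `expected_column_count`, and `extract_tables` are hypothetical and only for illustration; the x-coordinates are the ones from the post.

```python
def build_columns_arg(separators):
    """Join x-coordinates into one comma-separated string, wrapped in a list.

    Camelot's stream flavor takes a list of such strings as `columns`.
    """
    return [",".join(str(x) for x in sorted(separators))]


def expected_column_count(columns_arg):
    """N separators split the table area into N + 1 columns."""
    return len(columns_arg[0].split(",")) + 1


def extract_tables(pdf_file, separators, pages="all"):
    """Hypothetical wrapper around camelot.read_pdf with explicit separators."""
    import camelot  # imported lazily so the helpers above work without it

    return camelot.read_pdf(
        pdf_file,
        flavor="stream",
        pages=pages,
        columns=build_columns_arg(separators),  # plural keyword (assumption, see above)
        split_text=True,  # assumption: may split text that crosses a separator
    )


columns_arg = build_columns_arg([88, 276, 357, 490])
print(columns_arg, expected_column_count(columns_arg))
```

With four separators the result should have five columns, so comparing `expected_column_count(columns_arg)` against `tables[0].df.shape[1]` after calling `extract_tables` is a quick way to confirm whether the separators are being applied at all.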