xavctn / img2table

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing
MIT License
571 stars 76 forks

tables are not available as pandas dataframe. #193

Open aiusrgit opened 7 months ago

aiusrgit commented 7 months ago

When using the .df attribute on an ExtractedTable instance, the resulting pandas DataFrame contains only null cell values and no information at all.

yashtodi94 commented 6 months ago

Did you pass an OCR instance to .extract_tables(ocr=...)?

From the docs:

ocr : OCRInstance, optional, default None
    OCR instance used to parse document text. If None, cells content will not be extracted
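
For illustration, here is a minimal sketch of what passing an OCR instance could look like. It assumes the input is a single image, that Tesseract is installed locally, and that `TesseractOCR` is the backend you want; the helper name `extract_dataframes` and its `lang` parameter are made up for this example and are not part of img2table.

```python
def extract_dataframes(image_path, lang="eng"):
    """Return one pandas DataFrame per table detected in the image.

    Hypothetical helper: assumes img2table and a local Tesseract install.
    """
    # Imported inside the function so the sketch can be loaded
    # even on a machine where img2table is not installed.
    from img2table.document import Image
    from img2table.ocr import TesseractOCR

    ocr = TesseractOCR(lang=lang)
    doc = Image(image_path)

    # Passing ocr=... is the important part: with the default ocr=None,
    # tables are still detected but their cell content is not extracted.
    tables = doc.extract_tables(ocr=ocr)

    # For an image document, extract_tables returns a list of
    # ExtractedTable objects; each one exposes its content as .df.
    return [table.df for table in tables]
```

Calling `extract_dataframes("scan.png")` (with `scan.png` standing in for your own file) should then give DataFrames with the recognized text in them rather than nulls, provided the OCR can actually read the cells.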