xavctn / img2table

img2table is a Python library for identifying and extracting tables from PDFs and images, based on OpenCV image processing.
MIT License
528 stars, 74 forks

pdf to excel - columns got merged #151

Open nitinguptaai2022 opened 9 months ago

nitinguptaai2022 commented 9 months ago

Hi, first of all, thank you for this library. It is really great and could be awesome with some improvements, so I am creating two issues. This is the first of the two.

I created a simple PDF (attached), but the columns got merged in the output. I converted this PDF from an Excel sheet. All three files are attached (the original Excel file, the PDF generated from it, and the output Excel file): Original.xlsx Output.xlsx PDF_From_Original.pdf

xavctn commented 8 months ago

Hello, thanks for the feedback. I am aware of this issue, which occurs whenever columns are not clearly separated. I will try to work on it in upcoming updates.
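
For illustration only, here is a minimal sketch of how a gap-based column grouping can merge neighbouring columns when the whitespace between them is small. This is not img2table's actual algorithm; the function `group_columns`, the `min_gap` parameter, and the coordinates are hypothetical and chosen just to show the effect.

```python
from typing import List, Tuple

# (x_left, x_right) of one cell's text on a row, in arbitrary units (assumed)
Span = Tuple[float, float]


def group_columns(spans: List[Span], min_gap: float) -> List[Span]:
    """Group horizontal spans into columns.

    A new column starts only when the gap between a span and the
    previous column's right edge is at least `min_gap`; otherwise
    the span is absorbed into the previous column.
    """
    columns: List[Span] = []
    for left, right in sorted(spans):
        if columns and left - columns[-1][1] < min_gap:
            prev_left, prev_right = columns[-1]
            columns[-1] = (prev_left, max(prev_right, right))
        else:
            columns.append((left, right))
    return columns


# Hypothetical row with three cells; the first two are only 3 units apart.
row = [(10.0, 48.0), (51.0, 90.0), (120.0, 160.0)]

wide = group_columns(row, min_gap=8.0)    # threshold larger than the 3-unit gap
narrow = group_columns(row, min_gap=2.0)  # threshold smaller than the 3-unit gap

print(len(wide), len(narrow))
```

With the larger threshold the first two cells collapse into a single column, while the smaller threshold keeps all three apart. Any detector that relies on whitespace between columns faces a similar trade-off: a threshold small enough to split tightly packed columns may also split text within a single cell.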