xavctn / img2table

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing
MIT License
562 stars 75 forks

pdf to excel - columns got merged #151

Open nitinguptaai2022 opened 10 months ago

nitinguptaai2022 commented 10 months ago

Hi, first of all, thank you for this library. It is really great and could be awesome with some improvements. I am creating two issues; this is the first of the two.

I created a simple PDF (attached), but the columns got merged in the output. I converted this PDF from an Excel sheet. All three files are attached (the original Excel file, the PDF generated from it, and the output Excel file): Original.xlsx, Output.xlsx, PDF_From_Original.pdf

xavctn commented 10 months ago

Hello, thanks for the feedback. I am aware of this issue, which occurs whenever columns are not clearly separated. I will try to work on it in upcoming updates.
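
To illustrate why closely spaced columns can end up merged, here is a minimal sketch of gap-based column grouping. It is not img2table's actual implementation: the `group_into_columns` function, the `min_gap` threshold, and the pixel coordinates are all hypothetical, chosen only to show how a whitespace threshold behaves when the gap between two columns is small.

```python
from typing import List, Tuple

# Hypothetical representation: (x_start, x_end) of a text span, in pixels.
Span = Tuple[int, int]


def group_into_columns(spans: List[Span], min_gap: int) -> List[Span]:
    """Merge horizontally adjacent spans whose gap is smaller than min_gap.

    Illustrative only: this is an assumed heuristic, not the library's code.
    """
    columns: List[Span] = []
    for start, end in sorted(spans):
        if columns and start - columns[-1][1] < min_gap:
            # Gap too small: treat this span as part of the previous column.
            prev_start, prev_end = columns[-1]
            columns[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap large enough: start a new column.
            columns.append((start, end))
    return columns


# Well-separated columns (40 px gaps) stay distinct.
wide = [(0, 40), (80, 120), (160, 200)]
print(group_into_columns(wide, min_gap=10))
# -> [(0, 40), (80, 120), (160, 200)]

# The first two columns are only 5 px apart, so they are merged.
narrow = [(0, 40), (45, 85), (160, 200)]
print(group_into_columns(narrow, min_gap=10))
# -> [(0, 85), (160, 200)]
```

Under this assumed heuristic, a PDF exported from a spreadsheet with narrow column padding would show the merging described in this issue, and widening the columns in the source sheet before export might serve as a workaround.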