tabulapdf / tabula-web-java

Tabula is a tool for liberating data tables trapped inside PDF files
http://tabula.technology
MIT License

Table detection in images #10

Open saanvib13 opened 5 months ago

saanvib13 commented 5 months ago

I have a set of images and I want to detect the tables in them using the tabula library. Since tabula works with PDFs, I convert the images to a PDF and then run tabula on it. However, the library fails to detect the tables in the converted PDF. What could be the reason, and are there any alternatives to make this work?

I also tried the paddleocr library, but it does not natively support batch processing. How can I process multiple images simultaneously on the GPU to detect tables in them?
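On the batching part of the question, here is a minimal sketch of grouping image paths into fixed-size batches and handing each batch to a detector in a single call. It uses only the Python standard library. The names `batched`, `run_in_batches`, and `detect_batch` are hypothetical and do not come from tabula or paddleocr; `detect_batch` stands for whatever batch-capable model call is available, and whether the work actually runs in parallel on the GPU depends entirely on that call.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_in_batches(
    image_paths: Iterable[str],
    detect_batch: Callable[[List[str]], Sequence[object]],
    batch_size: int = 8,
) -> List[Tuple[str, object]]:
    """Call `detect_batch` once per batch and pair each path with its result.

    `detect_batch` is a hypothetical callable (an assumption, not a real
    tabula or paddleocr API). It must return one result per input path,
    in the same order.
    """
    results: List[Tuple[str, object]] = []
    for batch in batched(image_paths, batch_size):
        outputs = detect_batch(batch)
        if len(outputs) != len(batch):
            raise ValueError("detector must return one result per image")
        results.extend(zip(batch, outputs))
    return results
```

The batch size is a trade-off: larger batches make fuller use of the GPU per call but need more memory, so it is usually tuned to the largest value that fits.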