AiPacino / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Table not detected correctly #1118

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello,

I have an image that contains a table with a single row and multiple columns.
Each column can hold one or more letters and numbers. Some of the columns hold
only a single special character.

When I run Tesseract on this image, I do not get the desired result. Tesseract
does not recognize the characters or special symbols in the first and last
columns. It only recognizes the numbers in the second column.

I have attached the input image file.

Please suggest any solution for this.

Original issue reported on code.google.com by tempname...@gmail.com on 18 Feb 2014 at 5:39

Attachments:

GoogleCodeExporter commented 9 years ago
Hello,

I want to update the thread.
When I manually added some text above the image, I got the correct result. Both
the text above the image and the text inside the table were recognized correctly.

I have attached the image for which I got the correct result.

But I have not found the reason why Tesseract could not recognize the first
image, which contains only the table. I suspect it is because of the page
border analysis that Tesseract does in its preprocessing phase. I think there
must be some parameter that forces Tesseract to skip page border analysis.
Please suggest a solution.

Thanks.

Original comment by tempname...@gmail.com on 27 Feb 2014 at 4:49

Attachments:
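
The thread does not confirm which parameter, if any, controls the behaviour described above. One setting that may be worth trying is Tesseract's page segmentation mode, which tells the layout analysis what kind of page to expect. The sketch below only builds and runs the command line for two modes (6, a single uniform block of text, and 7, a single text line). It is an assumption that either mode helps with this image. The file name `table.png`, the output names, and the helper functions are hypothetical. Tesseract 3.x, current when this issue was filed, spells the option `-psm`; 4.x and later spell it `--psm`.

```python
import shutil
import subprocess

# Page segmentation modes to try for a one-row table (meanings taken from the
# Tesseract CLI help): 6 = single uniform block of text, 7 = single text line.
CANDIDATE_PSMS = (6, 7)


def build_cmd(image_path, out_base, psm, legacy_flag=True):
    """Return the tesseract command line for one page segmentation mode.

    legacy_flag=True uses the 3.x spelling `-psm`; False uses `--psm` (4.x+).
    """
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, out_base, flag, str(psm)]


def run_all(image_path, legacy_flag=True):
    """Run tesseract once per candidate mode; text lands in out_psm<N>.txt."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    for psm in CANDIDATE_PSMS:
        cmd = build_cmd(image_path, f"out_psm{psm}", psm, legacy_flag)
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "table.png" is a placeholder; the original attachment is not available.
    print(build_cmd("table.png", "out_psm6", 6))
```

Calling `run_all("table.png")` and comparing `out_psm6.txt` with `out_psm7.txt` against the default output would show whether the layout analysis, rather than the character recognition, is dropping the first and last columns.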