michaelethompson / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Devanagari - Totally wrong recognition for some lines #1343

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run tesseract with the Devanagari traineddata on the attached image.

What is the expected output? What do you see instead?
Line 13 (the second line of verse 58) is recognized completely wrong.

What version of the product are you using? On what operating system?
Latest version from git, under MSYS2/MinGW32 on Windows 8.

Please provide any additional information below.
Tried with psm 3 and 4;
both give wrong recognition.

Input file and recognized text files for psm 3 and 4 are attached.

Original issue reported on code.google.com by shreeshrii on 14 Oct 2014 at 4:10

Attachments:

GoogleCodeExporter commented 9 years ago
Another sample, where the same word is recognized correctly one line below and 
one line above, but is totally misrecognized in the line in between.
The .tif file was given as input to tesseract.
The .png file highlights the line in error.

Original comment by shreeshrii on 31 Oct 2014 at 9:31

Attachments:

GoogleCodeExporter commented 9 years ago
Tried with different psm settings:
works OK with psm 6 but not with psm 3 or 4.

Output files attached.
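
For anyone trying to reproduce the comparison between psm 3, 4 and 6, a minimal sketch of a driver script follows. It is not part of the original report. The image name `page.tif`, the output base names, and the language code `hin` are assumptions (the report does not say which Devanagari traineddata file was used), and the helper names are hypothetical. Current tesseract builds spell the option `--psm`; older builds used `-psm`, so the flag is a parameter.

```python
import shutil
import subprocess


def build_command(image, out_base, psm, lang="hin", psm_flag="--psm"):
    """Return the tesseract argument list for one page segmentation mode.

    lang="hin" is an assumption; substitute the traineddata actually used.
    """
    return ["tesseract", image, out_base, "-l", lang, psm_flag, str(psm)]


def run_modes(image, modes=(3, 4, 6), lang="hin", psm_flag="--psm"):
    """Run tesseract once per mode and return {mode: path of the text output}."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    outputs = {}
    for psm in modes:
        out_base = f"out-psm{psm}"  # tesseract appends .txt to this base name
        subprocess.run(
            build_command(image, out_base, psm, lang, psm_flag), check=True
        )
        outputs[psm] = out_base + ".txt"
    return outputs


if __name__ == "__main__":
    # "page.tif" is a placeholder for the attached input image.
    for mode, path in run_modes("page.tif").items():
        print(f"psm {mode}: {path}")
```

Diffing the resulting text files line by line should show whether the misrecognized line changes only when automatic page segmentation (psm 3 or 4) is used, as described above.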

Original comment by shreeshrii on 31 Oct 2014 at 11:47

Attachments: