patcharats / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Simple test image is recognized wrong #71

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Compile on Linux (Ubuntu 6.06.1)
2. Copy the english files to /usr/share/local/tessdata
3. export TESSDATA_PREFIX='/usr/local/share/'
4. Run tesseract on attached images

What is the expected output? What do you see instead?

The expected output for fnord.tif is 'fnord'. What I get instead: fri:

Running tesseract on TimeStamp.tif (which is converted from a JPEG) gives 
me: \t®\Iwl\lHMT1\\\\\\\\\\\\\\\\\W

Oddly enough, running tesseract on phototest.tif (also attached) works
great.

What version of the product are you using? On what operating system?
2.01 on Ubuntu 6.06.1, fully up-to-date

Original issue reported on code.google.com by pedah...@gmail.com on 2 Oct 2007 at 8:44

GoogleCodeExporter commented 9 years ago
Both fnord.tif and TimeStamp.tif are in 24-bit mode, while
phototest.tif is in 8-bit mode.
Tesseract requires 8-bit mode, so convert the images to 8-bit.

Results on both (8-bit) files with Tesseract:

fnord.tif    : fnorcl
TimeStamp.tif: SEP 27, 2[1[17 [18:22

Original comment by adges...@gmail.com on 29 Oct 2007 at 7:08
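
For readers wondering what converting a 24-bit RGB image to 8-bit involves, here is a minimal sketch in Python. It is not Tesseract's code or any particular tool's algorithm: the function name rgb_to_gray8 is hypothetical, and the Rec. 601 luma weights are an assumption (one common choice; image tools may use other weights or rounding).

```python
def rgb_to_gray8(pixels):
    """Map (R, G, B) triples with 8-bit channels to single 8-bit gray values.

    Hypothetical helper for illustration only. Uses integer Rec. 601 luma
    weights (299, 587, 114 sum to 1000), rounded to the nearest integer.
    """
    gray = []
    for r, g, b in pixels:
        # Weighted sum of the three channels; +500 rounds before the division.
        gray.append((299 * r + 587 * g + 114 * b + 500) // 1000)
    return gray
```

Each pixel goes from three 8-bit samples (24 bits) to one 8-bit sample, which is the layout phototest.tif already has.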

GoogleCodeExporter commented 9 years ago
Try converting your files 'fnord.tif' and 'TimeStamp.tif' to grayscale, like
'phototest.tif'. I had the same wrong results with RGB pictures and good results
after converting them to grayscale. (I noticed this difference in your files
with the 'tiffinfo' command.)

Original comment by filip.ma...@gmail.com on 24 Apr 2008 at 7:46
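
The bit-depth difference found above with tiffinfo can also be checked by reading the TIFF tags directly. This is a rough sketch that looks only at the first image directory (IFD) of the file. The function name tiff_bit_depth is hypothetical; the tag numbers (258 for BitsPerSample, 277 for SamplesPerPixel) and their defaults come from the TIFF 6.0 specification.

```python
import struct


def tiff_bit_depth(data):
    """Return (bits_per_sample, samples_per_pixel) from a TIFF's first IFD.

    `data` is the raw file content as bytes. Hypothetical helper for
    illustration; it ignores every tag except 258 and 277.
    """
    # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")

    (n_entries,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    bits, samples = (1,), 1  # defaults when the tags are absent
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, typ, count = struct.unpack(order + "HHI", data[start:start + 8])
        if tag not in (258, 277) or typ != 3:  # type 3 is SHORT (16-bit)
            continue
        value_field = data[start + 8:start + 12]
        if count * 2 <= 4:
            raw = value_field[:count * 2]  # values stored inline
        else:
            (offset,) = struct.unpack(order + "I", value_field)
            raw = data[offset:offset + count * 2]  # values stored elsewhere
        values = struct.unpack(order + "%dH" % count, raw)
        if tag == 258:
            bits = values
        else:
            samples = values[0]
    return bits, samples
```

It could be called as tiff_bit_depth(open('fnord.tif', 'rb').read()). A 24-bit RGB file would be expected to report (8, 8, 8) with 3 samples per pixel, and an 8-bit grayscale file (8,) with 1.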

GoogleCodeExporter commented 9 years ago
Fixed in 2.04 and current svn.

Original comment by theraysm...@gmail.com on 28 Dec 2008 at 7:29

GoogleCodeExporter commented 9 years ago
This still seems to be a problem on Debian 5.0 and the current head revision in trunk.

Original comment by clj2...@gmail.com on 7 Apr 2009 at 1:13