AmitGorvadiya / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Is there a minimum number of characters required for results? Varying results on 2.04, tessdll, and svn version from 7.30.09 #227

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
Run the tesseract 2.04 executable (installed on 7.3.09), dlltest.exe (from
the 2.04 download), and the SVN version of tesseract from 7.30.09 on the
attached image.

What is the expected output? What do you see instead?
The expected output is '305', which the 2.04 release executable produces
correctly.  However, dlltest outputs a single character "~" along with box
locations that do not make sense.  The latest SVN version prints "blank page"
at the command prompt and writes an empty .txt file.

What version of the product are you using? On what operating system?
Versions listed above (2.04, 7.30.09 SVN, and dlltest with tessdll from the 
2.04 version).  Windows Vista OS.

Please provide any additional information below.
Is there a minimum-number-of-characters variable somewhere?  When I copy
and paste the '305' image into an image with the first two words "This is"
from phototest.tif (see phototest3.tif in attachments), dlltest no longer
outputs just a ~; instead it outputs:
3[33]->[33](10,51)->(37,11)
¤[c2][a4]->[a4](37,52)->(61,11)
$[24]->[24](60,51)->(88,10)
<nl>

T[54]->[54](9,109)->(30,84)
h[68]->[68](30,109)->(46,84)
i[69]->[69](49,109)->(54,84)
s[73]->[73](55,109)->(71,90)

i[69]->[69](82,109)->(87,84)
s[73]->[73](88,109)->(104,90)
<para>

And the SVN version outputs correctly:
305
This is

Original issue reported on code.google.com by bjwim...@gmail.com on 1 Aug 2009 at 1:27
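
The per-character dlltest lines in the report above appear to follow a `char[hex bytes]->[hex bytes](x,y)->(x,y)` pattern. For comparing box locations between runs, a minimal sketch of extracting the character and the two coordinate pairs might look like the following. It assumes that pattern holds for every character line; `CharBox` and `ParseDlltestLine` are hypothetical names used for illustration and are not part of tesseract or dlltest, and the meaning of each coordinate pair is left uninterpreted.

```cpp
#include <cstdio>
#include <string>

// One parsed character line of dlltest output,
// e.g. "T[54]->[54](9,109)->(30,84)". Hypothetical helper type.
struct CharBox {
    std::string text;  // character as printed (may be multi-byte UTF-8)
    int x0, y0;        // first coordinate pair, as printed
    int x1, y1;        // second coordinate pair, as printed
};

// Parses one line into *out. Returns false for lines that do not match the
// assumed pattern, such as "<nl>", "<para>", or blank lines.
bool ParseDlltestLine(const std::string& line, CharBox* out) {
    // Search from index 1 so that a recognized '[' character still parses.
    std::size_t bracket = line.find('[', 1);
    if (bracket == std::string::npos) return false;

    // The coordinates start at the first '(' after the hex-byte section.
    std::size_t paren = line.find('(', bracket);
    if (paren == std::string::npos) return false;

    CharBox box;
    box.text = line.substr(0, bracket);
    if (std::sscanf(line.c_str() + paren, "(%d,%d)->(%d,%d)",
                    &box.x0, &box.y0, &box.x1, &box.y1) != 4) {
        return false;
    }
    *out = box;
    return true;
}
```

Feeding each output line through `ParseDlltestLine` and skipping the lines it rejects yields one `CharBox` per recognized character.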

Attachments:

GoogleCodeExporter commented 9 years ago
On 3.00, try using api.SetPageSegMode(PSM_SINGLE_WORD); after api.Init().
See api/tesseractmain.cpp.
It might not help with dlltest, but it will help with the command-line version.
The dlltest version has a slightly different reject mechanism.

Original comment by theraysm...@gmail.com on 11 Aug 2009 at 8:39
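
The suggestion in the comment above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from api/tesseractmain.cpp: it assumes a Tesseract 3.x-or-later build in which `TessBaseAPI` and `PSM_SINGLE_WORD` live in the `tesseract` namespace, images are loaded through Leptonica, and English traineddata is installed in the default location. The 2009 SVN snapshot discussed in this issue may differ. `RecognizeSingleWord` is a hypothetical helper name.

```cpp
#include <string>

#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>

// Recognizes the image at `path` as a single word and returns the UTF-8 text.
// Returns an empty string if initialization or image loading fails.
std::string RecognizeSingleWord(const char* path) {
    tesseract::TessBaseAPI api;

    // NULL datapath: rely on the default tessdata location (assumption).
    if (api.Init(NULL, "eng") != 0) {
        return std::string();
    }

    // Set the page segmentation mode after Init(), as the comment suggests.
    api.SetPageSegMode(tesseract::PSM_SINGLE_WORD);

    Pix* image = pixRead(path);
    if (image == NULL) {
        api.End();
        return std::string();
    }
    api.SetImage(image);

    // GetUTF8Text() returns a heap-allocated buffer the caller must delete[].
    char* text = api.GetUTF8Text();
    std::string result = text ? text : "";
    delete[] text;

    api.End();
    pixDestroy(&image);
    return result;
}
```

Calling `RecognizeSingleWord` with the path of a short image such as the '305' sample would exercise the single-word mode. Whether it recovers the expected text depends on the Tesseract version and the image.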

GoogleCodeExporter commented 9 years ago

Original comment by theraysm...@gmail.com on 19 May 2010 at 11:13

GoogleCodeExporter commented 9 years ago
By the way, thanks for your help and answer.  :)

Original comment by bjwim...@gmail.com on 19 May 2010 at 11:28