Chenhx / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

dlltest.exe seems less accurate than tesseract.exe? #278

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Cut a word out of an image (e.g. "environment").
2. Recognize it with dlltest.exe.
3. The result is not very accurate ("cnviromncnt"), but tesseract.exe 
recognizes the same image correctly.

What is the expected output? What do you see instead?
The expected output is "environment", but we get "cnviromncnt".

What version of the product are you using? On what operating system?
2.04, Windows XP

Please provide any additional information below.
  We need not only the text file but also the position (coordinates) of each 
word, so that we can find the word we really want. Only dlltest.exe can 
output this information.
  How can we improve its accuracy? 
  Or can tesseract.exe output the box coordinates too?

  Please give us some suggestions. Thank you VERY MUCH!

Original issue reported on code.google.com by zor...@163.com on 25 Jan 2010 at 8:43

GoogleCodeExporter commented 9 years ago
The tessdll API is deprecated. Use TessBaseAPI in 3.00 instead.
If you want word boxes, you can use the newly added hOCR output. See 
GetHOCRText.
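
Once you have the hOCR text, the word coordinates can be pulled out of it with a small parser. Below is a minimal sketch in Python, not part of Tesseract itself. It assumes each word is a span with class ocr_word or ocrx_word whose title attribute contains "bbox x0 y0 x1 y1"; the exact markup may differ between versions, so check it against your own output. The names WordBoxParser, word_boxes and SAMPLE are made up for illustration, and the sample markup is hand-written, not real Tesseract output.

```python
import re
from html.parser import HTMLParser

# Assumed hOCR convention: the title attribute carries "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")
# Assumed class names for word-level spans; adjust to match your output.
WORD_CLASSES = {"ocr_word", "ocrx_word"}


class WordBoxParser(HTMLParser):
    """Collects (text, (x0, y0, x1, y1)) for every word span with a bbox."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._box = None   # bbox of the word span currently open, if any
        self._depth = 0    # nesting depth of spans inside that word span
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._box is not None:
            # A span nested inside the current word (e.g. a confidence span).
            self._depth += 1
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        match = BBOX_RE.search(attrs.get("title") or "")
        if classes & WORD_CLASSES and match:
            self._box = tuple(int(v) for v in match.groups())
            self._depth = 1
            self._text = []

    def handle_endtag(self, tag):
        if tag != "span" or self._box is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # The word span itself has closed: record text and box.
            self.words.append(("".join(self._text).strip(), self._box))
            self._box = None

    def handle_data(self, data):
        if self._box is not None:
            self._text.append(data)


def word_boxes(hocr):
    """Return a list of (word, bbox) pairs found in an hOCR string."""
    parser = WordBoxParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hand-written sample covering two assumed layouts (nested and flat spans).
SAMPLE = (
    "<span class='ocr_line' id='line_1' title=\"bbox 10 20 300 50\">"
    "<span class='ocr_word' id='word_1' title=\"bbox 10 20 110 50\">"
    "<span class='ocrx_word' id='xword_1' title=\"x_wconf -2\">environment</span>"
    "</span> "
    "<span class='ocrx_word' id='word_2' "
    "title='bbox 120 20 200 50; x_wconf 90'>works</span>"
    "</span>"
)

if __name__ == "__main__":
    for word, box in word_boxes(SAMPLE):
        print(word, box)
```

To find a particular word, filter the returned list by its text and read the coordinates from the matching entry.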

Original comment by theraysm...@gmail.com on 20 May 2010 at 3:48

GoogleCodeExporter commented 9 years ago
Issue 277 has been merged into this issue.

Original comment by theraysm...@gmail.com on 20 May 2010 at 3:48