Closed GoogleCodeExporter closed 8 years ago
I've tried two versions of eng.traineddata. One is a 1.8MB file provided by
someone in the issues above, which gives the result "he\\o"; the other is a
3MB file from http://code.google.com/p/tesseract-ocr/downloads/list (resource
"tesseract-3.01-doc-html.tar.gz"), which gives the result "heHo".
Original comment by xyxzfj@gmail.com
on 12 May 2012 at 12:14
Today I tested testGetUTF8Text again, using the 2.96MB (or 3MB) version of
eng.traineddata. The test against "hello" surprisingly passes now (I haven't
made any changes to the code). But the test against "hello,world!
0123456789/*-+" fails. Below is the stack trace:
junit.framework.AssertionFailedError: "he||0,w0r|d! 0123456789/<X--+" != "hello,world! 0123456789/*-+"
at com.googlecode.tesseract.android.test.TessBaseAPITest.testGetUTF8Text(TessBaseAPITest.java:123)
at java.lang.reflect.Method.invokeNative(Native Method)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:169)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:154)
at android.test.InstrumentationTestRunner.onStart(InstrumentationTestRunner.java:529)
at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1448)
I've also attached a generated image file of the string "hello,world!
0123456789/*-+".
Original comment by xyxzfj@gmail.com
on 15 May 2012 at 3:36
One more thing: if I use "hello, world! 0123456789/*-+" (note that there is
now a space after the comma), the recognized string is "hello,
world! 0123456789/<X--+". That is pretty accurate, although not perfect. Maybe
my font is not distinctive enough or not large enough.
Original comment by xyxzfj@gmail.com
on 15 May 2012 at 6:49
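To put a number on "pretty accurate, although not perfect", one option is to measure the edit distance between the expected and the recognized string instead of only checking exact equality. The following is a minimal sketch in plain Java; the class name OcrDistance and the method levenshtein are hypothetical and are not part of the project's test suite or of Tesseract.

```java
// Hypothetical helper, not part of tesseract-android-tools.
class OcrDistance {

    // Classic dynamic-programming Levenshtein distance:
    // each insertion, deletion, or substitution costs 1.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Strings taken from the comment above (the variant with a space after the comma).
        String expected = "hello, world! 0123456789/*-+";
        String recognized = "hello, world! 0123456789/<X--+";
        System.out.println(levenshtein(expected, recognized));
    }
}
```

A test could then assert that the distance stays under some threshold chosen for the font and image size in use. Whether a tolerance like that is appropriate for this test is a judgment call; the existing test compares for exact equality.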
The test is stable when using the data file in the updated README.
Original comment by alanv@google.com
on 11 Sep 2012 at 8:38
Original issue reported on code.google.com by
xyxzfj@gmail.com
on 12 May 2012 at 11:41