testGetUTF8Text failed, AssertionFailedError: "he\\o" != "hello"

GoogleCodeExporter commented 8 years ago

What steps will reproduce the problem?
1. Run TessBaseAPITest on a real device.

What is the expected output? What do you see instead?
The test is expected to pass, with "hello" recognized correctly.

What version of the product are you using? On what operating system?
The latest version of the project tesseract-android-tools from 
http://code.google.com/p/tesseract-android-tools, built with android-ndk-r8, 
target Android 2.3.3, on a device running Android 2.3.5.

Please provide any additional information below.

testGetUTF8Text failed with the following stack trace:

junit.framework.AssertionFailedError: "he\\o" != "hello"
at com.googlecode.tesseract.android.test.TessBaseAPITest.testGetUTF8Text(TessBaseAPITest.java:120)
at java.lang.reflect.Method.invokeNative(Native Method)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:169)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:154)
at android.test.InstrumentationTestRunner.onStart(InstrumentationTestRunner.java:529)
at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1448)

Original issue reported on code.google.com by xyxzfj@gmail.com on 12 May 2012 at 11:41

GoogleCodeExporter commented 8 years ago

I've tried two versions of eng.traineddata. One is a 1.8MB file provided by 
someone in the issues above, which gives the result "he\\o"; the other is 3MB, 
from http://code.google.com/p/tesseract-ocr/downloads/list, resource 
"tesseract-3.01-doc-html.tar.gz", which gives the result "heHo".

Original comment by xyxzfj@gmail.com on 12 May 2012 at 12:14

GoogleCodeExporter commented 8 years ago

Today I ran testGetUTF8Text again, using the 2.96MB or 3MB version of 
eng.traineddata. Surprisingly, the test against "hello" now passes (I haven't 
changed any code). But the test against "hello,world! 0123456789/*-+" fails. 
Below is the stack trace:
junit.framework.AssertionFailedError: "he||0,w0r|d! 0123456789/<X--+" != "hello,world! 0123456789/*-+"
at com.googlecode.tesseract.android.test.TessBaseAPITest.testGetUTF8Text(TessBaseAPITest.java:123)
at java.lang.reflect.Method.invokeNative(Native Method)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:169)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:154)
at android.test.InstrumentationTestRunner.onStart(InstrumentationTestRunner.java:529)
at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1448)

I've also attached the generated image of the string 
"hello,world! 0123456789/*-+".

Original comment by xyxzfj@gmail.com on 15 May 2012 at 3:36

Attachments:

bmp.jpg

GoogleCodeExporter commented 8 years ago

One more thing: if I use "hello, world! 0123456789/*-+" (note that there is 
now a space after the comma), the recognized string is 
"hello, world! 0123456789/<X--+". That is pretty accurate, although not 
perfect. Maybe my font is not distinct enough or not large enough.
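
To put a number on "pretty accurate, although not perfect", one option is to count the character edits between the recognized and expected strings instead of relying only on an exact-match assertion. The following is a minimal sketch, assuming plain Levenshtein distance is an acceptable measure; the class name OcrDistance and the method levenshtein are hypothetical and are not part of tesseract-android-tools or TessBaseAPITest.

```java
public class OcrDistance {
    /**
     * Levenshtein edit distance: the minimum number of single-character
     * insertions, deletions, and substitutions needed to turn a into b.
     * Hypothetical helper for illustration only.
     */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a yields the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String expected = "hello, world! 0123456789/*-+";
        String recognized = "hello, world! 0123456789/<X--+";
        System.out.println(levenshtein(recognized, expected));
    }
}
```

With the strings from this thread, the distance is 2 for "he\\o" against "hello" and 3 for the spaced "hello, world! ..." result, so the remaining errors in the latter are confined to the "/*-+" tail.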

Original comment by xyxzfj@gmail.com on 15 May 2012 at 6:49

GoogleCodeExporter commented 8 years ago

Test is stable using data file in updated README.

Original comment by alanv@google.com on 11 Sep 2012 at 8:38

Changed state: Fixed
