khjde1207 / tesseract_ocr

Tesseract OCR for flutter
BSD 3-Clause "New" or "Revised" License
64 stars, 31 forks

OCR fails miserably when trying to scan anything (Android) #18

TheLastGimbus closed 2 years ago

TheLastGimbus commented 3 years ago

After my issues with Java exceptions (#13) and re-using tesseract (#4) got resolved, I went ahead and ran flutter pub upgrade flutter_tesseract_ocr, which moved the dependency to commit d5cd0ac005aa18bfe3cf2ec1f92ae333f50d1387.

But now, when I run any scan (FlutterTesseractOcr.extractText()), I get this gigantic, terrifying C++ panic:

I/Tesseract(native)(19356): Initialized Tesseract API with language=eng
F/libc    (19356): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 19498 (Thread-5), pid 19356 (e.dotmeme.debug)
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
LineageOS Version: '17.1-20210925-NIGHTLY-sailfish'
Build fingerprint: 'google/sailfish/sailfish:8.1.0/OPM1.171019.021/4565141:user/release-keys'
Revision: '0'
ABI: 'arm64'
Timestamp: 2021-10-02 14:50:05+0200
pid: 19356, tid: 19498, name: Thread-5  >>> com.example.myapp.debug <<<
uid: 10282
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
    x0  0000000000000000  x1  0000000000004c2a  x2  0000000000000006  x3  00000070efb74b90
    x4  fefeff09302f301f  x5  fefeff09302f301f  x6  fefeff09302f301f  x7  7f7f7f7f7f7f7f7f
    x8  00000000000000f0  x9  e0ecaa9908742a40  x10 fffffff0fffffbdf  x11 0000000000000000
    x12 00000070efb74c50  x13 0000000000000010  x14 000000723ed1d40a  x15 00000070efb74c50
    x16 000000723ed238b8  x17 000000723ed00a20  x18 00000070ef27c000  x19 00000000000000ac
    x20 0000000000004b9c  x21 00000000000000b2  x22 0000000000004c2a  x23 00000000ffffffff
    x24 00000070efb77020  x25 00000070efb75590  x26 0000000000000000  x27 0000000000000000
    x28 00000070efb77020  x29 00000070efb74c40
    sp  00000070efb74b70  lr  000000723ecb244c  pc  000000723ecb246c
backtrace:
      #00 pc 000000000008246c  /apex/com.android.runtime/lib64/bionic/libc.so (abort+160) (BuildId: 8075b859a71b22c56a77e3c06a01d27d)
      #01 pc 000000000018e070  /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/lib/arm64/libtesseract.so (ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const+376) (BuildId: 0cacc3ca42dd4e1a324de02484b5e6450c729bc9)
      #02 pc 0000000000119ccc  /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/lib/arm64/libtesseract.so (tesseract::Tesseract::SegmentPage(STRING const*, BLOCK_LIST*, tesseract::Tesseract*, OSResults*)+136) (BuildId: 0cacc3ca42dd4e1a324de02484b5e6450c729bc9)
      #03 pc 00000000000e2388  /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/lib/arm64/libtesseract.so (tesseract::TessBaseAPI::FindLines()+652) (BuildId: 0cacc3ca42dd4e1a324de02484b5e6450c729bc9)
      #04 pc 00000000000e2864  /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/lib/arm64/libtesseract.so (tesseract::TessBaseAPI::Recognize(ETEXT_DESC*)+56) (BuildId: 0cacc3ca42dd4e1a324de02484b5e6450c729bc9)
      #05 pc 00000000000e163c  /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/lib/arm64/libtesseract.so (tesseract::TessBaseAPI::GetUTF8Text()+60) (BuildId: 0cacc3ca42dd4e1a324de02484b5e6450c729bc9)
      #06 pc 00000000002a4038  /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/lib/arm64/libtesseract.so (Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetUTF8Text+64) (BuildId: 0cacc3ca42dd4e1a324de02484b5e6450c729bc9)
      #07 pc 000000000013f350  /apex/com.android.runtime/lib64/libart.so (art_quick_generic_jni_trampoline+144) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #08 pc 0000000000136334  /apex/com.android.runtime/lib64/libart.so (art_quick_invoke_stub+548) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #09 pc 0000000000145064  /apex/com.android.runtime/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+244) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #10 pc 00000000002e343c  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x2ac000) (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+384) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #11 pc 00000000002de334  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x2ac000) (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+928) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #12 pc 00000000005a4334  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x4b7000) (MterpInvokeDirect+400) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #13 pc 0000000000130914  /apex/com.android.runtime/lib64/libart.so (mterp_op_invoke_direct+20) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #14 pc 0000000000300dd4  [anon:dalvik-classes.dex extracted in memory from /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/base.apk] (com.googlecode.tesseract.android.TessBaseAPI.getUTF8Text+12)
      #15 pc 00000000005a227c  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x4b7000) (MterpInvokeVirtual+1456) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #16 pc 0000000000130814  /apex/com.android.runtime/lib64/libart.so (mterp_op_invoke_virtual+20) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #17 pc 000000000032d1d2  [anon:dalvik-classes.dex extracted in memory from /data/app/com.example.myapp.debug-kcCOQ9nmqoiOwHZShpmypw==/base.apk] (io.paratoner.flutter_tesseract_ocr.MyRunnable.run+54)
      #18 pc 00000000005a3a9c  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x4b7000) (MterpInvokeInterface+1764) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #19 pc 0000000000130a14  /apex/com.android.runtime/lib64/libart.so (mterp_op_invoke_interface+20) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #20 pc 00000000000e4a6c  /apex/com.android.runtime/javalib/core-oj.jar (java.lang.Thread.run+8)
      #21 pc 00000000002b4380  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x2ac000) (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEbb.llvm.16107106113732464516+240) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #22 pc 00000000005934a4  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x4b7000) (artQuickToInterpreterBridge+944) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #23 pc 000000000013f468  /apex/com.android.runtime/lib64/libart.so (art_quick_to_interpreter_bridge+88) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #24 pc 0000000000136334  /apex/com.android.runtime/lib64/libart.so (art_quick_invoke_stub+548) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #25 pc 0000000000145064  /apex/com.android.runtime/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+244) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #26 pc 00000000004b1938  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x3ef000) (art::(anonymous namespace)::InvokeWithArgArray(art::ScopedObjectAccessAlreadyRunnable const&, art::ArtMethod*, art::(anonymous namespace)::ArgArray*, art::JValue*, char const*)+104) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #27 pc 00000000004b2a08  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x3ef000) (art::InvokeVirtualOrInterfaceWithJValues(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, _jmethodID*, jvalue const*)+416) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #28 pc 00000000004f2f70  /apex/com.android.runtime/lib64/libart.so!libart.so (offset 0x4b7000) (art::Thread::CreateCallback(void*)+1172) (BuildId: bbe36f631af1f00437df5aee32be1bc0)
      #29 pc 00000000000e4a20  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+36) (BuildId: 8075b859a71b22c56a77e3c06a01d27d)
      #30 pc 0000000000084004  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 8075b859a71b22c56a77e3c06a01d27d)

Lost connection to device.

...and the whole app crashes (obviously).

khjde1207 commented 2 years ago

I also get an error. I can't find a solution.

khjde1207 commented 2 years ago

I found the cause: the traineddata URL used the wrong branch. "https://github.com/tesseract-ocr/tessdata/raw/master/${leng}.traineddata" => "https://github.com/tesseract-ocr/tessdata/raw/main/${leng}.traineddata". I will edit the URL.
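For anyone hitting the same abort before picking up the fix, here is a minimal Java sketch of the idea, not the plugin's actual code. It builds the corrected URL and does a crude size check on the file before it is handed to Tesseract, on the assumption that a failed download leaves a missing or suspiciously small file. The class name `TrainedDataCheck`, both method names, and the `minBytes` threshold are hypothetical.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of flutter_tesseract_ocr. */
final class TrainedDataCheck {
    // Branch name ("main") taken from the corrected URL quoted in the comment above.
    static final String BASE_URL = "https://github.com/tesseract-ocr/tessdata/raw/main/";

    /** Builds the URL for a language code such as "eng". */
    static String trainedDataUrl(String lang) {
        return BASE_URL + lang + ".traineddata";
    }

    /**
     * Crude sanity check: the file must exist and be at least minBytes long.
     * This only catches missing or truncated files; it does not validate
     * the traineddata format itself.
     */
    static boolean looksUsable(File file, long minBytes) {
        return file.isFile() && file.length() >= minBytes;
    }
}
```

A caller could refuse to start OCR (and re-download instead) when `looksUsable` returns false, rather than letting the native library abort the process.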

BernardinD commented 2 years ago

I'm getting the same error when I try to run extractText() inside Future.forEach(). I have the trained data saved locally for faster runtime, so I'm not sure what the issue is. I would have thought that reading the data is thread-safe.
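One way to test the thread-safety suspicion is to force all recognitions through a single worker thread and see whether the crash persists. Note that Dart's Future.forEach already waits for each element's future before moving on, so calls that are awaited inside it should not overlap on the Dart side. The backtrace above shows the plugin doing its work on a separate Java thread (MyRunnable.run), so the sketch below serializes at that level. It is only an illustration: `SerialOcr` and the `recognize` function are hypothetical stand-ins, not the plugin's API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical wrapper for illustration; not part of flutter_tesseract_ocr. */
final class SerialOcr {
    // A single worker thread guarantees that at most one recognition runs at a time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Stand-in for whatever actually runs OCR on an image path and returns text.
    private final Function<String, String> recognize;

    SerialOcr(Function<String, String> recognize) {
        this.recognize = recognize;
    }

    /** Queues one recognition; tasks run in submission order, never concurrently. */
    Future<String> submit(String imagePath) {
        Callable<String> task = () -> recognize.apply(imagePath);
        return worker.submit(task);
    }

    /** Stops accepting new work; already queued tasks still finish. */
    void shutdown() {
        worker.shutdown();
    }
}
```

If the abort still occurs with calls serialized like this, concurrency is unlikely to be the cause, and the trained data file or the input image would be the next things to check.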