rmtheis / tess-two

Fork of Tesseract Tools for Android
Apache License 2.0

Native crash when 'vert'.tessdata is used #263

Closed air-hedgehog closed 5 years ago

air-hedgehog commented 5 years ago

Summary: when trying to use one of the following:

a native crash occurs with the following backtrace:

`Build fingerprint: 'xiaomi/mido/mido:7.0/NRD90M/V10.2.3.0.NCFMIXM:user/release-keys'
Revision: '0'
ABI: 'arm64'
pid: 29296, tid: 29655, name: DefaultDispatch >>> com.akimchenko.antony.mediocr <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xc
    x0  0000000000000000  x1  0000000000000000  x2  0000000000000000  x3  0000007f8934d4e0
    x4  0000000000000000  x5  0000000000000000  x6  0000000000000000  x7  0000000000000000
    x8  7f7f7f7f7f7f7f7f  x9  00000000007959e8  x10 0101010101010101  x11 0000000000000020
    x12 0000007f79b4ca88  x13 0000000000000020  x14 000000000000000c  x15 2e8ba2e8ba2e8ba3
    x16 0000007f7519fa00  x17 0000007f9973439c  x18 000000000000000f  x19 0000007f763072f0
    x20 0000007f76131000  x21 0000007f751a0000  x22 00000000ffffffff  x23 0000007f79b69f80
    x24 0000007f76133000  x25 a4ad591e860ad059  x26 0000007f763074e0  x27 a4ad591e860ad059
    x28 0000000000000001  x29 0000007f76307608  x30 0000007f74ed4b88
    sp  0000007f76307240  pc  0000007f74ed4d64  pstate 0000000060000000
backtrace:
    #00 pc 00000000000a1d64  /data/app/com.akimchenko.antony.mediocr-2/lib/arm64/libtess.so (_ZN9tesseract9Tesseract15recog_all_wordsEP8PAGE_RESP10ETEXT_DESCPK4TBOXPKci+1304)
    #01 pc 000000000008ccf4  /data/app/com.akimchenko.antony.mediocr-2/lib/arm64/libtess.so (_ZN9tesseract11TessBaseAPI9RecognizeEP10ETEXT_DESC+632)
    #02 pc 000000000008dcb0  /data/app/com.akimchenko.antony.mediocr-2/lib/arm64/libtess.so (_ZN9tesseract11TessBaseAPI11GetUTF8TextEP10ETEXT_DESC+312)
    #03 pc 000000000022e858  /data/app/com.akimchenko.antony.mediocr-2/lib/arm64/libtess.so (Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetUTF8Text+148)
    #04 pc 00000000008963f0  /data/app/com.akimchenko.antony.mediocr-2/oat/arm64/base.odex (offset 0x862000)`

Steps to reproduce the issue:

  1. Choose any "_vert.tessdata" language
  2. Try to recognize text in any input picture

Expected result: recognized vertically aligned text

Actual result: crash

Tess-two version: tested on 5.4.1 and 9.0.0

Android version: 7.0

Phone/device model: Xiaomi Redmi NOTE 4

Phone/device architecture: arm64

Link to training data used: https://github.com/tesseract-ocr/tessdata

rmtheis commented 5 years ago

The *vert trained data files are for Tesseract 4.0, which isn't yet supported (see #196). This project currently only works with the trained data files linked in the pre-requisites section.
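Until Tesseract 4.0 data is supported, an app could refuse such files up front instead of letting recognition crash natively. The following is only a rough sketch, not part of tess-two: `TrainedDataGuard` and its methods are hypothetical names, and it assumes the vertical models can be recognized by a `_vert` suffix on the language code, as in the file names of the tessdata repository.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of tess-two.
class TrainedDataGuard {

    // Assumption: vertical-text models are named "<lang>_vert".
    static boolean isVerticalModel(String language) {
        return language != null && language.endsWith("_vert");
    }

    // Tesseract language specs can join several codes with '+',
    // e.g. "eng+chi_sim_vert". Returns the codes that look like
    // vertical models, in the order they appear.
    static List<String> unsupportedLanguages(String languageSpec) {
        List<String> rejected = new ArrayList<>();
        if (languageSpec == null) {
            return rejected;
        }
        for (String part : languageSpec.split("\\+")) {
            String lang = part.trim();
            if (isVerticalModel(lang)) {
                rejected.add(lang);
            }
        }
        return rejected;
    }
}
```

A caller might run `TrainedDataGuard.unsupportedLanguages(language)` before `TessBaseAPI.init(datapath, language)` and show an error message if the returned list is not empty.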