rmtheis / tess-two

Fork of Tesseract Tools for Android
Apache License 2.0

need help for tesseract 4 #198

Closed amin1985 closed 7 years ago

amin1985 commented 7 years ago

It has been almost four months since Tesseract 4 was released, and I really need version 4 on Android. Why not work together to get it working? Is anybody interested?

I have tried to compile Tesseract 4 based on your project. If I set the engine mode to 0, 2, 3, 4, or 5, it works perfectly with the old engine (using version 4 traineddata files). But if I set the engine mode to OEM_LSTM_ONLY (1), it crashes with the error "fatal signal 11 (sigsegv), code 1, fault addr 0x0 in tid 22027" at the line "tessapi.getUTF8Text();". If the input image is a solid color it doesn't crash, but if even one pixel differs from the others, it crashes.

So I don't know whether the failure is in Leptonica or in Tesseract. If it is a Leptonica failure, why does it work with the other engine modes? Don't the other engine modes use Leptonica too?

I have tried a clean black-and-white image (a screenshot of text on a PC) and changing the Bitmap.Config of the input image, but neither helped.

I'm new to the NDK, and it is hard to debug a complex project like Tesseract with Android Studio. The other options are Visual Studio and VisualGDB, but I have failed to get them working.

Has anyone debugged a native APK with Visual Studio who could help me?

I have tried to debug it with Android Studio; here is the result.

I ran into this error:

SIGSEGV (signal SIGSEGV: invalid address (fault address: 0x0))

It comes from this line of code in mainactivity.java:

txt = tessapi.getUTF8Text();

I traced the error by adding log calls such as "__android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "some text or values here");". If the log output stops at some point, something went wrong right after that point.
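The checkpoint-logging approach described above can be wrapped in a small helper so each trace line carries its file and line number. This is only a sketch: `kTraceTag`, `FormatCheckpoint`, `TraceCheckpoint`, and `TRACE` are hypothetical names, not part of Tesseract or tess-two, and the non-Android branch exists only so the helper can be tried on a desktop build. On Android it assumes the library is linked against `liblog`.

```cpp
#include <cstdio>
#include <string>
#ifdef __ANDROID__
#include <android/log.h>
#endif

// Hypothetical log tag; the original post uses a macro named APPNAME instead.
static const char* kTraceTag = "TessTrace";

// Builds the text of one checkpoint, e.g. "baseapi.cpp:42 stage 0".
inline std::string FormatCheckpoint(const char* file, int line,
                                    const char* label) {
  char buf[256];
  std::snprintf(buf, sizeof(buf), "%s:%d %s", file, line, label);
  return std::string(buf);
}

// Emits one checkpoint and returns the text that was logged.
// The last checkpoint visible in the log is the last one reached before a crash.
inline std::string TraceCheckpoint(const char* file, int line,
                                   const char* label) {
  const std::string text = FormatCheckpoint(file, line, label);
#ifdef __ANDROID__
  __android_log_print(ANDROID_LOG_VERBOSE, kTraceTag, "%s", text.c_str());
#else
  // Desktop fallback (assumption: stderr is good enough off-device).
  std::fprintf(stderr, "[%s] %s\n", kTraceTag, text.c_str());
  std::fflush(stderr);
#endif
  return text;
}

// Convenience macro that fills in the call site automatically.
#define TRACE(label) TraceCheckpoint(__FILE__, __LINE__, label)
```

Placing `TRACE("stage 0");` before a suspicious call and `TRACE("stage 1");` after it narrows the crash down to the call in between, which is the same idea as the hand-written log lines in this post.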

in baseapi.cpp

    char* TessBaseAPI::GetUTF8Text() {
      __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "stage 0");
      if (tesseract_ == NULL ||
          (!recognition_done_ && Recognize(NULL) < 0))  // error
      {
        __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "stage 1.5 null");
        return NULL;
      }

==>

int TessBaseAPI::Recognize(ETEXT_DESC* monitor) {  // error

==>

if (tesseract_->recog_all_words(page_res_, monitor, NULL, NULL, 0)) {  // error



==>

**now control.cpp**

    // Run pass 1 word recognition.
    if (!RecogAllWordsPassN(1, monitor, &page_res_it, &words))
    {

==>

     classify_word_and_language(pass_n, pr_it, word);

     ==>

most_recently_used_->RetryWithLanguage(
      *word_data, recognizer, debug, &word_data->lang_words[sub], &best_words);

==>

**In the end, it seems this code produces the error:**

__android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "RetryWithLanguage 3");
(this->*recognizer)(word_data, in_word, &new_words);

==>recognizer is WordRecognizer

typedef void (Tesseract::*WordRecognizer)(const WordData& word_data,
                                          WERD_RES** in_word,
                                          PointerVector<WERD_RES>* out_words);
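Because `recognizer` is a pointer to a member function, the expression `(this->*recognizer)(...)` is an ordinary synchronous call into whichever method the pointer holds, and anything that method calls runs before the expression returns. The sketch below illustrates this with a toy class; `Engine`, `ClassifyPass1`, `MatchWord`, and `Retry` are hypothetical stand-ins and not Tesseract's real types or signatures.

```cpp
#include <string>
#include <vector>

class Engine {
 public:
  // Same shape as a WordRecognizer typedef: a pointer to a member function.
  typedef void (Engine::*WordRecognizer)(const std::string& word);

  // One possible target of the pointer; it calls further into the class.
  void ClassifyPass1(const std::string& word) {
    calls.push_back("pass1:" + word);
    MatchWord(word);
  }

  // Dispatches through the pointer, like RetryWithLanguage does in the post.
  void Retry(WordRecognizer recognizer, const std::string& word) {
    calls.push_back("retry");
    (this->*recognizer)(word);  // Runs ClassifyPass1 when that is passed in.
  }

  // Records the order in which the methods ran.
  std::vector<std::string> calls;

 private:
  void MatchWord(const std::string& word) { calls.push_back("match:" + word); }
};
```

Calling `engine.Retry(&Engine::ClassifyPass1, "abc")` records `retry`, then `pass1:abc`, then `match:abc`. Under that reading, log output from deeper functions appearing after the last message printed before the dispatch is what one would expect, with no asynchronous behavior involved.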

The last logcat text was "RetryWithLanguage 3", so this call, (this->*recognizer)(word_data, in_word, &new_words);, seemed to be what produces the error.

But then I found out by accident that the code below does run and is somehow triggered from (this->*recognizer)(word_data, in_word, &new_words); or maybe it is asynchronous, or it is a logcat issue...

void Tesseract::match_word_pass_n(int pass_n, WERD_RES *word,
                                  ROW *row, BLOCK* block) {

and this also runs:
void Tesseract::classify_word_pass1(const WordData& word_data,
                                    WERD_RES** in_word,
                                    PointerVector<WERD_RES>* out_words) {
                                ... 

    match_word_pass_n(1, word, row, block);  // error

 void Tesseract::match_word_pass_n(int pass_n, WERD_RES *word,
                                  ROW *row, BLOCK* block) {

    tess_segment_pass_n(pass_n, word);  // error

It seems the chain of errors goes on forever.

There is something in classify_word_pass1 that may be producing this error:

    void Tesseract::classify_word_pass1(const WordData& word_data,
                                        WERD_RES** in_word,
                                        PointerVector<WERD_RES>* out_words) {

This part is not included in the Android build, and I don't know why it was removed (maybe LSTM is not compatible with Android yet!):

    #ifndef ANDROID_BUILD
      if (tessedit_ocr_engine_mode == OEM_LSTM_ONLY ||
          tessedit_ocr_engine_mode == OEM_TESSERACT_LSTM_COMBINED) {
        if (!(*in_word)->odd_size || tessedit_ocr_engine_mode == OEM_LSTM_ONLY) {
          LSTMRecognizeWord(*block, row, *in_word, out_words);
          if (!out_words->empty())
            return;  // Successful lstm recognition.
        }
        if (tessedit_ocr_engine_mode == OEM_LSTM_ONLY) {
          // No fallback allowed, so use a fake.
          (*in_word)->SetupFake(lstm_recognizer_->GetUnicharset());
          return;
        }
        // Fall back to tesseract for failed words or odd words.
        (*in_word)->SetupForRecognition(unicharset, this, BestPix(),
                                        OEM_TESSERACT_ONLY, NULL,
                                        classify_bln_numeric_mode,
                                        textord_use_cjk_fp_model,
                                        poly_allow_detailed_fx, row, block);
      }
    #endif
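To make the effect of that guard concrete, here is a simplified model of it. When the guarding macro is defined, the preprocessor removes the LSTM branch entirely, so a request for the LSTM-only mode falls through to the code after the guard. Everything in the sketch is hypothetical (`SKETCH_ANDROID_BUILD`, `EngineMode`, `ClassifyWord`); it models only the control flow and does not establish that this is the cause of the crash.

```cpp
#include <string>

// Hypothetical stand-ins for the engine mode constants.
enum EngineMode { kLegacyOnly = 0, kLstmOnly = 1 };

// Compile with -DSKETCH_ANDROID_BUILD to mimic a build that strips the branch.
inline std::string ClassifyWord(EngineMode mode) {
#ifndef SKETCH_ANDROID_BUILD
  if (mode == kLstmOnly) {
    return "lstm";  // Handled by the LSTM path; the legacy path is skipped.
  }
#endif
  // With the guard active, even kLstmOnly requests end up here.
  return "legacy";
}
```

Without the macro, `ClassifyWord(kLstmOnly)` returns `"lstm"`; with it, the same call returns `"legacy"`. If the real library is built with ANDROID_BUILD defined, an LSTM-only request would similarly continue into the legacy classifier, so checking how that macro is set in the build files seems like a reasonable next step.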