nam-tran-niteco / tesseract-android-tools

Automatically exported from code.google.com/p/tesseract-android-tools

Problem : Using Hindi Traindata for Tesseract OCR 3.00 #17

Closed. GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
I am using Tesseract 3.00. It works correctly with the English language
traineddata, but not with any Indian languages such as Hindi and Tamil.
I get the following error:

actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed in file tessdatamanager.cpp line 55  (Segmentation Fault).

This error generally happens when the traineddata is of a different version than
Tesseract, but I am using the version 3.00 Hindi traineddata with Tesseract 3.00
and still get this error.
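
One way to check for a version mismatch outside the app is to read the entry count that the failing assert compares against TESSDATA_NUM_ENTRIES. The sketch below is only an illustration under stated assumptions: it assumes the traineddata file begins with a 32-bit little-endian entry count, and the helper names `read_num_entries` and `looks_compatible`, as well as the `max_entries` parameter, are hypothetical. The right value for `max_entries` has to come from the TESSDATA_NUM_ENTRIES definition in the Tesseract source you actually built.

```python
import struct


def read_num_entries(header: bytes) -> int:
    """Return the entry count stored in the first 4 bytes of the header.

    Assumption: the count is a signed 32-bit little-endian integer.
    A file written with the opposite byte order would show up here
    as an implausibly large or negative number.
    """
    if len(header) < 4:
        raise ValueError("need at least 4 bytes of the file header")
    (count,) = struct.unpack("<i", header[:4])
    return count


def looks_compatible(header: bytes, max_entries: int) -> bool:
    """Mirror the assert: the file's count must not exceed the build's limit.

    max_entries is supplied by the caller (hypothetical parameter); it
    should match TESSDATA_NUM_ENTRIES in the Tesseract build being used.
    """
    count = read_num_entries(header)
    return 0 <= count <= max_entries
```

To try it on a real file, pass in the first four bytes, for example `looks_compatible(open(path, "rb").read(4), max_entries)`, where `path` points at the Hindi traineddata file. If this returns False, the file declares more entries than the build expects, which is the condition the assert rejects.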

The exact problem shown in my logcat is:

09-13 06:10:50.447: INFO/DEBUG(31): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
09-13 06:10:50.447: INFO/DEBUG(31): Build fingerprint: 'generic/sdk/generic/:2.2/FRF91/43546:eng/test-keys'
09-13 06:10:50.447: INFO/DEBUG(31): pid: 278, tid: 278  >>> com.artoo.ocr.test <<<
09-13 06:10:50.454: INFO/DEBUG(31): signal 11 (SIGSEGV), fault addr 00000000
09-13 06:10:50.454: INFO/DEBUG(31):  r0 000000c4  r1 afd40328  r2 00000003  r3 00000000
09-13 06:10:50.454: INFO/DEBUG(31):  r4 80d870ec  r5 bec5a374  r6 00000002  r7 bec5a437
09-13 06:10:50.454: INFO/DEBUG(31):  r8 bec5a3a9  r9 80d5149c  10 0000048c  fp 00000000
09-13 06:10:50.454: INFO/DEBUG(31):  ip 00000000  sp bec5a370  lr afd16a35  pc 80c65f64  cpsr 60000030
09-13 06:10:50.544: INFO/DEBUG(31):          #00  pc 00065f64  /data/data/com.artoo.ocr.test/lib/libtess.so
09-13 06:10:50.544: INFO/DEBUG(31):          #01  pc 0006989a  /data/data/com.artoo.ocr.test/lib/libtess.so

I am using Ubuntu as my OS.

Looking forward to hearing back from anyone.

Regards,
Ashish sharma

Original issue reported on code.google.com by ashishsh...@gmail.com on 19 Sep 2011 at 5:43

GoogleCodeExporter commented 8 years ago
Obsolete, moved to Tesseract 3.01 which has a new data file format.

Original comment by alanv@google.com on 11 Sep 2012 at 8:25