Closed WilliamLo closed 8 years ago
@WilliamLo, have you tried running the original Tesseract with your trained data and an image file? Please try that first, just to confirm that you've trained Tesseract correctly. Thanks!
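The check suggested above can be sketched with the `tesseract` command-line tool. This is a minimal sketch, not an official procedure: `check_traineddata` is a hypothetical helper, and the `--tessdata-dir` option and the `stdout` output base are assumed to be available in the installed CLI (older builds rely on the `TESSDATA_PREFIX` environment variable instead).

```shell
#!/bin/sh
# Sketch: run the stock Tesseract CLI against one traineddata file and one image.
# check_traineddata is a hypothetical helper, not part of Tesseract.
check_traineddata() {
    lang="$1"          # e.g. chi_sim
    tessdata_dir="$2"  # directory that contains <lang>.traineddata
    image="$3"         # image file to recognize

    # Fail early if the language file is not where we expect it.
    if [ ! -f "$tessdata_dir/$lang.traineddata" ]; then
        echo "missing $tessdata_dir/$lang.traineddata" >&2
        return 1
    fi

    # Fail if the CLI itself is not installed.
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract CLI not found on PATH" >&2
        return 2
    fi

    # Print the engine version, then write the recognized text to standard output.
    tesseract --version
    tesseract "$image" stdout --tessdata-dir "$tessdata_dir" -l "$lang"
}

# Example call (paths are placeholders):
# check_traineddata chi_sim ./tessdata ./sample.png
```

If the CLI reports the same `parameter not found` message, the problem most likely lies in the language file or the engine version rather than in the iOS wrapper.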
I've run into the same issue. Have you figured it out? @WilliamLo
@ws233 I don't know why I can't compile the trained data. In the end I used the language files from a previous version. @sunwind2010 You can find the language files on SourceForge (https://sourceforge.net/projects/tesseract-ocr-alt/files/).
I downloaded chi_sim.traineddata, added it to the tessdata folder, and then modified this line:
G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] initWithLanguage:@"eng+chi_sim"];
Then I ran the app and it crashed. This is the crash information:
2016-06-15 14:28:27.519 Template Framework Project[2918:868549] Snapshotting a view that has not been rendered results in an empty snapshot. Ensure your view has been rendered at least once before snapshotting or snapshot after screen updates.
Printing description of language->isa: __NSCFConstantString
read_params_file: parameter not found: allow_blob_division
@WilliamLo Thanks a lot! I will try again :)
@WilliamLo I tried the file "tessract-ocr-3.0.2.chi_sim.tar.gz" downloaded from that link and added it to the tessdata folder, but recognition took a long time and the accuracy was very poor. For instance, I wrote just two characters, "天天", but the result is unreadable and contains nothing resembling them.
@WilliamLo like this: 一 一 入 ~ 一.~~瓤~一一一.._一一一一一一一一__一一一一〇一一一一一一一一_一一一一_一一一_一一一_一一一一一一一.一
… ..MWWMW
, r … 灬r v …. ~而 麒、F′~则 一一-.一,_一一,.
v. . .. A.me
@sunwind2010, @WilliamLo Tesseract-OCR-iOS is just an iOS wrapper around the Tesseract OCR engine. Please raise all your questions with the Tesseract project. They will definitely help you.
I'm closing the issue, since it's not related to the iOS wrapper.
Sorry to bother you, and thanks a lot!
Can it recognize Chinese?
I tried the latest language files (chi_tra and chi_sim) from tessdata (https://github.com/tesseract-ocr/tessdata).
I also updated the init code to:
G8RecognitionOperation *operation = [[G8RecognitionOperation alloc] initWithLanguage:@"eng+chi_tra+chi_sim"];
But the application crashes with the error below:
read_params_file: parameter not found: allow_blob_division
Then I got the language data from langdata (https://github.com/tesseract-ocr/langdata), commented out "allow_blob_division F" in chi_tra.config, and tried to compile, but it said "Fontconfig error: Cannot load default config file Could not find font named AR_PL_UKai_TW".
My script:
sudo ./tesstrain.sh --lang chi_tra --fontlist 'AR_PL_UKai_TW' --fonts_dir /Users/williamlo/Library/Fonts --langdata_dir /Users/williamlo/Documents/langdata --tessdata_dir /Users/williamlo/Documents/tessdata --output_dir /Users/williamlo/Documents/langdata2
Does anyone know what the issue is, or how I can get the proper language files?
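The config edit described in this post (commenting out `allow_blob_division F` in `chi_tra.config`) can be scripted so that it is repeatable. This is a minimal sketch under stated assumptions: `comment_out_param` is a hypothetical helper, the config path in the example call is a placeholder, and it relies on Tesseract's parameter-file reader skipping lines that begin with `#`.

```shell
#!/bin/sh
# Sketch: comment out a single parameter in a Tesseract .config file.
# comment_out_param is a hypothetical helper, not part of Tesseract.
comment_out_param() {
    param="$1"   # parameter name, e.g. allow_blob_division
    config="$2"  # path to the .config file to edit

    if [ ! -f "$config" ]; then
        echo "no such file: $config" >&2
        return 1
    fi

    tmp="${config}.tmp"
    # Prefix every line that starts with "<param><whitespace>" with "# ".
    # A temporary file is used because in-place sed flags differ between macOS and GNU.
    sed "s/^${param}[[:space:]]/# &/" "$config" > "$tmp" && mv "$tmp" "$config"
}

# Example call (the path is a placeholder for the langdata checkout):
# comment_out_param allow_blob_division "$LANGDATA_DIR/chi_tra/chi_tra.config"
```

For the font error, it may help to compare the name passed to `--fontlist` with what the training tools actually detect, for example with `text2image --list_available_fonts --fonts_dir /Users/williamlo/Library/Fonts`, assuming `text2image` from the Tesseract training tools is installed.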