tleyden / open-ocr

Run your own OCR-as-a-Service using Tesseract and Docker
Apache License 2.0

Cannot load Chinese language #58

Closed: an0 closed this issue 8 years ago

an0 commented 8 years ago

Try passing: "engine_args": {"lang": "chi-sim"} to the REST API. You should see this:

openocrworker_1 | 02:50:41.532795 OCR_WORKER: got 255962 byte delivery: [1]. Routing key: decode-ocr Reply to: amq.gen-kOrQgLhd316ODB9hX370sg
openocrworker_1 | 02:50:41.540446 OCR_TESSERACT: cmdArgs: [/tmp/2c9ed9c6-81d3-4a86-76e0-bb84d99cdbb2 /tmp/2c9ed9c6-81d3-4a86-76e0-bb84d99cdbb2 -l chi-sim]
openocrworker_1 | 02:50:41.546023 OCR_TESSERACT: Error exec tesseract: exit status 1 Tesseract Open Source OCR Engine v3.03 with Leptonica
openocrworker_1 | Error opening data file /usr/share/tesseract-ocr/tessdata/chi-sim.traineddata
openocrworker_1 | Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
openocrworker_1 | Failed loading language 'chi-sim'
openocrworker_1 | Tesseract couldn't load any languages!
openocrworker_1 | Could not initialize tesseract.
openocrworker_1 | 02:50:41.546128 ERROR: Error processing image url: . Error: exit status 1 -- open-ocr.(_OcrRpcWorker).resultForDelivery() at ocr_rpc_worker.go:182
openocrworker_1 | 02:50:41.546143 ERROR: Error generating ocr result. Error: exit status 1 -- open-ocr.(_OcrRpcWorker).handle() at ocr_rpc_worker.go:144
openocrworker_1 | 02:50:41.546153 OCR_WORKER: Sending rpc response: {Error processing image url: . Error: exit status 1}
openocrworker_1 | 02:50:41.546168 OCR_WORKER: sendRpcResponse to: amq.gen-kOrQgLhd316ODB9hX370sg
openocr_1 | 02:50:41.546578 OCR_CLIENT: got 51B delivery: [1] "Error processing image url: . Error: exit status 1". Reply to:
openocr_1 | 02:50:41.546597 OCR_CLIENT: send result to rpcResponseChan
openocr_1 | 02:50:41.546602 OCR_CLIENT: sent result to rpcResponseChan
openocr_1 | 02:50:41.547511 OCR_HTTP: ocrResult: {Error processing image url: . Error: exit status 1}
openocrworker_1 | 02:50:41.548791 OCR_WORKER: sendRpcResponse succeeded
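The log shows Tesseract failing to open chi-sim.traineddata, so one way to narrow this down is to list which language codes actually have a data file in the tessdata directory. The following is only a rough Python sketch: the helper names installed_languages and is_installed are hypothetical, and the directory path is simply the one printed in the log, which may differ in other images.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes that have a .traineddata file."""
    # "chi_sim.traineddata" yields the code "chi_sim" (file name minus suffix).
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def is_installed(tessdata_dir, lang):
    """True if the exact code passed via "lang" has a matching data file."""
    return lang in installed_languages(tessdata_dir)


if __name__ == "__main__":
    # Path taken from the error message above; adjust it for your container.
    tessdata = "/usr/share/tesseract-ocr/tessdata"
    print(installed_languages(tessdata))
    print("chi-sim installed:", is_installed(tessdata, "chi-sim"))
```

Run inside the worker container, this would show whether the code being requested matches the file names on disk character for character.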

tleyden commented 8 years ago

Hmm, I checked the Dockerfile and it has chi-sim:

https://github.com/tleyden/docker/blob/master/open-ocr/Dockerfile#L45

tleyden commented 8 years ago

What happens if you remove the lang arg and pass "engine_args": {} instead? Does it work in that case?

an0 commented 8 years ago

Yes, it works for “eng”. I checked that too. Did you try "chi-sim" yourself to see whether it works for you?

tleyden commented 8 years ago

Nope, I haven't actually tried chi-sim yet.

lyhanyz commented 8 years ago

Use chi_sim instead of chi-sim.
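As a rough illustration of that suggestion, the Python sketch below builds a request body with the underscore form of the language code. The "engine_args" and "lang" keys come from this thread; the "img_url" and "engine" fields, the example URL, and the helper names normalize_lang and build_ocr_request are assumptions made for the example, so check them against the open-ocr README before relying on them.

```python
import json


def normalize_lang(lang):
    """Replace hyphens with underscores, e.g. "chi-sim" becomes "chi_sim"."""
    # Assumption: Tesseract data files are named with underscores
    # (chi_sim.traineddata), so a hyphenated code will not be found.
    return lang.strip().replace("-", "_")


def build_ocr_request(img_url, lang):
    """Build a JSON-serializable request body for the OCR REST API."""
    return {
        "img_url": img_url,        # assumed field name
        "engine": "tesseract",     # assumed field name and value
        "engine_args": {"lang": normalize_lang(lang)},
    }


if __name__ == "__main__":
    body = build_ocr_request("http://example.com/page.png", "chi-sim")
    print(json.dumps(body, indent=2))
```

The normalization is deliberately blunt: it rewrites every hyphen, which fits codes like chi_sim but has not been checked against every language pack.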

tleyden commented 8 years ago

@an0 can you try chi_sim? Re-open the ticket if that doesn't fix it.

xinerd commented 7 years ago

Tried chi_sim, and that solved the problem. Thanks.