tleyden / open-ocr

Run your own OCR-as-a-Service using Tesseract and Docker
Apache License 2.0
1.33k stars, 223 forks

upload-local-file.py does not work as expected #125

Open venclov opened 5 years ago

venclov commented 5 years ago

I am seeing weird behaviour from the upload-local-file.py script.

First of all, upload-local-file.sh works great for me. When I try the Python script instead, I get an encoding error before the request is even sent:

UnicodeEncodeError: 'latin-1' codec can't encode character '\ufffd' in position 116: Body ('�') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.

If I encode the body with body.encode('utf-8'), I am able to send the request, but I get this response:

b'Error processing image url: . Error: exit status 1'
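One possible explanation, assuming the script ends up holding the image as a text string in which undecodable bytes were replaced (an assumption; the script's source is not shown in this thread): the first byte of every PNG file, 0x89, is not valid UTF-8, so it turns into U+FFFD, and encoding that character as UTF-8 produces three different bytes instead of the original one. A minimal sketch of that round trip, with a hypothetical helper name:

```python
# The fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def roundtrip_through_text(data: bytes) -> bytes:
    """Decode bytes as UTF-8 text (replacing invalid bytes), then re-encode.

    Hypothetical helper that mimics what may happen when binary image data
    is treated as a str and later passed through .encode('utf-8').
    """
    text = data.decode("utf-8", errors="replace")  # 0x89 becomes U+FFFD
    return text.encode("utf-8")                    # U+FFFD becomes EF BF BD


mangled = roundtrip_through_text(PNG_SIGNATURE)
print(PNG_SIGNATURE)             # original signature
print(mangled)                   # signature after the text round trip
print(mangled == PNG_SIGNATURE)  # the bytes no longer match
```

If something like this happens to the request body, the server would receive data that no longer starts with a valid PNG signature, which would be consistent with Tesseract failing to recognise the image format.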

Docker logs show:

16:16:16.137325 OCR_WORKER: got 37656 byte delivery: [18]. Routing key: decode-ocr Reply to: amq.gen-dxKrz2ZJ1Dh10rCfHDdt1Q
16:16:16.138806 OCR_TESSERACT: Use tesseract with bytes image
16:16:16.139015 OCR_TESSERACT: cmdArgs: [/tmp/a0e11ab6-b005-4ca2-65e5-e26550fcba91 /tmp/a0e11ab6-b005-4ca2-65e5-e26550fcba91]
16:16:16.274887 OCR_TESSERACT: Error exec tesseract: exit status 1
Tesseract Open Source OCR Engine v3.03 with Leptonica
Error in pixReadStream: Unknown format: no pix returned
Error in pixRead: pix not read
Error in pixGetInputFormat: pix not defined
Reading /tmp/a0e11ab6-b005-4ca2-65e5-e26550fcba91 as a list of filenames...
Error in fopenReadStream: file not found
Error in pixRead: image file not found
Image file �PNG cannot be read!
Error during processing.
16:16:16.275159 ERROR: Error processing image url: . Error: exit status 1 -- open-ocr.(OcrRpcWorker).resultForDelivery() at ocr_rpc_worker.go:183
16:16:16.275257 ERROR: Error generating ocr result. Error: exit status 1 -- open-ocr.(OcrRpcWorker).handle() at ocr_rpc_worker.go:145
16:16:16.275336 OCR_WORKER: Sending rpc response: {Error processing image url: . Error: exit status 1}
16:16:16.275382 OCR_WORKER: sendRpcResponse to: amq.gen-dxKrz2ZJ1Dh10rCfHDdt1Q
16:16:16.278552 OCR_WORKER: sendRpcResponse succeeded
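The logs suggest the worker received bytes that Tesseract could not recognise as an image. One way to rule out client-side corruption is to assemble the whole request body as bytes, so the image is never decoded or re-encoded as text. The sketch below is only an illustration: the function name, the boundary string, the part headers, and the `{"engine": "tesseract"}` metadata are assumptions, not taken from the script or confirmed against the open-ocr API.

```python
import json


def build_multipart_related(meta: dict, image: bytes, boundary: str = "BOUNDARY") -> bytes:
    """Assemble a two-part multipart/related body entirely as bytes.

    Hypothetical helper: the part headers and metadata layout are assumed
    for illustration. The image bytes are inserted untouched.
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        delimiter,
        b"Content-Type: application/json",
        b"",
        json.dumps(meta).encode("utf-8"),  # only the JSON part is text
        delimiter,
        b"Content-Type: image/png",
        b'Content-Disposition: attachment; filename="image.png"',
        b"",
        image,                             # raw bytes, never decoded
        delimiter + b"--",                 # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines)


# Stand-in for the contents of a real file opened with mode "rb".
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00\x01\x02"
body = build_multipart_related({"engine": "tesseract"}, fake_png)
print(type(body).__name__, len(body))
```

A body built this way can be handed to an HTTP client as-is, together with a `Content-Type: multipart/related; boundary=BOUNDARY` header, without any call to `.encode()` on the image data.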