Closed. xf0e closed this 5 years ago
Thanks for the contribution! I verified that it builds locally, and triggered new docker images on dockerhub. (still processing)
Hi guys, great work! I tried this feature, but even for very small PDFs (e.g. 2 pages) I got
Unable to perform OCR decode. Error: Timeout waiting for RPC response
Any ideas why this happens? I use tesseract3 inside the containers.
Any logs on the containers? I'm guessing it failed with some sort of error that didn't get propagated back.
@tleyden
I have the same issue. Logs:
Can you get logs on the worker container? Or maybe there isn't one running, which would explain the timeout.
What does `docker ps` return?
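For context, "Timeout waiting for RPC response" is the kind of error a client reports when it publishes a request and nothing ever answers, for instance because no worker is consuming from the queue. Below is a minimal Go sketch of that wait-with-timeout pattern. The names `waitForReply` and `errRPCTimeout` are hypothetical, and the channel only stands in for an AMQP reply queue; none of this is copied from the open-ocr source.

```go
package main

import (
	"errors"
	"time"
)

// errRPCTimeout mirrors the wording of the error seen in this thread.
// The variable itself is illustrative, not taken from open-ocr.
var errRPCTimeout = errors.New("Timeout waiting for RPC response")

// waitForReply blocks until a reply arrives on the channel or the timeout
// elapses. The channel is a stand-in for an AMQP reply queue: if no worker
// ever consumes the request, nothing is sent and the timeout branch fires.
func waitForReply(replies <-chan string, timeout time.Duration) (string, error) {
	select {
	case r := <-replies:
		return r, nil
	case <-time.After(timeout):
		return "", errRPCTimeout
	}
}
```

With this shape, a missing or crashed worker and a worker that fails without replying look identical to the caller, which is why the container logs are the next thing to check.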
Worker container log is:
27T22:14:11.302615900Z 22:14:11.302272 OCR_WORKER: Creating new OCR Worker
22:14:11.302392 OCR_WORKER: Run() called...
22:14:11.302409 OCR_WORKER: dialing "amqp://admin:Phaish9ohbaidei6oole@rabbitmq/"
22:14:11.320177 OCR_WORKER: got Connection, getting Channel
22:14:11.322389 OCR_WORKER: binding to: decode-ocr
22:14:11.323148 OCR_WORKER: Queue bound to Exchange, starting Consume (consumer tag "foo")
I have 4 containers running; `docker ps` outputs (some columns cleared for clarity):
b0055fbecbde . tleyden5iwx/open-ocr-2 docker-compose_openocr_1
b8be2302936c . tleyden5iwx/open-ocr-preprocessor docker-compose_strokewidthtransform_1
ae51cccc7094 tleyden5iwx/open-ocr-2 docker-compose_openocrworker_1
9904e5507ac7 . rabbitmq:3.6.5-management docker-compose_rabbitmq_1
The line
command: "/opt/open-ocr/open-ocr-preprocessor -amqp_uri amqp://admin:Phaish9ohbaidei6oole@rabbitmq/ -preprocessor stroke-width-transform"
of docker-compose.yml should be changed to:
command: "/opt/open-ocr/open-ocr-preprocessor -amqp_uri amqp://admin:Phaish9ohbaidei6oole@rabbitmq/ **-preprocessor convert-pdf"
if I am right?
Hello darmanovic, sorry, I edited the first post. The preprocessor args should be "-preprocessor convert-pdf" and should not contain "**". The stars are just typos.
I suspected that the stars were typos, but when I remove them, the container won't run at all.
LINE:
command: "/opt/open-ocr/open-ocr-preprocessor -amqp_uri amqp://admin:Phaish9ohbaidei6oole@rabbitmq/ -preprocessor convert-pdf"
LOG:
15:52:17.985590 PREPROCESSOR_WORKER: Creating new Preprocessor Worker
15:52:17.986118 PANIC: Could not create rpc worker: No preprocessor found for: "convert-pdf" -- main.main() at main.go:47
panic: Could not create rpc worker: No preprocessor found for: "convert-pdf"
2019-02-28T15:52:17.990229700Z
goroutine 1 [running]:
runtime.panic(0x627e80, 0xc210042940)
/usr/lib/go/src/pkg/runtime/panic.c:266 +0xb6
github.com/couchbaselabs/logg.LogPanic(0x7374d0, 0x1f, 0x7efe16e9ae78, 0x1, 0x1)
/opt/go/src/github.com/couchbaselabs/logg/logg.go:136 +0xec
main.main()
/opt/go/src/github.com/tleyden/open-ocr/cli-preprocessor/main.go:47 +0x200
Same error as @darmanovic. Has anyone solved it?
Hi, all. Please have a look at https://github.com/tleyden/open-ocr/issues/117 for a follow-up on this error.
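The panic above reads like a name lookup that failed: the value passed to `-preprocessor` was not among the names the running binary knows about. The Go sketch below shows that general pattern only. The `registry` map, the `preprocessor` type and `lookupPreprocessor` are hypothetical and are not the actual open-ocr code; the set of registered names is likewise an assumption made for the example.

```go
package main

import "fmt"

// preprocessor is a hypothetical signature for an image preprocessing step.
type preprocessor func(input []byte) ([]byte, error)

// registry maps command-line names to implementations. A binary built
// without a "convert-pdf" entry cannot resolve that name.
var registry = map[string]preprocessor{
	"identity":               func(in []byte) ([]byte, error) { return in, nil },
	"stroke-width-transform": func(in []byte) ([]byte, error) { return in, nil },
}

// lookupPreprocessor returns the implementation registered under name,
// or an error worded like the one in the log above.
func lookupPreprocessor(name string) (preprocessor, error) {
	p, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("No preprocessor found for: %q", name)
	}
	return p, nil
}
```

If the real code follows a similar pattern, one thing worth checking is whether the image in use was built from a revision that already includes the ConvertPdf preprocessor.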
The tesseract engine can now handle PDF files. This is achieved by a new ConvertPdf preprocessor.
Usage:
Internally we call gs (Ghostscript) to create a multi-page TIFF from the input. ImageMagick won't work for this purpose because it creates single-page image files, which tesseract can't handle.
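As an illustration of the gs step, here is a minimal Go sketch that renders every page of a PDF into one multi-page TIFF. The function names, the `tiff24nc` device and the resolution parameter are assumptions chosen for the example; the actual ConvertPdf preprocessor may use different options. Ghostscript's TIFF devices write all pages into a single file when the output file name contains no `%d` page placeholder.

```go
package main

import (
	"fmt"
	"os/exec"
)

// ghostscriptArgs builds the arguments for a gs call that renders all pages
// of pdfPath into a single TIFF at tiffPath. Device and dpi are assumptions.
func ghostscriptArgs(pdfPath, tiffPath string, dpi int) []string {
	return []string{
		"-dQUIET",           // suppress startup messages
		"-dNOPAUSE",         // do not prompt between pages
		"-dBATCH",           // exit after the last file
		"-sDEVICE=tiff24nc", // 24-bit RGB TIFF output
		fmt.Sprintf("-r%d", dpi),
		"-sOutputFile=" + tiffPath, // no %d, so all pages go into one file
		pdfPath,
	}
}

// convertPdfToTiff runs gs and returns its combined output on failure.
// It requires the gs binary to be on PATH.
func convertPdfToTiff(pdfPath, tiffPath string, dpi int) error {
	out, err := exec.Command("gs", ghostscriptArgs(pdfPath, tiffPath, dpi)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("gs failed: %v: %s", err, out)
	}
	return nil
}
```

Calling `convertPdfToTiff("in.pdf", "out.tiff", 300)` would produce a TIFF that can then be handed to tesseract; the file names and the 300 dpi value are placeholders.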
Regards!