OCR-D / ocrd_tesserocr

Run tesseract with the tesserocr bindings with @OCR-D's interfaces
MIT License

ocrd-tesserocr-recognize and docker #161

Closed gpetz closed 3 years ago

gpetz commented 3 years ago

Running

docker run --rm -u $(id -u) -v $PWD:/data -w /data -- ocrd/all:maximum ocrd-tesserocr-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-TESS -p '{"model": "deu+frk"}'

fails with:

ValueError: Error parsing '{"model": "deu+frk"}': Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Outside Docker, the same call works as expected:

ocrd-tesserocr-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-TESS -p '{"model": "deu+frk"}'

What can I do about the double quotes?

kba commented 3 years ago

The easiest fix is to use -P, the variant of the -p flag that takes the parameter name and value as separate arguments and so avoids most of the quoting headaches:

docker run --rm -u $(id -u) -v $PWD:/data -w /data -- ocrd/all:maximum ocrd-tesserocr-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-TESS -P model "deu+frk"

bertsky commented 3 years ago

I agree: this is not an ocrd_tesserocr issue but a general docker run issue. There are other workarounds (such as placing the actual command inside a bash -c "...", because the shell lets you escape and add extra layers of quotation), but here -P works best.
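For reference, a minimal sketch of what the bash -c workaround could look like. It assumes bash is available inside the ocrd/all:maximum image and that the image runs whatever command it is given; the variable names params, inner_cmd and received are made up for illustration. The dry run only prints the arguments an inner shell would hand to the processor, so the quoting can be checked without Docker; the actual docker call is left commented out.

```shell
#!/usr/bin/env bash
# The JSON parameter exactly as the processor should receive it.
params='{"model": "deu+frk"}'

# Command line for the inner shell. The JSON sits in single quotes, so the
# inner bash passes it on as one argument with its double quotes intact.
inner_cmd="ocrd-tesserocr-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-TESS -p '$params'"

# Dry run without Docker: let an inner bash split the command line and
# print each resulting argument on its own line.
received=$(bash -c "set -- $inner_cmd; printf '%s\n' \"\$@\"")
echo "$received"

# With Docker (assumption: bash exists in the image), the same string
# would be passed to bash -c inside the container:
# docker run --rm -u "$(id -u)" -v "$PWD:/data" -w /data -- ocrd/all:maximum bash -c "$inner_cmd"
```

If the last line printed by the dry run is the intact JSON string, the extra quoting layer is doing its job. This only works as long as the JSON itself contains no single quotes, which is one more reason -P is the simpler choice here.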

Can we close?

gpetz commented 3 years ago

Thanks!