jbarth-ubhd closed this issue 1 year ago
As the README states, deskew needs osd.traineddata.
Your TESSDATA_PREFIX approach makes Tesseract use a custom model directory that the installer has not populated (ocrd_all compiles Tesseract with /usr/local/share/tessdata, which is also used as the module resource location for ocrd_tesserocr).
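A quick way to confirm this diagnosis is to check whether the directory Tesseract will search actually contains osd.traineddata. The following is only a minimal sketch: the helper name has_model is hypothetical, and it assumes Tesseract 4 or later, where TESSDATA_PREFIX names the tessdata directory itself. If the variable is unset, it falls back to the compiled-in path mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of OCR-D or Tesseract): succeeds if the
# given directory contains MODEL.traineddata.
# Usage: has_model DIR MODEL
has_model() {
    dir=$1
    model=$2
    [ -f "$dir/$model.traineddata" ]
}

# Assumption: TESSDATA_PREFIX points at the tessdata directory itself;
# otherwise use the default path that ocrd_all compiles in.
tessdata_dir=${TESSDATA_PREFIX:-/usr/local/share/tessdata}

if has_model "$tessdata_dir" osd; then
    echo "osd.traineddata found in $tessdata_dir"
else
    echo "osd.traineddata missing from $tessdata_dir"
fi
```

Run inside the container, this reports "missing" when a custom TESSDATA_PREFIX points at a directory the installer never filled.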
Please follow the OCR-D user guide for Docker, which gives the following cmdline:
docker run --user $(id -u) --workdir /data --volume $PWD:/data --volume $PWD/models:/usr/local/share/ocrd-resources --volume $PWD/models:/usr/local/share/tessdata -it ocrd/all bash
(where bash can of course be replaced with any single processor call or workflow script)
Thanks!
I tried to run a workflow (ocrd.sif built from the Docker image ocrd/all:maximum, 2023-06-13 approx. 18:00 CEST) with these files:
https://digi.ub.uni-heidelberg.de/diglitData/v/duerer1527_-_aa2.tgz
main command = run.sh
and got the following error message (excerpt from ocrd.log):
See ocrd.log in the .tgz linked above for the complete log.
I added an ls -l of the directory named in the error message ("... possibly an invalid tessdata path").
Note: ocrd-tesserocr-crop did run before that without error.
Complete listing of files: