Since Kitodo.Production is unlikely to have access to a Docker installation (i.e. be able to `docker run ocrd_manager something` from a script task) or even a native OCR-D core installation (i.e. be able to install core in the system and then `sh initialsetup.sh something`), we should use the same base recipe as in https://github.com/bertsky/ocrd_controller.
[x] install openssh-server in Dockerfile, provide unprivileged access for pseudo-user ocrd
[x] provide some callable (say: shell script taking parameters) which will
  - (perhaps: transfer the process data / Vorgangsdaten to the controller)
  - log into the controller and run the predefined workflow
  - (perhaps: retransfer the results from the controller)
  - post-process the results by
    - validating the whole OCR-D workspace
    - determining which are the result fileGrps (from either the workflow definition or the position of the fileGrps in the METS or the timestamps of the subdirectories)
    - copying/moving the result files (for now: ALTO, later: PAGE / PDF / TEI / ...) to a path in the process directory where Production expects them (`ocr/%08d.xml` I guess), by iterating over the METS with `ocrd workspace find`
    - signalling the exit status via ActiveMQ
[x] provide a smoke test for the callable (preferably via a Makefile):
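A rough sketch of the copy step and a matching smoke test, not the actual implementation. The function names `copy_results` and `smoke_test`, the fileGrp name `OCR-D-ALTO` and the numbering by input order are all assumptions made for illustration. The copy function reads file paths from stdin, so it can be tested without any OCR-D installation.

```shell
#!/usr/bin/env bash
# Sketch only: function names, arguments and the numbering scheme are assumptions.

# Read result file paths (one per line) from stdin and copy them to
# DEST/00000001.xml, DEST/00000002.xml, ... in input order.
# Prints the number of files copied.
copy_results() {
    local dest="$1" n=0 path
    mkdir -p "$dest" || return 1
    while IFS= read -r path; do
        [ -n "$path" ] || continue        # skip empty lines
        n=$((n + 1))
        cp -- "$path" "$dest/$(printf '%08d.xml' "$n")" || return 1
    done
    echo "$n"
}

# Smoke test: feed two dummy files through copy_results and check the
# target names. Needs only a temporary directory, no OCR-D tools.
smoke_test() {
    local tmp count status=0
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/OCR-D-ALTO"            # hypothetical result fileGrp
    echo '<alto/>' > "$tmp/OCR-D-ALTO/page_0001.xml"
    echo '<alto/>' > "$tmp/OCR-D-ALTO/page_0002.xml"
    count=$(printf '%s\n' "$tmp/OCR-D-ALTO/page_0001.xml" \
                          "$tmp/OCR-D-ALTO/page_0002.xml" \
            | copy_results "$tmp/ocr") || { rm -rf "$tmp"; return 1; }
    [ "$count" = 2 ] && [ -f "$tmp/ocr/00000001.xml" ] \
        && [ -f "$tmp/ocr/00000002.xml" ] || status=1
    rm -rf "$tmp"
    return "$status"
}
```

In the real callable the paths would presumably come from the METS, e.g. `cd "$WORKSPACE" && ocrd workspace find -G "$RESULT_GRP" -k local_filename | copy_results "$PROCESS_DIR/ocr"`, where the three variables are hypothetical, `$PROCESS_DIR` must be absolute, and the `-G` / `-k` options should be checked against `ocrd workspace find --help`. A Makefile target could then source the script and call `smoke_test`.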