Make ocrd-processors available inside webapi to be used with workflows

joschrew commented 2 years ago

The webapi code runs inside a Docker container. Currently Nextflow is installed there as well, to be used for the workflows (Dockerfile). But no processors are available in that container (except ocrd-dummy). At this stage the code and the application are in the testing phase. In Operandi the workflows will eventually be executed via SSH on the HPC, but for a start it would be nice to be able to execute processors without SSH and HPC, just in the testing VM. I see 4 possibilities:

1. use ocrd-all as base image and extend that image to be able to run Nextflow and the webapi.
2. install the processors in the current Docker container via pip (is this possible?)
3. get rid of the Docker setup for the webapi container and install everything (ocrd, fastapi, nextflow) directly into the VM
4. start the processors in Docker containers with this functionality: https://github.com/OCR-D/core/pull/884. Then modify the Nextflow scripts to query the processors running in the Docker containers

Maybe this is a trivial task, but I haven't found the time to test the possibilities I listed. I think this is what has to be done next in order to continue implementing the ocrd-webapi-spec.

kba commented 2 years ago

> use ocrd-all as base image and extend that image to be able to run nextflow and the webapi.

I would strongly recommend this approach. Start your Dockerfile with FROM ocrd/all:maximum and you have all the processors available. Sure, it's a large image for now, but you don't need to solve all the problems (Ubuntu dependencies, building C++ code, setting correct paths, handling conflicting tensorflow versions etc.) that were already solved in ocrd_all.
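
Roughly, such a Dockerfile could look like the sketch below. Only the FROM line comes from the suggestion above; everything else is an assumption: the Java package name, the presence of pip in the base image, the source layout, and the start command (including the hypothetical module name ocrd_webapi.main:app and port 8000) would all need to be checked against the actual webapi repository.

```dockerfile
# Sketch only: package names, paths and the start command are assumptions.
FROM ocrd/all:maximum

# Nextflow needs a Java runtime and curl for its installer.
# The package name is an assumption for the Ubuntu release of the base image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends default-jre-headless curl \
    && rm -rf /var/lib/apt/lists/*

# Download the Nextflow launcher with its installer script and put it on the PATH.
RUN curl -fsSL https://get.nextflow.io | bash \
    && chmod 755 nextflow \
    && mv nextflow /usr/local/bin/nextflow

# Hypothetical layout: the webapi sources sit next to this Dockerfile
# and are installable with pip (pulling in fastapi and uvicorn).
WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir .

# Hypothetical entry point; adjust to the actual module and port.
CMD ["uvicorn", "ocrd_webapi.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

This keeps the processor setup entirely in the base image, so the webapi image only adds Nextflow and the application itself on top.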

joschrew commented 2 years ago

The service is available here: http://141.5.99.53:5050/. All processors from ocrd/all:maximum are available. Uploading Nextflow scripts is password-protected for now, to prevent anyone from executing arbitrary code.

Make ocrd-processors available inside webapi to be used with workflows #128