OCR-D / zenhub

Repo for developing zenhub integration
Apache License 2.0

Make ocrd-processors available inside webapi to be used with workflows #128

Closed joschrew closed 1 year ago

joschrew commented 2 years ago

The webapi code runs inside a docker container. Currently nextflow is installed there as well, to be used for the workflows (Dockerfile). But no processors are available in that container (except ocrd-dummy). At this stage the code and the application are in the testing phase. In operandi the workflows will eventually be executed via ssh on the hpc, but to get started it would be nice to be able to run processors in the testing vm alone, without ssh and hpc. I see 4 possibilities:

Maybe this is a trivial task, but I couldn't find the time to test the possibilities I listed. I think this is what has to be done next in order to continue implementing the ocrd-webapi-spec.

kba commented 1 year ago

use ocrd-all as base image and extend that image to be able to run nextflow and the webapi.

I would strongly recommend this approach. Start your Dockerfile with FROM ocrd/all:maximum and you have all the processors available. Sure, it's a large image for now, but you don't need to solve all the problems (Ubuntu dependencies, building C++ code, setting correct paths, handling conflicting tensorflow versions etc.) that were already solved in ocrd_all.
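For illustration, a minimal sketch of what such a Dockerfile could look like. Only the FROM line comes from the suggestion above; the Java package, the /opt/webapi path, the pip-installable project layout and the start command are assumptions, not taken from the actual webapi repository.

```dockerfile
# Sketch only: package names, paths and the start command are assumptions.
FROM ocrd/all:maximum

# Nextflow needs a Java runtime; default-jre-headless is an assumed choice.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl default-jre-headless \
    && rm -rf /var/lib/apt/lists/*

# Fetch Nextflow with its installer script and put the launcher on the PATH.
RUN curl -fsSL https://get.nextflow.io | bash \
    && mv nextflow /usr/local/bin/nextflow \
    && chmod 755 /usr/local/bin/nextflow

# Hypothetical layout: the webapi source sits next to this Dockerfile
# and is installable with pip.
WORKDIR /opt/webapi
COPY . /opt/webapi
RUN pip install --no-cache-dir .

# Hypothetical start command; replace the module path with the real entry point.
CMD ["uvicorn", "ocrd_webapi.main:app", "--host", "0.0.0.0", "--port", "5050"]
```

With this layout the processors shipped in ocrd/all:maximum are on the PATH of the same container that runs nextflow, so a workflow script can call them directly without ssh.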

joschrew commented 1 year ago

The service is available here: http://141.5.99.53:5050/. All processors from ocrd/all:maximum are available. Uploading nextflow scripts is password-protected for now, to prevent anyone from executing arbitrary code.