bertsky closed 11 months ago
039567a adds a simple web server (via socat and sampo).
There are 4 endpoints:
match_uri '^/$' list_endpoints
match_uri '^/for_production|^/process_images' run_external_script for_production.sh
match_uri '^/for_presentation|^/process_mets' run_external_script for_presentation.sh
match_uri '^/cancel_job/(.*)$' run_external_script kill
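To illustrate how these rules select a handler, here is a minimal Bash sketch that tests a request path against the same four patterns in order. The `route` function is a hypothetical helper written for this illustration, not part of sampo, and the way the captured part of `/cancel_job/...` is appended to `kill` is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sampo): print the handler that the
# match_uri rules listed above would select for a given request path.
route() {
  local uri=$1
  # Patterns copied from the endpoint list; kept in variables so that
  # Bash treats them as regular expressions on the right-hand side of =~.
  local re_root='^/$'
  local re_production='^/for_production|^/process_images'
  local re_presentation='^/for_presentation|^/process_mets'
  local re_cancel='^/cancel_job/(.*)$'

  if [[ $uri =~ $re_root ]]; then
    echo "list_endpoints"
  elif [[ $uri =~ $re_production ]]; then
    echo "for_production.sh"
  elif [[ $uri =~ $re_presentation ]]; then
    echo "for_presentation.sh"
  elif [[ $uri =~ $re_cancel ]]; then
    # BASH_REMATCH[1] holds whatever follows /cancel_job/
    # (assumption: it is handed to kill as an argument).
    echo "kill ${BASH_REMATCH[1]}"
  else
    echo "no match"
  fi
}

route /                                              # list_endpoints
route /process_mets/testdata-presentation/mets.xml   # for_presentation.sh
route /cancel_job/1234                               # kill 1234 (1234 is a made-up value)
```

Since each endpoint has two aliases, `/for_presentation/...` and `/process_mets/...` end up at the same script.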
Usage is simple: just convert any of the command-line args to CGI query options, e.g.
curl "http://localhost:4004/for_presentation/testdata-presentation/mets.xml?url-prefix=https://digital.slub-dresden.de/data/kitodo&workflow=/workflows/ocr-workflow-default.sh"
This is automatically translated to
for_presentation.sh --workflow /workflows/ocr-workflow-default.sh --url-prefix https://digital.slub-dresden.de/data/kitodo testdata-presentation/mets.xml
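The translation can be sketched in plain Bash as follows. This is only an illustration of the idea, not the code from the commit: `request_to_command` is a hypothetical name, and the sketch assumes that nothing is percent-encoded and that every query option carries a value. It emits the options in query order, so their order differs from the command line shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "/endpoint/some/path?key=value&key2=value2"
# into "script.sh --key value --key2 value2 some/path".
# Assumptions: no percent-encoding, every option has the form key=value.
request_to_command() {
  local script=$1 uri=$2
  local path=${uri%%\?*}        # everything before the first "?"
  local query=""
  [[ $uri == *\?* ]] && query=${uri#*\?}   # everything after the first "?"
  path=${path#/}                # drop the leading slash
  path=${path#*/}               # drop the endpoint name, keep the positional argument

  local -a pairs=() cmd=("$script")
  local pair
  # Split the query string on "&" without touching the global IFS.
  IFS='&' read -r -a pairs <<< "$query"
  for pair in "${pairs[@]}"; do
    cmd+=("--${pair%%=*}" "${pair#*=}")    # key=value -> --key value
  done
  cmd+=("$path")
  echo "${cmd[*]}"
}

request_to_command for_presentation.sh \
  "/for_presentation/testdata-presentation/mets.xml?url-prefix=https://digital.slub-dresden.de/data/kitodo&workflow=/workflows/ocr-workflow-default.sh"
```

A real implementation would additionally need to decode percent-escapes and validate option names before passing them on to a script.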
We should now document this, add some more logging, and start on a callback interface (see #64).
@bertsky I tried out this feature and ran into a problem: constructing the JSON for Mongo fails because there seem to be line breaks in the JSON. Maybe I introduced the problem while copying the logic for my tests, so that some unwanted line breaks got injected? https://github.com/slub/ocrd_manager/blob/52587e9e2c3e669bf832da3d4839a7254d1327fe/ocrd_lib.sh#L71
terminating with error $?=1 from HOME=/tmp mongosh --quiet --norc --eval "use ocrd" --eval "db.OcrdJob.insertOne( {#012 pid: $PID,#012 time_created: ISODate(\"$(date --rfc-3339=seconds)\"),#012 process_id: \"$PROCESS_ID\",#012 task_id: \"$TASK_ID\",#012 process_dir: \"$PROCESS_DIR\",#012 workdir: \"$WORKDIR\",#012 remotedir: \"$REMOTEDIR\",#012 workflow_file: \"$WORKFLOW\",#012 controller_address: \"$CONTROLLER\"#012 } )" $DB_CONNECTION on line 83 /usr/bin/ocrd_lib.sh
For the moment I have reformatted it so that everything is on one line.
Edit: It does indeed seem that I introduced the problem while copying. It works now.
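For reference, the one-line workaround mentioned above can be sketched like this: keep the document readable in a here-document and replace the line breaks with spaces before building the `--eval` string. The variable values are made up, only three of the fields from the log are shown, and the sketch merely prints the resulting expression instead of calling mongosh. As the edit says, line breaks were apparently not the real cause here, so treat this as an optional safeguard rather than a required fix.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only.
PID=1234
PROCESS_ID=1
TASK_ID=2

# Write the document across several lines for readability ...
doc=$(cat <<EOF
{
  pid: $PID,
  process_id: "$PROCESS_ID",
  task_id: "$TASK_ID"
}
EOF
)

# ... then replace every newline with a space so the expression is a single line.
# Only newlines are touched, so spaces inside string values stay unchanged.
one_line=$(printf '%s' "$doc" | tr '\n' ' ')

# Print the expression that would be passed to mongosh via --eval.
echo "db.OcrdJob.insertOne( $one_line )"
```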
first attempt (still using bash for everything, so DB access only via mongosh)
It looks like when you restart a job for the same workspace, the new active job is not shown because internally it gets confused with the previous job for that workspace – so we probably need to redefine the index.