slub / ocrd_manager

frontend for ocrd_controller and adapter towards ocrd_kitodo
MIT License

use mongosh connection to ocrd-database for job infos #62

Closed (bertsky closed 11 months ago)

bertsky commented 1 year ago

First attempt (still using bash for everything, so DB access is only via mongosh).

It looks like when you restart a job for the same workspace, the new active job is not shown because internally it gets confused with the previous job for that workspace – so we probably need to redefine the index.

bertsky commented 1 year ago

039567a adds a simple web server (via socat and sampo).

There are 4 endpoints:

match_uri '^/$' list_endpoints
match_uri '^/for_production|^/process_images' run_external_script for_production.sh
match_uri '^/for_presentation|^/process_mets' run_external_script for_presentation.sh
match_uri '^/cancel_job/(.*)$' run_external_script kill
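To illustrate how these patterns map a request path to an action, here is a minimal bash sketch. It is not sampo's implementation: `route` is a hypothetical function that only echoes the action the table above would select, using bash's built-in regex matching.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only: mimic the endpoint table with plain
# bash regex matching. The real dispatch is done by sampo's match_uri.
route() {
  local uri=$1
  local re_root='^/$'
  local re_production='^/for_production|^/process_images'
  local re_presentation='^/for_presentation|^/process_mets'
  local re_cancel='^/cancel_job/(.*)$'

  if [[ $uri =~ $re_root ]]; then
    echo "list_endpoints"
  elif [[ $uri =~ $re_production ]]; then
    echo "for_production.sh"
  elif [[ $uri =~ $re_presentation ]]; then
    echo "for_presentation.sh"
  elif [[ $uri =~ $re_cancel ]]; then
    # The capture group holds whatever follows /cancel_job/
    echo "kill ${BASH_REMATCH[1]}"
  else
    echo "no match"
    return 1
  fi
}

route "/process_mets/testdata-presentation/mets.xml"
route "/cancel_job/1234"
```

Note that the two script endpoints are prefix matches, so anything after the endpoint name is still available to the script as the remaining path.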

Usage is simple: just convert any of the command-line args to CGI query options, e.g.

curl "http://localhost:4004/for_presentation/testdata-presentation/mets.xml?url-prefix=https://digital.slub-dresden.de/data/kitodo&workflow=/workflows/ocr-workflow-default.sh"

This will automatically translate to:

for_presentation.sh --workflow /workflows/ocr-workflow-default.sh --url-prefix https://digital.slub-dresden.de/data/kitodo testdata-presentation/mets.xml
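The translation from query options to command-line arguments can be sketched roughly as follows. This is an assumption about the general idea, not the code from the commit: `query_to_args` and the `ARGS` array are hypothetical names, the input is taken to be the part of the request after the endpoint name, options are emitted in the order they appear in the query, and no percent-decoding is attempted.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): turn
#   "path?key=value&flag"
# into the argument vector
#   --key value --flag path
# stored in the global array ARGS.
query_to_args() {
  local request=$1
  local path=${request%%\?*}        # everything before the first "?"
  local query=""
  [[ $request == *\?* ]] && query=${request#*\?}

  ARGS=()
  local pair
  local -a pairs
  IFS='&' read -r -a pairs <<< "$query"
  for pair in "${pairs[@]}"; do
    [[ -z $pair ]] && continue
    if [[ $pair == *=* ]]; then
      # key=value becomes "--key value"
      ARGS+=("--${pair%%=*}" "${pair#*=}")
    else
      # a bare key becomes a flag without a value
      ARGS+=("--$pair")
    fi
  done
  ARGS+=("$path")                   # the positional argument goes last
}

query_to_args "testdata-presentation/mets.xml?url-prefix=https://digital.slub-dresden.de/data/kitodo&workflow=/workflows/ocr-workflow-default.sh"
printf '%s ' for_presentation.sh "${ARGS[@]}"; echo
```

Keeping the result in an array rather than a flat string means values containing spaces survive intact when passed on as `"${ARGS[@]}"`.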

We should now document this, add some more logging, and start with a call-back interface (see #64).

BartChris commented 11 months ago

@bertsky I tried out this feature and ran into the problem that constructing the JSON for Mongo fails, apparently because there are line breaks in the JSON. Maybe I introduced some problems while I was copying the logic for my tests, so that some unwanted line breaks got injected? https://github.com/slub/ocrd_manager/blob/52587e9e2c3e669bf832da3d4839a7254d1327fe/ocrd_lib.sh#L71

terminating with error $?=1 from HOME=/tmp mongosh --quiet --norc --eval "use ocrd" --eval "db.OcrdJob.insertOne( {#012           pid: $PID,#012           time_created: ISODate(\"$(date --rfc-3339=seconds)\"),#012           process_id: \"$PROCESS_ID\",#012           task_id: \"$TASK_ID\",#012           process_dir: \"$PROCESS_DIR\",#012           workdir: \"$WORKDIR\",#012           remotedir: \"$REMOTEDIR\",#012           workflow_file: \"$WORKFLOW\",#012           controller_address: \"$CONTROLLER\"#012      } )" $DB_CONNECTION on line 83 /usr/bin/ocrd_lib.sh

For the moment I formatted it so that everything is on one line.

Edit: It indeed seems like I introduced some problems while copying. It works now.
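For anyone who does want the `--eval` string on a single line while keeping the source readable, one option is to write the document across several lines and flatten it before the call. A minimal sketch, assuming a hypothetical `flatten` helper and made-up example values; the commented-out invocation reuses only the flags visible in the error message above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every newline in a string with a space,
# leaving all other characters (including quoted values) untouched.
flatten() {
  local text=$1
  printf '%s\n' "${text//$'\n'/ }"
}

# Example values, made up for illustration.
PID=1234
PROCESS_ID=42

doc="db.OcrdJob.insertOne( {
    pid: $PID,
    process_id: \"$PROCESS_ID\"
} )"

flatten "$doc"

# The flattened string could then be passed on, e.g.:
# HOME=/tmp mongosh --quiet --norc --eval "use ocrd" --eval "$(flatten "$doc")" $DB_CONNECTION
```

The leftover runs of spaces from the indentation are ordinary whitespace between tokens, so they are deliberately not squeezed; that avoids altering spaces inside string values.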