slub / ocrd_manager

frontend for ocrd_controller and adapter towards ocrd_kitodo
MIT License
10 stars 3 forks source link

Webhook implementation #66

Closed markusweigelt closed 7 months ago

markusweigelt commented 1 year ago

The webhook implementation is crucial for making our callback applicable in various scenarios by generalizing its usage for Kitodo.Presentation, Kitodo.Production and maybe other applications like an internal ticket system or something else.

The existing Kitodo.Production Active MQ implementation within the ocrd_lib.sh script should be replaced with the generalized webhook functionality for this purpose.

The Active MQ functionality should be extracted into a separate "kitodo_production_endpoint_script," which will only be utilized within the "for_production" script. The idea is to initiate an HTTP proxy, whose URL will be passed as the webhook receiver URL, and this proxy will forward the request to the Active MQ. Additionally, it would also be feasible to install the script on the Kitodo.Production system and make the endpoint accessible there.

Currently, the following webhook events are planned maybe with a prefix "ocrd_":

Are there any more events here that should be taken into consideration? What is missing or should it be further described?

bertsky commented 1 year ago

Is this meant to solve #64? Why do you throw out ActiveMQ – I thought we would like to keep multiple options (maybe even in parallel)?

markusweigelt commented 1 year ago

@bertsky I've modified the pull request description. If there are additional questions or concerns, let's address them in more detail through the discussion.

fyi I haven't added you as a reviewer yet, as the pull request is not at that stage yet.

bertsky commented 1 year ago

Ok, the edited description still does not explain it explicitly, so I will try to say it in my own words – please CMIIW.

The reason for throwing out the ActiveMQ mechanism (replacing it with the more general REST interface which is easier to reuse) would be that we want to keep ocrd_lib.sh (ActiveMQ client) and the entry point script CLIs (no ACTIVEMQ / ACTIVEMQ_CLIENT / ACTIVEMQ_QUEUE runtime variables) simple.

However, to still be able to report back to Kitodo.Production, we not only then have to set up another service acting as a kind of proxy (translating the POSTed JSON data to the former ActiveMQ message). (That part is still missing from the PR.) We also have to keep the --process-id and --task-id interface, because we definitely need that information. (So this part must be optional for the callback interface itself.)

Regarding terminology, please use callback instead of webhook.

markusweigelt commented 1 year ago

However, to still be able to report back to Kitodo.Production, we not only then have to set up another service acting as a kind of proxy (translating the POSTed JSON data to the former ActiveMQ message). (That part is still missing from the PR.) We also have to keep the --process-id and --task-id interface, because we definitely need that information. (So this part must be optional for the callback interface itself.)

The proxy could be provided within the scope of the repository Kitodo-Production ActiveMQ and wouldn't need to be versioned in the Manager. The proxy could be provided as a standalone script, which could then be easily installed into an existing Kitodo.Production instance. It's conceivable that it could also be installed here in the Dockerfile of Kitodo-Production ActiveMQ.

In the ocrd_lib.sh, the parameters are currently being returned to the receiver. Currently, the necessary parameters for process_mets.sh and process_images.sh are provided here, without distinguishing which ones are actually required. Therefore, I would like to implement a mechanism similar to parse_args as an abstract function, which allows setting the necessary parameters for the receiver in the respective script.

Regarding terminology, please use callback instead of webhook.

I find the term 'webhook' more suitable than 'callback' and would like to keep it that way.

https://www.svix.com/resources/faq/webhook-vs-callback/

markusweigelt commented 9 months ago

@bertsky Changed the asynchronous behavior. Synchronous is now the default.