OCR-D / core

Collection of OCR-related python tools and wrappers from @OCR-D
https://ocr-d.de/core/
Apache License 2.0

How to get `ocrd-tool.json` if processors not installed in processing server? #1034

Open · kba opened 1 year ago

kba commented 1 year ago

[screenshot of the discussion, transcribed below]

@tdoan2010:

This implementation requires that all supported processors be installed on the same machine as the Processing Server, which might not be the case. Maybe after integrating #884, we can send requests to each processor to ask for its information instead.

@bertsky:

I concur – see earlier discussion above.

@MehmedGIT:

Maybe after integrating #884, we can send requests to each processor to ask for its information instead.

The Processing Worker is no longer a server, so we cannot send requests to it. I still have no clear idea how to achieve that. The best idea I have found so far is to store the `ocrd-tool.json` documents in the DB, so that the Processing Server can retrieve the information from there.
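For illustration, a minimal sketch of such a DB-backed tool cache, assuming MongoDB (which the OCR-D network stack uses); the database and collection names here are hypothetical, not from this thread:

```python
# Sketch only: cache ocrd-tool.json documents in MongoDB so the Processing
# Server can serve them without the processors being installed locally.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
tools = client["ocrd"]["ocrd_tools"]  # hypothetical database/collection names

def store_ocrd_tool(executable: str, ocrd_tool: dict) -> None:
    """Upsert the ocrd-tool.json of one processor, keyed by executable name."""
    tools.replace_one({"_id": executable},
                      {"_id": executable, "ocrd_tool": ocrd_tool},
                      upsert=True)

def get_ocrd_tool(executable: str):
    """Return the cached ocrd-tool.json, or None if the processor is unknown."""
    doc = tools.find_one({"_id": executable})
    return doc["ocrd_tool"] if doc else None
```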

bertsky commented 1 year ago

Discussion continued as follows:

So we seem to agree that all workers ( / processor queues) should be registered ( / created) centrally on the Processing Server (via an endpoint or from configuration at startup), and that new Processing Workers should output their `ocrd-tool.json` immediately, so that it can be used by the registration to store all JSONs in a tool cache dynamically.
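Such a registration flow could build on the `--dump-json` flag that every OCR-D processor CLI already provides. A minimal sketch of the worker side; the `/processor` registration endpoint and the server address are assumptions, not an existing API:

```python
# Sketch only: at startup, a worker dumps its own tool description and
# registers it with the Processing Server's (hypothetical) endpoint.
import json
import subprocess
import requests

PROCESSING_SERVER = "http://localhost:8080"  # assumed address

def register_worker(executable: str) -> None:
    # `<executable> --dump-json` prints the processor's ocrd-tool.json entry
    ocrd_tool = json.loads(
        subprocess.run([executable, "--dump-json"],
                       capture_output=True, check=True).stdout)
    # POST to a hypothetical registration endpoint on the Processing Server
    resp = requests.post(f"{PROCESSING_SERVER}/processor",
                         json={"executable": executable, "ocrd_tool": ocrd_tool},
                         timeout=10)
    resp.raise_for_status()

register_worker("ocrd-tesserocr-recognize")
```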

bertsky commented 1 year ago

BTW I believe for the full Web API including /discovery, we would need central worker registration anyway.

MehmedGIT commented 1 year ago

So we seem to agree that all workers ( / processor queues) should be registered ( / created) centrally on the Processing Server (via endpoint or from configuration at startup)

This will come after #1030, which will already be big enough without handling that change inside it as well. What I currently have in mind for the near future is:

Any other suggestions/modifications?

MehmedGIT commented 1 year ago

Ideas for a bit later in time (not even sure for when):

Disclaimer: this will potentially be too time-consuming to implement and prone to errors without good automatic testing mechanisms for the entire network and its agents working together.

MehmedGIT commented 1 year ago

BTW I believe for the full Web API including /discovery, we would need central worker registration anyway.

True. We still need to think about how exactly this should happen, i.e., which network agent takes responsibility for the central registration. Currently, that is the Processing Server.

bertsky commented 1 year ago

BTW I believe for the full Web API including /discovery, we would need central worker registration anyway.

True. We still need to think about how exactly this should happen, i.e., which network agent takes responsibility for the central registration. Currently, that is the Processing Server.

Yes, it makes most sense there, because the Processing Server is the one that needs to know whom to talk to anyway. So via registration it has the ultimate source of truth on `processor_list` etc. and could provide its own /discovery, to which the Workflow Server's /discovery can delegate.

Deployments should also be backed by the database BTW, in case the PS crashes...
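A minimal sketch of both points, under assumed names throughout (endpoint paths, database and collections are not from this thread, and the real Processing Server's API may differ):

```python
# Sketch only: serve /discovery from the registration cache, and reload
# deployment records from the DB after a Processing Server restart.
from fastapi import FastAPI, HTTPException
from pymongo import MongoClient

app = FastAPI()
db = MongoClient("mongodb://localhost:27017")["ocrd"]  # hypothetical names

@app.get("/discovery/processors")
def list_processors() -> list:
    """processor_list built from the registered ocrd-tool.json documents."""
    return [doc["_id"] for doc in db["ocrd_tools"].find({}, {"_id": 1})]

@app.get("/processor/{executable}")
def get_processor(executable: str) -> dict:
    """Serve a processor's ocrd-tool.json without it being installed here."""
    doc = db["ocrd_tools"].find_one({"_id": executable})
    if not doc:
        raise HTTPException(404, f"Unknown processor: {executable}")
    return doc["ocrd_tool"]

@app.on_event("startup")
def restore_deployments() -> None:
    """Reload deployment records persisted to the DB, in case the
    Processing Server crashed and is coming back up."""
    for dep in db["deployments"].find():
        print("restoring connection to", dep["host"], dep["executable"])
```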

MehmedGIT commented 1 year ago

Yes, it makes most sense there, because the Processing Server is the one that needs to know whom to talk to anyway.

For processing, yes. What if the /discovery needs to be extended? Say the client wants to discover available Workspace/Workflow servers. Then the Deployer has the central knowledge of where things were deployed.

Deployments should also be backed by the database BTW, in case the PS crashes...

Agree.

MehmedGIT commented 1 year ago

the DeployerConfig will potentially be extended to be able to deploy Workflow Server and Workspace Servers (in the reference WebAPI impl) as well

the Deployer agent will deploy RabbitMQ Server, MongoDB ...

These are no longer valid... The RabbitMQ Server, MongoDB, Workflow Server, and Workspace Server will be deployed with docker-compose.

@tdoan2010

tdoan2010 commented 1 year ago

I don't know how the discussion drifted to this topic, which is not relevant to the title of this issue at all. But yes, the Processing Server will only be responsible for Processor Servers. The rest must be managed in some other way, outside the Processing Server.

The final goal is to have a docker-compose file that can be used to start up all necessary components.
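A hypothetical sketch of such a compose file; only the `rabbitmq` and `mongo` images are stock, while the OCR-D service image, command, and ports are assumptions for illustration:

```yaml
# Sketch only: one docker-compose file starting all necessary components.
version: "3.8"
services:
  rabbitmq:
    image: rabbitmq:3-management
    ports: ["5672:5672", "15672:15672"]
  mongodb:
    image: mongo:6
    ports: ["27017:27017"]
    volumes: ["mongodb-data:/data/db"]
  processing-server:
    image: ocrd/core  # assumed image
    # assumed command and config path
    command: ocrd network processing-server /config.yml
    ports: ["8080:8080"]
    depends_on: [rabbitmq, mongodb]
volumes:
  mongodb-data:
```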