OCR-D / core

Collection of OCR-related python tools and wrappers from @OCR-D
https://ocr-d.de/core/
Apache License 2.0
118 stars 31 forks source link

`ocrd network` CLI syntax and terminology #1032

Open kba opened 1 year ago

kba commented 1 year ago

From https://github.com/OCR-D/core/pull/974#issuecomment-1483588724

@MehmedGIT:

Okay, another note that will potentially be raised soon. Currently, the supported processing-worker CLI is still the old one:

# 1.
$ ocrd processing-worker <processor-name> --queue=<queue-address> --database=<database-address>

# 2.
$ <processor-name> --queue=<queue-address> --database=<database-address>

We (me and @joschrew) are aware that in the spec there is a change that is not adapted here, yet:

# 1. Use ocrd CLI bundled with OCR-D/core
$ ocrd server <processor-name> --type=worker --queue=<queue-address> --database=<database-address>

# 2. Use processor name
$ <processor-name> --server --type=worker --queue=<queue-address> --database=<database-address>

Where the --type can be either: worker (processing worker)

ocrd server <processor-name> --type=worker --queue=<queue-address> --database=<database-address>

or server (the REST API wrapper based on #884).

ocrd server <processor-name> --type=server --database=<database-address>

However, it's a bad idea to extend the processing worker code to support the REST API processing server (aka standalone processing worker that has nothing to do with the bigger Processing Server in the spec and does not need the queues).

Now, try to imagine a good way to explain to the OCR-D community without confusing them:

  1. why the <processor-name> is a server, but is not a server when --type=worker and is referred to with processing-worker and no direct requests can be sent
  2. why the <processor-name> is a server, and actual server when --type=server and is referred to with processing-server
  3. why both are grouped together under ocrd server... or ... --server ... but potentially implemented together under processing_worker.py
  4. why Processing Server (PS-big) and processing servers (PS-small), standalone ocrd processors, are different concepts but both referred to with processing server.

Sounds confusing? It's even more when it has to be implemented in a clean way. So there are 3 main reasons to not adapt that yet:

  • The priority changed a bit and the higher priority now is to deploy working Processing Server, Workflow Server, and Workspace Server together on a live VM instance, so KITODO can use that to continue their development.
  • Identify and fix problems that arise when combining the 3 servers above from 2 different repositories (Processing-Server #974 and reference WebAPI impl).
  • To not complicate the current implementation without first trying to think how to separate concepts properly and avoid potential problems. Ideally:
  1. the standalone processing servers (aka server processing workers) should be implemented as a separate class (name suggestions?), sibling of ProcessingWorker, and both will share the common methods.
  2. the CLI for both should be separated with improved naming conventions.
MehmedGIT commented 1 year ago

Thank you @kba! I have decided even against the --type=worker/server option because it shadows the type in Python... Instead, there are --agent_type=worker/server and --agent_address (when type is server) options. Check the #1030 implementation to see the currently used CLI syntax. I would suggest waiting on this topic till finishing #1030 for a review, then the CLI syntax will be revisited/refactored.

MehmedGIT commented 1 year ago

Some updates to this topic that are based on the current state of #1030.

Network modules/agents based on the WebAPI spec:

  1. Deployer Agent (currently used by the Processing Server to deploy other modules)
  2. Processing Server (processing proxy between Processing Workers and Processor Servers)
  3. Processing Worker
  4. Processor Server (Standalone REST OCR-D processor)
  5. RabbitMQ Server (The message broker between the Processing Server and the Processing Workers)
  6. MongoDB
  7. Workflow Server/Endpoint
  8. Workspace Server/Endpoint (Note: The Workspace/Workflow servers are currently available only here, and not part of the ocrd_network yet)
  9. Client

How are the network modules started, steps with example usage:

  1. ocrd network processing-server /path/to/processing-server-config --address <server-address>
  2. RabbitMQ and MongoDB are deployed on the addresses specified in the processing server configuration file with the aid of the Deployer agent which is currently part of the Processing Server.
  3. The Deployer agent, based on the configuration file, deploys:
    • Processing Workers: ocrd-* --type worker --database <address> --queue <address>

How is the client supposed to be used? There is no clear agreement yet, so consider everything that follows as my initial naive ideas! Before continuing with the Client CLI implementation, we need to decide what to support and what arguments/options to provide.

Currently, as part of #1030, only the processing/processor endpoint is (partially) implemented: ocrd network client processing processor <processor-name> --address <processing-server-address> ... arguments/options ... where the arguments and options are the well-known ones for the specific ocrd-processor. There are also some extensions (such as --result-queue-name and --callback-url) to allow more flexibility (when the client is another machine).

Then the processing status of a job can be checked with: ocrd network client processing status <job-id>. The job-id here is returned as a response to the previous request.

Similar to the example above, the client can support: ocrd network client <workspace/workflow/discovery> ... arguments/options ...

To keep the client CLI usage simple, the addresses of the servers can be configurated through environment variables. My current suggestion is the OCRD_NETWORK_SERVER_ADDR_* pattern where * can be one of PROCESSING/WORKSPACE/WORKFLOW. Clearly, we need more thoughts here for the standalone Processor Servers.

That's all from my side for now.

bertsky commented 1 year ago

Regarding the command line client, IMO it should be as consistent with the existing CLIs as possible. And I would prefer names of operations instead of HTTP mnemonics (POST/GET/PUT), for example:

So no status and no asynchronous / polling. (And that's independent of whether we make the Processor Server API itself blocking or not... the client could still implement a polling loop or callback or whatever is necessary.)

MehmedGIT commented 1 year ago

it should be as consistent with the existing CLIs as possible. And I would prefer names of operations instead of HTTP mnemonics (POST/GET/PUT)

Agree. I am also not a big fan of the HTTP mnemonics for the CLI.

Worth noting that the list option can get big really fast if the listing is not done just on user level when there are many users.

bertsky commented 1 year ago

Worth noting that the list option can get big really fast if the listing is not done just on user level when there are many users.

Yes, but adding user management like fastapi-users should not be difficult, IMO this is out of scope for core.

bertsky commented 1 year ago

@MehmedGIT we currently have the ocrd network client processing processor NAME --address ADDR --agent-type=worker (via publish_to_queue asynchronously) and --agent-type=server (via push_to_processor_server synchronously). The latter also uses the Processing Server – but shouldn't the client in that case try to connect to the Processor Server directly? Or would that be another independent client command (say ocrd network client processor ADDR)?

(Background: @joschrew in https://github.com/OCR-D/ocrd_all/pull/386 started implementing the ocrd_all-based deployment with docker compose for servers and the client CLIs – currently for the Processing Server model only. For the Processor Server model it would not make sense to use the Processing Server at all, but we have no client CLI yet.)

MehmedGIT commented 1 year ago

Or would that be another independent client command (say ocrd network client processor ADDR)?

It would be rather that to allow both ways.

For the Processor Server model it would not make sense to use the Processing Server at all, but we have no client CLI yet.

Right, the client CLI should be extended to support that.