kbaseincubator / jsonrpcbase

MIT License

Optional thread or process based batch execution #6

Open jayrbolton opened 4 years ago

jayrbolton commented 4 years ago

For a batch request, we could compute the responses in a thread or process pool. Execution could be serial by default, with a configuration option on the JSONRPCService class to enable a thread or process pool.
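A minimal sketch of how such an option might look, using the standard library's `concurrent.futures`. The `run_batch` helper, its `executor` parameter, and the `echo` handler are hypothetical names for illustration and are not part of jsonrpcbase.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, List, Optional


def run_batch(
    requests: List[dict],
    handle_one: Callable[[dict], Optional[dict]],
    executor: Optional[Executor] = None,
) -> List[dict]:
    """Hypothetical helper: run a batch serially, or on the given pool."""
    if executor is None:
        # Default: serial execution in request order.
        results = [handle_one(req) for req in requests]
    else:
        # Executor.map runs calls concurrently but yields results in input order.
        results = list(executor.map(handle_one, requests))
    # Notifications (requests without "id") yield None and get no response.
    return [res for res in results if res is not None]


def echo(req: dict) -> Optional[dict]:
    """Toy handler (assumption): echo params back, return None for notifications."""
    if "id" not in req:
        return None
    return {"jsonrpc": "2.0", "id": req["id"], "result": req.get("params")}


batch = [
    {"jsonrpc": "2.0", "id": 1, "method": "echo", "params": [1]},
    {"jsonrpc": "2.0", "method": "notify"},
    {"jsonrpc": "2.0", "id": 2, "method": "echo", "params": [2]},
]
serial = run_batch(batch, echo)
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded = run_batch(batch, echo, executor=pool)
```

Passing a `ProcessPoolExecutor` instead would also work with this shape, provided the handler and the requests are picklable.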

xpe commented 4 years ago

To quote the section we discussed on Friday:

https://www.jsonrpc.org/specification#batch 6 Batch

To send several Request objects at the same time, the Client MAY send an Array filled with Request objects.

The Server should respond with an Array containing the corresponding Response objects, after all of the batch Request objects have been processed. A Response object SHOULD exist for each Request object, except that there SHOULD NOT be any Response objects for notifications. The Server MAY process a batch rpc call as a set of concurrent tasks, processing them in any order and with any width of parallelism.

The Response objects being returned from a batch call MAY be returned in any order within the Array. The Client SHOULD match contexts between the set of Request objects and the resulting set of Response objects based on the id member within each Object.

If the batch rpc call itself fails to be recognized as an valid JSON or as an Array with at least one value, the response from the Server MUST be a single Response object. If there are no Response objects contained within the Response array as it is to be sent to the client, the server MUST NOT return an empty Array and should return nothing at all.
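The response rules quoted above can be sketched roughly as follows. This is an illustration only: `respond_to_batch` and `echo` are hypothetical names, and the sketch covers only the batch path, so anything that is not a non-empty array is treated as an invalid request here. The error codes -32700 and -32600 are the ones the spec defines for "Parse error" and "Invalid Request".

```python
import json
from typing import Callable, Optional

PARSE_ERROR = {"jsonrpc": "2.0", "id": None,
               "error": {"code": -32700, "message": "Parse error"}}
INVALID_REQUEST = {"jsonrpc": "2.0", "id": None,
                   "error": {"code": -32600, "message": "Invalid Request"}}


def respond_to_batch(
    raw: str, handle_one: Callable[[dict], Optional[dict]]
) -> Optional[str]:
    """Hypothetical helper: build the response body for a batch call."""
    try:
        batch = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON: a single Response object, not an array.
        return json.dumps(PARSE_ERROR)
    if not isinstance(batch, list) or not batch:
        # Not an array with at least one value: a single Response object.
        return json.dumps(INVALID_REQUEST)
    # Notifications produce no Response object (handle_one returns None).
    responses = [res for res in map(handle_one, batch) if res is not None]
    if not responses:
        # Never send an empty array; send nothing at all.
        return None
    return json.dumps(responses)


def echo(req: dict) -> Optional[dict]:
    """Toy handler (assumption): echo params, return None for notifications."""
    if "id" not in req:
        return None
    return {"jsonrpc": "2.0", "id": req["id"], "result": req.get("params")}


bad_json = respond_to_batch("[", echo)
empty = respond_to_batch("[]", echo)
only_notifications = respond_to_batch('[{"jsonrpc": "2.0", "method": "ping"}]', echo)
mixed = respond_to_batch(
    '[{"jsonrpc": "2.0", "id": 7, "method": "echo", "params": [1]},'
    ' {"jsonrpc": "2.0", "method": "ping"}]',
    echo,
)
```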

xpe commented 4 years ago

Speaking of batch calls, ordering (namely, the lack thereof), and parallelism -- to conform with the spec, the library's default behavior should match. That said, as Jay and I discussed, this library could potentially give the client more control over how the tasks in a batch are executed (such as ordering and parallelism). This would be an extension (one way or the other).

Putting aside whether this is a good/bad or useful/not useful idea, I would like to point out that the JSON-RPC 2.0 spec does not specify any extension mechanism (to my knowledge).

jayrbolton commented 4 years ago

For better parallelism control, we could have "system extension" calls. I don't have a use case for it, but just for fun:

"rpc.parallel" would allow you to create a DAG of tasks with input dependencies. In the example below, task1 and task2 execute in parallel, then task3 and task4 run in series after them:

{
    "jsonrpc": "2.0",
    "id": 0,
    "method": "rpc.parallel",
    "params": {
        "task1": {
            "method": "..."
        },
        "task2": {
            "method": "..."
        },
        "task3": {
            "input": ["task1", "task2"],
            "method": "..."
        },
        "task4": {
            "input": ["task3"],
            "method": "..."
        }
    }
}

You could also have the ability to exit the DAG on the first error and call a final function with the results.
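For illustration, a server-side sketch of how such a task graph might be executed, using `graphlib.TopologicalSorter` (Python 3.9+) and a thread pool. Nothing here is an existing jsonrpcbase API: `run_dag`, `call_method`, and the toy `METHODS` table are hypothetical, and passing dependency results as positional params is an assumption. Tasks run in waves: everything whose inputs are complete is submitted together, and the first exception aborts the run.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List


def run_dag(
    tasks: Dict[str, dict],
    call_method: Callable[[str, List[Any]], Any],
    max_workers: int = 4,
) -> Dict[str, Any]:
    """Hypothetical helper: run tasks in dependency order, ready ones in parallel."""
    graph = {name: set(spec.get("input", [])) for name, spec in tasks.items()}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results: Dict[str, Any] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose inputs are already available.
            futures = {
                name: pool.submit(
                    call_method,
                    tasks[name]["method"],
                    [results[dep] for dep in tasks[name].get("input", [])],
                )
                for name in sorter.get_ready()
            }
            for name, future in futures.items():
                # result() re-raises a task's exception, exiting on the first error.
                results[name] = future.result()
                sorter.done(name)
    return results


# Toy method table (assumption) standing in for registered RPC methods.
METHODS: Dict[str, Callable[..., Any]] = {
    "one": lambda: 1,
    "two": lambda: 2,
    "add": lambda a, b: a + b,
    "double": lambda a: 2 * a,
}


def call_method(method: str, inputs: List[Any]) -> Any:
    return METHODS[method](*inputs)


tasks = {
    "task1": {"method": "one"},
    "task2": {"method": "two"},
    "task3": {"input": ["task1", "task2"], "method": "add"},
    "task4": {"input": ["task3"], "method": "double"},
}
results = run_dag(tasks, call_method)
```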

xpe commented 4 years ago

Ah, the "rpc." prefix... now I see the extensions section at the bottom of the JSON-RPC 2.0 spec. This prompted me to dig around on JSON-RPC extensions. See Issue #17: "What is known about extensions?"

xpe commented 4 years ago

I've thought about this idea somewhat, and I've dug around the JSON-RPC discussion group for context and ideas.

If we frame the question as, "In what situations would extending JSON-RPC 2.0's execution model make sense?", I have some thoughts:

  1. If a system uses streaming, there might be a case for an extension that allows multiple responses for one request. See issue #18, "JSON-RPC is limited to one response per request".
  2. If a client wants to make a sequence of calls and wants to avoid many JSON-RPC round-trips, there might be a case for some kind of chained computation extension.
    • One kind of chained execution would appear simple: a linear sequence with 'bailout points'. Such an extension would need to design (at least): (1) how results from one method flow to the next and (2) how errors are handled.
    • Another kind of chained execution would be a DAG (directed acyclic graph), such as along the lines that @jayrbolton mentions above.
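As a rough illustration of the linear case, the sketch below feeds each method's result into the next and bails out at the first failure. The `run_chain` helper, the shape of its return value, and the toy `methods` table are all hypothetical; a real extension would still have to decide how results map onto params and how errors are reported.

```python
from typing import Any, Callable, Dict, List


def run_chain(
    steps: List[str],
    methods: Dict[str, Callable[[Any], Any]],
    initial: Any,
) -> dict:
    """Hypothetical helper: call methods in order, stopping at the first error."""
    value = initial
    for index, name in enumerate(steps):
        try:
            # The previous result becomes the single argument of the next method.
            value = methods[name](value)
        except Exception as exc:
            # Bailout point: report which step failed and skip the rest.
            return {"error": {"step": index, "method": name, "message": str(exc)}}
    return {"result": value}


# Toy method table (assumption) standing in for registered RPC methods.
methods: Dict[str, Callable[[Any], Any]] = {
    "inc": lambda x: x + 1,
    "recip": lambda x: 1 / x,
}

ok = run_chain(["inc", "inc"], methods, 0)
failed = run_chain(["recip", "inc"], methods, 0)
```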

Such an extension probably would need to ask many of these questions:

  1. So, there might be a lot of complexity involved with chained computations. Such complexity might be warranted if the client requires a lot of flexibility in specifying the relationships.

  2. But in a lot of cases, I expect the client would have a small number of computation configurations. If so, these could simply be 'baked-in' to an additional JSON-RPC method.

xpe commented 4 years ago

See also: Proposal: Extended Batch Mode (https://groups.google.com/d/msg/json-rpc/pLzHOSFJTbI/Hob1YDkzBwAJ) from the JSON-RPC discussions group.

The extended Batch Mode allows Requests to reference Results from other Requests in the Batch as param values. This mode would have side-effects on processing, the server has to process referenced Requests before referring requests.

jayrbolton commented 4 years ago

In the DAG model, results from a previous node would be passed directly into the next node, so would be in the format of jrpc method params.

I just want to emphasize here that we don't have use cases for this and I would bet it's unneeded.


xpe commented 4 years ago

I just want to emphasize here that we don't have use cases for this and I would bet it's unneeded.

I feel the same way.