Add `pending_requests` - Githubissues

krassowski commented 2 months ago

A part of a proposed solution to https://github.com/jupyter-server/jupyter_server/issues/990

Supersedes #197

krassowski commented 2 months ago

When user executes a long-running cell twice it could be:

{
   "pending_requests": [
      {
        # msg_id of an `shell.execute_request` (allowing to identify subsequent "iopub.status" by parent_header
        "id": "3ec6186d-dbd2-486e-9622-e771b4bdcc21",
        "type": "execute_request"
      },
      {
        # if user scheduled two executions of the same cell we will need to wait until both are cleared
        "id": "0acd4f22-e9cc-4dd1-969d-8aec793d987d",
        "type": "execute_request"
      }
   ]
}

When user runs a cell with input() it could be:

{
   "pending_requests": [
      {
        # the request for execution of a cell with `input()`
        "id": "6fd7fbd4-1c1d-4824-9c96-a57972b4be62",
        "type": "execute_request"
      },
      {
        # the response from the server
        "id": "0acd4f22-e9cc-4dd1-969d-8aec793d987d",
        "type": "input_request"
      }
   ]
}

Here is how an execution request and a subsequent iopub.status response looks on the wire:

And an execution followed by input_request - note no status reply yet because the input_request is pending:

Here are some thoughts:

do we want to prefix these properties with msg_ to align with msg_id and msg_type?
do we want to also store session ID from the header? In practice incoming messages would need to be compared against pending_requests and if msg_id is only unique within a given session then maybe we should store the session too

davidbrochart commented 2 months ago

I'm not sure we should go down the road of allowing concurrent cell execution. Aren't executions queued anyway?

krassowski commented 2 months ago

This is not about concurrent executions but queuing. We need to record the queue to resolve the status properly.

davidbrochart commented 2 months ago

In #197 there is an execution_state field which is equivalent to the status, why isn't that enough?

krassowski commented 2 months ago

To illustrate if I have a cell:

from time import sleep
sleep(60)

I execute it first time, the pending_requests would change to

[
  {
    "id": "3ec6186d-dbd2-486e-9622-e771b4bdcc21",
    "type": "execute_request"
  }
]

then while it is still running, say 5 seconds in, I execute it again; there are now two pending requests:

[
  {
    "id": "3ec6186d-dbd2-486e-9622-e771b4bdcc21",
    "type": "execute_request"
  },
  {
    "id": "0acd4f22-e9cc-4dd1-969d-8aec793d987d",
    "type": "execute_request"
  }
]

after another 55 seconds the first cell would finish and emit "idle" status; however, the cell should still display the "busy" status as there will be another 60 seconds until the second execution request completes and in the meantime the pending_requests will be:

[
  {
    "id": "0acd4f22-e9cc-4dd1-969d-8aec793d987d",
    "type": "execute_request"
  }
]

davidbrochart commented 2 months ago

But if we don't process new execution requests until the cell's execution state is idle, we don't need to record pending executions?

krassowski commented 2 months ago

In https://github.com/jupyter-server/jupyter_ydoc/pull/197 there is an execution_state field which is equivalent to the status, why isn't that enough?

Good question. There are at least two use cases which cannot be resolved with simple execution_state field alone, unless we move all execution logic from the frontend to the server:

recording timing
stdin input boxes

Currently, frontend records cell execution timing to cell metadata. If we only keep "status" flag there is no way for frontend to pick up the new message after reloading. Say I execute a cell twice with sleep(60) and then reload the page in 30th second. If all I get is "busy" status from the model, then when the "iopub.status" message with "idle" comes at 60th second I cannot reconcile it with the cell metadata. I can still stop the timer at 120th second because the model will then switch to "idle", but the timer will show execution time of 120 seconds (rather than 60 seconds).

Similarly for input boxes, the frontend needs to know if the input request is still pending. If it does not know, it can get into a kernel deadlock state.

Now, the current proposal is not ideal for the timing scenario either because if I run a cell with sleep(60) and close the browser tab, and then reopen after 70 seconds, then the frontend cannot set the elapsed time properly - all that it can infer is that the cell is no longer running (because pending_requests would be empty).

So maybe we should populate the cell execution timing metadata on the server. For reference, the timing code is here in JupyterLab. However, note, this is not only about setting the right timings, but also about notifying the extensions via a signal. Maybe it would be fine to populate metadata on server and in the scenario described earlier (a single cell executed twice with browser tab refreshed in the middle of the first execution) the frontend would watch to changes to that metadata field, and emit the signal respectively.

But if we don't process new execution requests until the cell's execution state is idle, we don't need to record pending executions?

See above, but also I don't think we have any control on when the kernel processes the execution requests. We could do as you say (don't process next until previous returned idle), but I think that the messaging protocol says that queuing is handled by the kernel and it is its implementation detail; changing it to what you propose should not be a problem unless there is a substantial latency between server and kernel so maybe for remote kernels.

davidbrochart commented 2 months ago

I think this underlines the need to move to server-side execution. With these changes, I feel that the kernel protocol is leaking into the shared model, but with server-side execution the kernel protocol is exactly what we want to get rid of (in frontends). I agree that timings should be recorded server-side. For input requests, my vision is that they should be collaborative anyway, maybe using ypywidgets. That way, every client will see the input box with live changes from other peers. I agree that what I'm proposing needs a lot of changes in different parts, but I think this should be the direction. And I fear that if we don't take that direction we are going to end up in a mixed state where the CRDT-based architecture will be "polluted" by low-level details of the non-CRDT system that we are precisely trying to get rid of.

krassowski commented 2 months ago

For input requests, my vision is that they should be collaborative anyway, maybe using ypywidgets. That way, every client will see the input box with live changes from other peers.

I can see that for plain text input, but password boxes would need special treatment (from getpass import getpass; getpass())?

davidbrochart commented 2 months ago

We could imagine a special widget that would not show the password, but would still allow any client to enter it?

jupyter-server / jupyter_ydoc

Add `pending_requests` #227