New kernels REST API

davidbrochart commented 2 years ago

Problem

Currently, the kernel protocol over ZMQ in the back-end is "forwarded" to the front-end over a WebSocket. This brings a lot of complexity to the front-end, which has to speak this protocol. With JupyterLab moving to RTC, the (notebook) UI becomes merely a shared document editor: the user enters some text in a cell, asks for execution, sees outputs being populated, clears outputs... There is no need for the front-end to e.g. interpret IOPub messages in order to display results. All of this could be done in the back-end, which would directly modify the shared document, and the front-end would update automatically.

Proposed Solution

We could create a much simpler REST API for kernels, which would consist of e.g. "POST execute request", etc. All kernel state information (idle, busy, dead, restarting...) could go to the new event system.

davidbrochart commented 2 years ago

For the back-end to be able to make changes to a shared document based on some code execution, the back-end has to know which (part of a) document the code is linked to. For instance in a notebook, a cell input contains code that will produce outputs in the same cell. This means that this REST API must provide information about the document to modify, as well as the kernel to use. A notebook cell "POST execute request" should consist of the following body:

{
  "document_id": "path/to/notebook.ipynb",
  "cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
  "kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012"
}
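To make the shape of that body concrete, here is a minimal sketch of how a server might parse and validate it. The class name `ExecuteRequestBody` and its `from_json` helper are hypothetical and only illustrate the three fields listed above; they are not part of any existing Jupyter API.

```python
import json
from dataclasses import dataclass

# Field names taken from the body proposed above.
REQUIRED_FIELDS = ("document_id", "cell_id", "kernel_id")


@dataclass(frozen=True)
class ExecuteRequestBody:
    """Hypothetical container for the proposed "POST execute request" body."""

    document_id: str  # which shared document the back-end should modify
    cell_id: str      # which cell of that document receives the outputs
    kernel_id: str    # which kernel runs the cell's code

    @classmethod
    def from_json(cls, raw: str) -> "ExecuteRequestBody":
        data = json.loads(raw)
        # Reject bodies that lack a field or carry an empty/non-string value.
        bad = [
            name
            for name in REQUIRED_FIELDS
            if not isinstance(data.get(name), str) or not data[name]
        ]
        if bad:
            raise ValueError(f"missing or invalid fields: {', '.join(bad)}")
        return cls(*(data[name] for name in REQUIRED_FIELDS))
```

Note that the body carries no code: under this proposal the back-end would read the cell source from the shared document identified by `document_id` and `cell_id`.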

Zsailer commented 2 years ago

I would be in favor of extending the current kernels API in core to include a new REST endpoint for executing code. Something like

POST /api/kernels/<kernel_id>/execute

Body: 
{
  "document_id": "path/to/notebook.ipynb",
  "cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
  "kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012",
  "code": ...
}

Response:
{
  "document_id": "path/to/notebook.ipynb",
  "cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
  "kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012"
}

This wouldn't require a websocket to the kernel for two-way comms. It simply sends an execute_request to the kernel. On the server, this request would be handled by the KernelManager and formulated into a proper ZMQ message to be sent over the kernel's shell channel.
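For reference, a rough sketch of what "formulating" the request could look like. The message layout follows the Jupyter messaging protocol's `execute_request`; the function name `make_execute_request` and the default `username` are assumptions made for illustration. In a real server, jupyter_client would build, sign and serialize the message before it goes out on the shell channel.

```python
import uuid
from datetime import datetime, timezone


def make_execute_request(code: str, session_id: str, username: str = "server") -> dict:
    """Build an unsigned execute_request message as a plain dict (illustrative only)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,  # unique id; replies reference it in parent_header
            "session": session_id,
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",  # messaging protocol version
        },
        "parent_header": {},  # empty: this message starts a new exchange
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "buffers": [],
    }
```

The `msg_id` in the header is what lets the server match later IOPub messages to this particular execution.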

I don't think we should replace any of the current kernel REST/WebSocket APIs, though. The kernel websocket APIs are still heavily used by projects other than JupyterLab. We don't want those endpoints to degrade or be deprecated anytime soon.

As for checking kernel status, I think it also makes sense to add a GET /api/kernels/<kernel_id>/status endpoint that sends a message to IOPub for kernel status and waits for a response. The event bus websocket is also a good way to shuttle these messages.
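One possible way to back such a status endpoint, sketched under assumptions: kernels publish `status` messages on IOPub whose `content.execution_state` is e.g. `busy` or `idle`, so the server could remember the last state it saw per kernel and answer the GET from that cache. The `KernelStatusTracker` class below is hypothetical and is not an existing jupyter_server component.

```python
class KernelStatusTracker:
    """Hypothetical cache of the last execution state seen per kernel."""

    def __init__(self) -> None:
        self._states: dict[str, str] = {}

    def on_iopub_message(self, kernel_id: str, msg: dict) -> None:
        # Only IOPub "status" messages carry the execution state.
        if msg["header"]["msg_type"] == "status":
            self._states[kernel_id] = msg["content"]["execution_state"]

    def status(self, kernel_id: str) -> str:
        # "unknown" is a placeholder chosen for this sketch, for kernels
        # that have not published any status message yet.
        return self._states.get(kernel_id, "unknown")
```

The same `on_iopub_message` hook could also forward each state change to the event bus mentioned above.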

davidbrochart commented 1 year ago

This work is happening in https://github.com/jupyter-server/jupyverse/pull/191.

echarles commented 1 year ago

There is no need for the front-end to e.g. interpret IOPub messages in order to display results,

What would be the impact on the Comm messages used by ipywidgets?

davidbrochart commented 1 year ago

That's a good point. Currently, ipywidgets won't work with this, but there are discussions about using y-py in ipywidgets, which would be very aligned with this REST API in my opinion.

echarles commented 1 year ago

but there are discussions about using y-py in ipywidgets

Any pointers? Were any backward-compatibility requirements considered in that discussion?

davidbrochart commented 1 year ago

I think it has been discussed in some meetings, but it's probably not going to happen anytime soon. Regardless of ipywidgets, this kernels REST API could make it easier for new collaborative front-ends which don't want to bother about low-level kernel protocols. I think it can lower the bar quite a lot in that regard.

vidartf commented 1 year ago

I think it would be important to make it possible for clients to (easily) combine RTC experience with Comms. Having an example of how to do this would be helpful (but I agree that it might not be needed in the first iteration).

davidbrochart commented 1 year ago

I opened an issue in ipywidgets.

Wh1isper commented 1 year ago

I implemented a simple generic client based on the websocket interface (I tried using jupyter_client's zmq interface, but it was hard to use), and then implemented a plugin with fileid support to implement this functionality.

client: https://github.com/Wh1isper/jupyter_kernel_client
extension: https://github.com/Wh1isper/jupyter_kernel_executor

Looking forward to your suggestions! @davidbrochart @Zsailer

kevin-bates commented 1 year ago

I would strongly suggest you use the AsyncMappingKernelManager. That way your kernels will be managed by the ServerKernelManager where you'll be able to leverage future functionality.

fcollonval commented 1 month ago

We have started work on a jupyter_server extension at https://github.com/datalayer/jupyter-server-nbmodel

davidbrochart commented 1 month ago

This repo doesn't seem to be public?

fcollonval commented 1 month ago

Thanks @davidbrochart for the ping - it should now be public

davidbrochart commented 1 month ago

Confirmed, thanks! @Zsailer might be interested too.

echarles commented 1 month ago

Looks like the current implementations target two goals:

1. Run some code via an HTTP POST endpoint (losing the "streaming" feature of the output).
2. Have an endpoint to move the complete model to the server.

Do we want to serve those two goals via the same endpoint, or should we separate them? (PS: This may have an impact on the existing JupyterLab implementation.)

davidbrochart commented 1 month ago

Run some code via an HTTP POST endpoint (losing the "streaming" feature of the output).

Why would output streams be lost?

fcollonval commented 1 month ago

Demo for https://github.com/datalayer/jupyter-server-nbmodel using RTC model to update the notebook model on the server side from the kernel outputs (with https://github.com/jupyterlab/jupyter-collaboration/pull/307)

(Animated demo: demo_server_side_execution_rtc)
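As a simplified illustration of "updating the notebook model on the server side from the kernel outputs", the sketch below appends an output to the right cell of a notebook held as a plain nbformat-style dict. This is an assumption made to keep the example self-contained: the actual extension works on a shared RTC document, not a dict, and `append_output` is a hypothetical helper.

```python
def append_output(notebook: dict, cell_id: str, output: dict) -> dict:
    """Append one output to the code cell identified by cell_id and return that cell."""
    for cell in notebook["cells"]:
        if cell.get("id") == cell_id:
            if cell["cell_type"] != "code":
                raise ValueError(f"cell {cell_id!r} is not a code cell")
            # Create the outputs list on first use, then append in arrival order.
            cell.setdefault("outputs", []).append(output)
            return cell
    raise KeyError(f"no cell with id {cell_id!r}")
```

With a shared document, the equivalent change would be propagated to every connected client automatically, which is what lets a collaborator see outputs they did not request.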

echarles commented 1 month ago

Why would output streams be lost?

I meant that in case 1 (run some code via an HTTP POST without RTC), the server needs to wait for the end of the execution before sending back the result. In that case, the client receives the complete outputs at once, so it cannot show the user the progress as a stream. Does that make sense?
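A sketch of what that "wait, then return everything" behaviour could look like on the server, assuming the handler has access to the kernel's IOPub messages as dicts. The function `collect_outputs` is hypothetical; the message types (`stream`, `display_data`, `execute_result`, `error`, `status`) come from the Jupyter messaging protocol, and the returned dicts follow the nbformat output layout.

```python
from typing import Iterable


def collect_outputs(iopub_messages: Iterable[dict], request_id: str) -> list:
    """Gather the outputs of one execution until the kernel reports idle."""
    outputs: list = []
    for msg in iopub_messages:
        # Ignore messages caused by other requests.
        if msg.get("parent_header", {}).get("msg_id") != request_id:
            continue
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]
        if msg_type == "status" and content["execution_state"] == "idle":
            break  # execution finished; only now can the HTTP response be sent
        if msg_type == "stream":
            last = outputs[-1] if outputs else None
            if last and last["output_type"] == "stream" and last["name"] == content["name"]:
                last["text"] += content["text"]  # merge consecutive chunks
            else:
                outputs.append(
                    {"output_type": "stream", "name": content["name"], "text": content["text"]}
                )
        elif msg_type in ("display_data", "execute_result"):
            output = {
                "output_type": msg_type,
                "data": content["data"],
                "metadata": content["metadata"],
            }
            if msg_type == "execute_result":
                output["execution_count"] = content["execution_count"]
            outputs.append(output)
        elif msg_type == "error":
            outputs.append(
                {
                    "output_type": "error",
                    "ename": content["ename"],
                    "evalue": content["evalue"],
                    "traceback": content["traceback"],
                }
            )
    return outputs
```

Because the list is returned only after the idle status arrives, the client sees nothing until the cell has finished, which is the loss of streaming described above.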

davidbrochart commented 1 month ago

Run some code via an HTTP POST without RTC

I don't see a lot of value in that.

Wh1isper commented 1 month ago

Run some code via an HTTP POST without RTC

In this mode, the websocket (the current JupyterLab implementation) seems to be the only option.

Wh1isper commented 1 month ago

Demo for datalayer/jupyter-server-nbmodel using RTC model to update the notebook model on the server side from the kernel outputs (with https://github.com/jupyterlab/jupyter-collaboration/pull/307)

Thanks for creating this awesome project!

I took a quick look at the code and was left with two questions:

Feel free to discuss them with me anytime! :D

fcollonval commented 1 month ago

Update on the jupyter server extension:

Support stdin
Workish with ipywidgets

Prerequisites:

In the demo, the notebook is executed only by the user on the left; the right view is an RTC collaborator who is only viewing:

https://github.com/jupyter-server/jupyter_server/assets/8435071/bfd044a6-5fe2-4d76-8622-e0dfe12363b1

[!NOTE]
The input prompt only appears for the user starting the execution request.
The fake output keeping track of the input value is added to the document by the left client (this is not ideal).
ipywidgets may or may not render at first execution (it smells like a race condition), as can be seen with the button and slider that appear in one of the clients but not in both.
ipywidgets are correctly instantiated when closing and reopening a document (small victory :v: ).

I added some sequence diagrams to explain what is going on there.

jupyter-server / jupyter_server

New kernels REST API #900
