Open davidbrochart opened 2 years ago
For the back-end to be able to make changes to a shared document based on some code execution, the back-end has to know which (part of a) document the code is linked to. For instance in a notebook, a cell input contains code that will produce outputs in the same cell. This means that this REST API must provide information about the document to modify, as well as the kernel to use. A notebook cell "POST execute request" should consist of the following body:
{
"document_id": "path/to/notebook.ipynb",
"cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
"kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012"
}
I would be in favor of extending the current kernels API in core to include a new REST endpoint for executing code. Something like
POST /api/kernels/<kernel_id>/execute
Body:
{
"document_id": "path/to/notebook.ipynb",
"cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
"kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012",
"code": ...
}
Response:
{
"document_id": "path/to/notebook.ipynb",
"cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
"kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012"
}
This wouldn't require a WebSocket to the kernel for two-way comms. It simply sends an execute_request to the kernel. On the server, this request would be handled by the KernelManager and formulated into a proper ZMQ message to be sent over the kernel's shell channel.
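As a rough illustration of the step described above, here is a minimal sketch of how a handler could turn the REST body into an `execute_request` message. The message layout follows the Jupyter messaging protocol; the function names (`build_execute_request`, `handle_execute_body`), the `session_id` argument and the validation rules are assumptions for illustration, not part of jupyter_server. Signing and sending the message over ZMQ is left out.

```python
import uuid
from datetime import datetime, timezone

REQUIRED_FIELDS = ("document_id", "cell_id", "kernel_id", "code")


def build_execute_request(code: str, session_id: str, username: str = "server") -> dict:
    """Build an execute_request message (Jupyter messaging protocol) as a plain dict."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session_id,
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "buffers": [],
    }


def handle_execute_body(body: dict, session_id: str) -> tuple:
    """Hypothetical handler core: return (kernel message, REST response body)."""
    missing = [key for key in REQUIRED_FIELDS if key not in body]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    message = build_execute_request(body["code"], session_id)
    # The response echoes the identifiers, as in the proposal; the code is not echoed.
    response = {key: body[key] for key in ("document_id", "cell_id", "kernel_id")}
    return message, response


example_body = {
    "document_id": "path/to/notebook.ipynb",
    "cell_id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
    "kernel_id": "c71c23dd-38ba-4967-aa46-415ba7317012",
    "code": "print(1)",
}
example_message, example_response = handle_execute_body(example_body, session_id="demo-session")
```

A real handler would presumably also remember which `msg_id` belongs to which `(document_id, cell_id)` pair, so that later IOPub messages can be routed to the right cell via their `parent_header`.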
I don't think we should replace any of the current kernel REST/WebSocket APIs, though. The kernel WebSocket APIs are still heavily used by projects other than JupyterLab. We don't want those endpoints to degrade or be deprecated anytime soon.
As for checking kernel status, I think it also makes sense to add a GET /api/kernels/<kernel_id>/status endpoint that sends a message to IOPub for kernel status and waits for a response. The event bus WebSocket is also a good way to shuttle these messages.
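To make the status idea concrete, here is a small sketch of how a server could derive the value for such an endpoint from IOPub traffic. In the Jupyter messaging protocol, `status` messages carry an `execution_state` field in their content. The function names and the response shape are assumptions for illustration only.

```python
def latest_kernel_state(iopub_messages, default="unknown"):
    """Return the execution_state of the most recent IOPub status message."""
    state = default
    for msg in iopub_messages:
        if msg.get("header", {}).get("msg_type") == "status":
            state = msg["content"]["execution_state"]
    return state


def status_response(kernel_id, iopub_messages):
    """Hypothetical body for GET /api/kernels/<kernel_id>/status."""
    return {"kernel_id": kernel_id, "execution_state": latest_kernel_state(iopub_messages)}


seen_messages = [
    {"header": {"msg_type": "status"}, "content": {"execution_state": "busy"}},
    {"header": {"msg_type": "stream"}, "content": {"name": "stdout", "text": "hi\n"}},
    {"header": {"msg_type": "status"}, "content": {"execution_state": "idle"}},
]
example_status = status_response("c71c23dd-38ba-4967-aa46-415ba7317012", seen_messages)
```

The same dictionaries could be emitted on the event bus whenever the state changes, instead of being polled.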
This work is happening in https://github.com/jupyter-server/jupyverse/pull/191.
There is no need for the front-end to e.g. interpret IOPub messages in order to display results,
What would be the impact on the Comm messages used by ipywidgets?
That's a good point. Currently ipywidgets won't work with this, but there are discussions about using y-py in ipywidgets, which would be very aligned with this REST API in my opinion.
but there are discussions about using y-py in ipywidgets
Any pointers? Were any backwards-compatibility requirements considered in that discussion?
I think it has been discussed in some meetings, but it's probably not going to happen anytime soon. Regardless of ipywidgets, this kernels REST API could make it easier for new collaborative front-ends which don't want to bother about low-level kernel protocols. I think it can lower the bar quite a lot in that regard.
I think it would be important to make it possible for clients to (easily) combine RTC experience with Comms. Having an example of how to do this would be helpful (but I agree that it might not be needed in the first iteration).
I opened an issue in ipywidgets.
I implemented a simple generic client based on the websocket interface (I tried using jupyter_client's zmq interface, but it was hard to use), and then implemented a plugin with fileid support to implement this functionality.
client: https://github.com/Wh1isper/jupyter_kernel_client
extension: https://github.com/Wh1isper/jupyter_kernel_executor
Looking forward to your suggestions! @davidbrochart @Zsailer
I would strongly suggest you use the AsyncMappingKernelManager. That way your kernels will be managed by the ServerKernelManager, where you'll be able to leverage future functionality.
We have started work on a jupyter_server extension at https://github.com/datalayer/jupyter-server-nbmodel
This repo doesn't seem to be public?
Thanks @davidbrochart for the ping - it should now be public
Confirmed, thanks! @Zsailer might be interested too.
Looks like the current implementations target two goals:
Do we want to serve those two goals via the same endpoint, or should we separate them? (PS: This may have an impact on the existing JupyterLab implementation.)
- Run some code via an HTTP POST endpoint (losing the "streaming" feature of the output).
Why would output streams be lost?
Demo for https://github.com/datalayer/jupyter-server-nbmodel using RTC model to update the notebook model on the server side from the kernel outputs (with https://github.com/jupyterlab/jupyter-collaboration/pull/307)
Why would output streams be lost?
I meant that in case 1 (run some code via an HTTP POST without RTC), the server needs to wait for the end of the execution before sending back the result. In that case, the client receives the complete outputs at once, so it cannot show the user the progress as a stream. Does that make sense?
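The difference can be sketched as follows. A generator yields each output as soon as the matching IOPub message arrives (what an RTC-backed server could push into the shared document), while the plain HTTP POST variant has to drain the generator until the kernel reports `idle` before it can respond. The message fields follow the Jupyter messaging protocol and the output dictionaries follow the nbformat output schema; the function names and the `_msg` helper are assumptions for illustration.

```python
def iter_outputs(iopub_messages, request_msg_id):
    """Yield nbformat-style outputs for one request, stopping when the kernel is idle."""
    for msg in iopub_messages:
        # Only consider messages produced in response to our execute_request.
        if msg.get("parent_header", {}).get("msg_id") != request_msg_id:
            continue
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]
        if msg_type == "stream":
            yield {"output_type": "stream", "name": content["name"], "text": content["text"]}
        elif msg_type in ("execute_result", "display_data"):
            output = {
                "output_type": msg_type,
                "data": content["data"],
                "metadata": content.get("metadata", {}),
            }
            if msg_type == "execute_result":
                output["execution_count"] = content["execution_count"]
            yield output
        elif msg_type == "error":
            yield {
                "output_type": "error",
                "ename": content["ename"],
                "evalue": content["evalue"],
                "traceback": content["traceback"],
            }
        elif msg_type == "status" and content["execution_state"] == "idle":
            return


def collect_outputs(iopub_messages, request_msg_id):
    """Blocking variant for a plain HTTP POST: nothing is returned until idle."""
    return list(iter_outputs(iopub_messages, request_msg_id))


def _msg(msg_type, content, parent="req-1"):
    # Hypothetical helper that builds a trimmed-down IOPub message for the demo.
    return {"header": {"msg_type": msg_type}, "parent_header": {"msg_id": parent}, "content": content}


demo_messages = [
    _msg("status", {"execution_state": "busy"}),
    _msg("stream", {"name": "stdout", "text": "0\n"}),
    _msg("stream", {"name": "stdout", "text": "other\n"}, parent="req-2"),
    _msg("stream", {"name": "stdout", "text": "1\n"}),
    _msg("status", {"execution_state": "idle"}),
    _msg("stream", {"name": "stdout", "text": "late\n"}),
]
demo_outputs = collect_outputs(demo_messages, "req-1")
```

With `iter_outputs`, each yielded output could be written to the document immediately; with `collect_outputs`, the client only sees the full list once execution has finished.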
Run some code via a HTTP POST without RTC
I don't see a lot of value in that.
Run some code via a HTTP POST without RTC
In this mode, the WebSocket (the current JupyterLab implementation) seems to be the only option.
Demo for datalayer/jupyter-server-nbmodel using RTC model to update the notebook model on the server side from the kernel outputs (with https://github.com/jupyterlab/jupyter-collaboration/pull/307)
Thanks for creating this awesome project!
I took a quick look at the code and came away with two questions:
Feel free to discuss them with me anytime! :D
Update on the jupyter server extension:
Prerequisites:
In the demo, the notebook is executed only by the user on the left; the right view is an RTC collaborator who is only viewing:
https://github.com/jupyter-server/jupyter_server/assets/8435071/bfd044a6-5fe2-4d76-8622-e0dfe12363b1
[!NOTE]
- The input prompt only appears for the user starting the execution request.
- The fake output keeping track of the input value is added to the document by the left client (this is not ideal).
- ipywidgets may or may not render at first execution (it smells like a race condition), as can be seen with the button and slider that appear in one of the clients but not in both.
- ipywidgets are correctly instantiated when closing and reopening a document (small victory :v:).
I added some sequence diagrams to explain what is going on there.
Problem
Currently, the kernel protocol over ZMQ in the back-end is "forwarded" to the front-end over a WebSocket. This brings a lot of complexity to the front-end, which has to speak this protocol. With JupyterLab moving to RTC, the (notebook) UI becomes merely a shared document editor: the user enters some text in a cell, asks for execution, sees outputs being populated, clears outputs... There is no need for the front-end to e.g. interpret IOPub messages in order to display results, all this could be done in the back-end and directly modify the shared document, which would automatically update in the front-end.
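As a rough sketch of what "directly modify the shared document" could look like on the back-end, the snippet below applies outputs to a cell identified by its `cell_id`. It uses a plain dictionary in nbformat 4.5 layout as a stand-in for the shared (RTC) document; a real implementation would presumably operate on the shared model instead. The function names are assumptions for illustration.

```python
def find_cell(notebook: dict, cell_id: str) -> dict:
    """Return the cell with the given id (nbformat >= 4.5 stores an "id" per cell)."""
    for cell in notebook["cells"]:
        if cell.get("id") == cell_id:
            return cell
    raise KeyError(f"no cell with id {cell_id!r}")


def clear_outputs(notebook: dict, cell_id: str) -> None:
    """Reset a code cell before a new execution starts."""
    cell = find_cell(notebook, cell_id)
    cell["outputs"] = []
    cell["execution_count"] = None


def append_output(notebook: dict, cell_id: str, output: dict) -> None:
    """Append one kernel output to the cell; front-ends only observe the document."""
    find_cell(notebook, cell_id)["outputs"].append(output)


demo_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "id": "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
            "cell_type": "code",
            "source": "print('hello')",
            "metadata": {},
            "execution_count": 3,
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "stale\n"}],
        }
    ],
}
clear_outputs(demo_notebook, "b4d4a3fe-bc05-4965-b5eb-94732e4329ae")
append_output(
    demo_notebook,
    "b4d4a3fe-bc05-4965-b5eb-94732e4329ae",
    {"output_type": "stream", "name": "stdout", "text": "hello\n"},
)
```

In this model the front-end never interprets IOPub messages; it only renders whatever ends up in the cell's `outputs`.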
Proposed Solution
We could create a much simpler REST API for kernels, which would consist of e.g. "POST execute request", etc. All kernel state information (idle, busy, dead, restarting...) could go to the new event system.