Collaborative architecture documentation

cschuet commented 1 year ago

I am trying to understand how the y-py integration works in Jupyter, as I am generally interested in collaborative software, and after some research Yjs seems like one of the best approaches I could find. Thanks for that btw!!

I read ypy_websocket, jupyterlab_collaboration and jupyter_ydoc and have a fair understanding now of how the document sync between clients and backend works.

One thing I was wondering: I got the impression that cell execution actions are communicated outside of the ypy-based collaborative document sync. If those are two separate communication channels, how are race conditions avoided? E.g. if I modify a cell and immediately execute it, what ensures that my execution action is processed only after the document on the server has been synced to my local state?

davidbrochart commented 1 year ago

You're right, cell execution goes through the kernel protocol over WebSocket. There is not much documentation about it, but maybe this PR can help in understanding it. This is a different WebSocket from the one used for the Y-document synchronization.

Let's take a look at what happens. When the user executes a cell, the execution request goes from the frontend to the backend through the kernel protocol WebSocket, and the server forwards it to the kernel over ZMQ sockets. The kernel executes the code and sends a reply back over ZMQ, which the server forwards to the frontend over WebSocket. If the reply has some outputs, the frontend shows them and modifies the Y-document. Changes to the Y-document are sent to the backend, but this time through the Y-WebSocket. The server then forwards the changes to every other client of the document.

As you can see, there is no race condition. Also, keep in mind that the magic of CRDTs is that it doesn't matter too much in what order changes are processed. Conflicts will always be resolved consistently and all clients will have the same shared document state, eventually.

But this situation is certainly not optimal because there are unneeded round-trips. I see it as a transition until JupyterLab only deals with a shared document and doesn't handle the kernel protocol directly. Jupyverse already has this new execution mode, where the client only POSTs a cell execution order and lets the backend talk to the kernel over ZMQ. Any outputs produced by the kernel are used directly to modify the Y-document in the backend, which is automatically synced with the frontend. This has not officially landed in jupyter-server, but there is an issue about it.

There are still issues with Jupyter widgets, but here again I think that the solution is for them to use Ypy/Yjs as their synchronization infrastructure. Work has started in that direction in ypywidgets (see also https://github.com/jupyter-widgets/ipywidgets/issues/3695).
I hope that helps clarify the picture, which I admit is not simple.
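
To make the CRDT point concrete, here is a minimal sketch of order-independent convergence. It uses a last-writer-wins register, which is a deliberately simple CRDT and not the algorithm Yjs/Ypy use internally; the class name and the update tuples are made up for illustration.

```python
from itertools import permutations


class LWWRegister:
    """Last-writer-wins register: a very small CRDT (illustration only)."""

    def __init__(self):
        self.stamp = (0, "")  # (logical clock, client id) of the winning write
        self.value = None

    def apply(self, clock, client, value):
        # Keep the write with the greatest (clock, client) pair. Ties on the
        # clock are broken by client id, so every replica picks the same winner.
        if (clock, client) > self.stamp:
            self.stamp = (clock, client)
            self.value = value


# Three concurrent writes to the same cell source (hypothetical data).
updates = [(1, "alice", "x = 1"), (2, "bob", "x = 2"), (2, "alice", "x = 3")]

# Deliver the updates in every possible order, each time to a fresh replica.
final_values = set()
for order in permutations(updates):
    replica = LWWRegister()
    for update in order:
        replica.apply(*update)
    final_values.add(replica.value)

print(final_values)
```

Whatever the delivery order, every replica ends up with the same value, which is the property the paragraph about CRDTs relies on.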

cschuet commented 1 year ago

Thanks for the explanation, David, and also the links and pointers. That gives me some more reading material.

I assume the execution request that goes from the frontend to the backend contains the code to be executed and not merely a reference to the cell to be executed. In that case, agreed, there is no race condition.

Is that still the case in the new world of POST-execution-order? Or do you have to make sure that all the Y-Doc changes the user made in the cell they pressed Ctrl+Return on have successfully propagated to the server before actually running the execution?

I guess sending a snapshot of the document (or rather of the cell) that the user pressed Ctrl+Return on along with the execution request is most aligned with user expectations.
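
To illustrate the concern, here is a toy model, assuming a hypothetical server that keeps its own copy of each cell's source and receives document updates and execution orders on independent channels. `ToyServer` and its methods are invented for this sketch and are not Jupyter or Jupyverse APIs.

```python
class ToyServer:
    """Hypothetical backend holding its own copy of the shared document."""

    def __init__(self, cells):
        self.cells = dict(cells)  # cell id -> source, as known to the server

    def on_doc_update(self, cell_id, source):
        # Stands in for a change arriving over the Y-WebSocket.
        self.cells[cell_id] = source

    def execute_by_reference(self, cell_id):
        # POST-style order: only the cell id is sent, so the server runs
        # whatever source it currently holds for that cell.
        return self.cells[cell_id]

    def execute_with_snapshot(self, cell_id, source):
        # Snapshot-style order: the request carries the source the user saw.
        return source


# The user changes "x = 1" to "x = 2" and presses Ctrl+Return right away.
# Assume the execution order overtakes the document update on the wire.
server = ToyServer({"cell-1": "x = 1"})
ran_by_reference = server.execute_by_reference("cell-1")
ran_with_snapshot = server.execute_with_snapshot("cell-1", "x = 2")
server.on_doc_update("cell-1", "x = 2")  # the update arrives too late

print(ran_by_reference)   # stale source
print(ran_with_snapshot)  # source the user actually saw
```

In this model the by-reference order runs the stale source, while the snapshot order runs what the user saw, regardless of when the document update lands.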

davidbrochart commented 1 year ago

> I assume the execution request that goes from the frontend to the backend contains the code to be executed and not merely a reference to the cell to be executed.

Yes.

> Is that still the case in the new world of POST-execution-order?

No, and in this case I agree that there could indeed be a race condition. That's a good point. I'm not sure we want to send the code along with the execution request though, since it would compete with the shared cell content and potentially create other issues. Maybe a solution would be to make the document execution state a shared property of the document, so that the execution request also goes through the Y-WebSocket instead of a parallel REST API?
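
A rough sketch of that idea, under the assumption that all changes from one client travel over a single ordered connection: if the execution request is just another document change, it can never be applied before the edits that preceded it. The tuple-shaped updates and `ToyBackend` are hypothetical; real Y updates are binary and this is not an existing API.

```python
from collections import deque


class ToyBackend:
    """Hypothetical backend applying document changes in arrival order."""

    def __init__(self):
        self.source = {}    # cell id -> source
        self.executed = []  # (cell id, source that was run)

    def apply(self, update):
        kind, cell_id, payload = update
        if kind == "set_source":
            self.source[cell_id] = payload
        elif kind == "request_execution":
            # The request is part of the same ordered stream, so every edit
            # made before it has already been applied at this point.
            self.executed.append((cell_id, self.source[cell_id]))


# A deque stands in for the single, ordered Y-WebSocket connection.
channel = deque()
channel.append(("set_source", "cell-1", "x = 2"))      # user edits the cell
channel.append(("request_execution", "cell-1", None))  # then runs it

backend = ToyBackend()
while channel:
    backend.apply(channel.popleft())

print(backend.executed)
```

The design trade-off is that execution state becomes part of the shared document, so no code needs to be sent alongside the request and nothing competes with the shared cell content.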

jupyterlab / jupyter-collaboration

Collaborative architecture documentation #94