jupyterlab / jupyter-collaboration

A Jupyter Server Extension Providing Support for Y Documents
https://jupyterlab-realtime-collaboration.readthedocs.io/en/latest/
Other
157 stars 30 forks source link

Fix for code execution on the Jupyter Server #307

Closed fcollonval closed 2 months ago

fcollonval commented 4 months ago

Working on https://github.com/datalayer/jupyter-server-nbmodel, we are facing issues with the current code.

Needs https://github.com/jupyter/jupyter_client/pull/1023 Requires jupyter_client>=8.6.2

Code changes

github-actions[bot] commented 4 months ago

Binder :point_left: Launch a Binder on branch jupyterlab/jupyter-collaboration/fix/server-execution-jp-server

fcollonval commented 4 months ago

@davidbrochart @krassowski for the roomID, finally I went to store it directly in the shared document state. Sharing it there allows to retrieve it quickly and avoid the need to reconstruct it from its part (easier maintenance).

fcollonval commented 4 months ago

@davidbrochart could you confirm that the latest version will work with jupyverse?

davidbrochart commented 4 months ago

I will. BTW I see that you created a branch directly on https://github.com/jupyterlab/jupyter-collaboration. Would you mind creating a PR from your fork in the future?

davidbrochart commented 4 months ago

@davidbrochart @krassowski for the roomID, finally I went to store it directly in the shared document state. Sharing it there allows to retrieve it quickly and avoid the need to reconstruct it from its part (easier maintenance).

Then I think we should acknowledge this is now part of the document state, and merge https://github.com/jupyter-server/jupyter_ydoc/pull/198. I could rebase https://github.com/jupyterlab/jupyter-collaboration/pull/217 and merge it too.

davidbrochart commented 4 months ago

I'm realizing that what you set in the shared document's state is the room_id, which makes more sense than the file_id. What about renaming it to something more generic, e.g. document_id?

krassowski commented 4 months ago

document_id sounds slightly better to me.

Edit: but I am fine with going forward with file_id. I think the difference is negligible.

davidbrochart commented 4 months ago

The issue with file_id is that:

I'd like to not use room_id because it is a reference to pycrdt-websocket, which jupyter-ydoc is independent of. So I'd prefer document_id or just id.

fcollonval commented 4 months ago

Thank you both @davidbrochart and @krassowski for testing and reviewing.

When thinking at this with @echarles , we wanted to address primary https://github.com/jupyter-server/jupyter_server/issues/900 while keeping the kernel and the document model as decouple as possible.

The concern we have with adding the input or the pending requests within the document model is that it creates an issue with synchronizing the kernel state and the document model. In particular for the input, it seems to us that this will be very complex to resolve if the kernel is connected to a notebook and a console (or to multiple notebooks). For example, if the input snippet is sent from the console. Should the pending input kernel state be reflected in the notebook?

I'm wondering if a better path preserving as much as possible the decoupling by requesting the kernel state could be possible.

davidbrochart commented 4 months ago

I agree that adding requests to the shared document is the beginning of the kernel protocol leaking into CRDTs, which is exactly what we want to avoid. For me an input request should just be added as a new entry in the cell outputs. Its output_type should (ironically) be input, and a frontend observing outputs should show an input box if it sees an output of type input. Note that we will probably need to rework the outputs structure, like I started in https://github.com/jupyter-server/jupyter_ydoc/pull/201.

EDIT: let's call the output_type stdin instead of input, which would be confusing.

krassowski commented 4 months ago

A few times I encountered an issue when I executed a cell but the previous version of code was executed. This is because jupyter-server-nbmodel uses snippet = str(ycell["source"]) which can have a different state than the frontend that triggered the request (e.g. delayed by half a second and missing one character, e.g. leading to syntax error).

krassowski commented 4 months ago

Debugging this a little bit I see that the following lines in jupyter-server-nbmodel do not propagate to the frontend for me:

        with ycell.doc.transaction():
            del ycell["outputs"][:]
            ycell["execution_count"] = None

I see that ycell is defined, I am not sure why this does not get pushed to the frontend.

krassowski commented 4 months ago

Reloading the page is also stuck until the computation completes.

Calling input() also blocks everything for me, including execution in other notebooks. Is there maybe a requirement on specific version of jupyter-server or Python or something that is not defined in dependencies of jupyter_server_nbmodel?

Ah, I missed https://github.com/jupyter/jupyter_client/pull/1023 - sorry!

echarles commented 4 months ago

Calling input() also blocks everything for me, including execution in other notebooks. Is there maybe a requirement on specific version of jupyter-server or Python or something that is not defined in dependencies of jupyter_server_nbmodel?

@krassowski Thx for trying. Do you still block with https://github.com/jupyter/jupyter_client/pull/1023?

davidbrochart commented 4 months ago

I see that ycell is defined, I am not sure why this does not get pushed to the frontend.

I think you need https://github.com/jupyter-server/jupyter_ydoc/pull/201.

BTW @fcollonval @krassowski I opened https://github.com/jupyter-server/jupyter_ydoc/pull/233 to give an idea of what a truly collaborative and recoverable input widget would use.

krassowski commented 4 months ago

Do you still block with https://github.com/jupyter/jupyter_client/pull/1023?

No, it is all working great after picking up https://github.com/jupyter/jupyter_client/pull/1023. The bugs I now see are:

fcollonval commented 4 months ago

the latest pushed commit a207d33 reduced the changes to a minimum as a dedicated cell executor is going to be part of https://github.com/datalayer/jupyter-server-nbmodel/pull/14