jupyterlab / jupyter-collaboration

A Jupyter Server Extension Providing Support for Y Documents
https://jupyterlab-realtime-collaboration.readthedocs.io/en/latest/
Other
157 stars 30 forks source link

No tests for notebooks, fixture does not work when `store = True` #274

Closed krassowski closed 2 months ago

krassowski commented 6 months ago

Description

Heads up to anyone trying to improve test coverage:

I tried to write tests for #270 using rtc_create_notebook and rtc_create_file fixtures. While I was successful with rtc_create_file, tests with rtc_create_notebook appear to stall if using store=True.

I also noticed that this fixture is not used, not even once in the tests. It looks like it needs some attention.

https://github.com/jupyterlab/jupyter-collaboration/blob/7898b2548b3256238169e90affa05a68cbe43c2d/tests/conftest.py#L78-L110

It stalls on line 106:

await rtc_add_doc_to_store("json", "notebook", path)
krassowski commented 6 months ago

It looks like the observe event/signal from YNotebook does not fire the callback:

https://github.com/jupyterlab/jupyter-collaboration/blob/7898b2548b3256238169e90affa05a68cbe43c2d/tests/conftest.py#L152-L168

Removing target check does not help - there is no callbacks at all.

YNotebook.observe is implemented in jupyter-ydoc, here:

def observe(self, callback: Callable[[str, Any], None]) -> None:
    """
    Subscribes to document changes.

    :param callback: Callback that will be called when the document changes.
    :type callback: Callable[[str, Any], None]
    """
    self.unobserve()
    self._subscriptions[self._ystate] = self._ystate.observe(partial(callback, "state"))
    self._subscriptions[self._ymeta] = self._ymeta.observe_deep(partial(callback, "meta"))
    self._subscriptions[self._ycells] = self._ycells.observe_deep(partial(callback, "cells"))

I am not familiar with the lower-level codebase enough to debug this further easily. Any thoughts?

davidbrochart commented 6 months ago

I didn't write these fixtures, so I'm also not very familiar with them. Regarding YNotebook.observe, there has been changes recently in the way yrs handles them, and I updated pycrdt accordingly. Maybe worth trying again? I'll try and check on a simple example and report back.

davidbrochart commented 6 months ago

This shows that observing notebook changes works as expected:

from jupyter_ydoc import YNotebook

ynb = YNotebook()

def callback(target, events):
    print(f"{target=}")
    for event in events:
        print(f"{str(event)=}")

ynb.observe(callback)
ynb.set(
    {
        "cells": [],
    }
)

# target='cells'
# str(event)='{target: [{"execution_count":null,"metadata":{"trusted":true},"source":"","id":"c6506805-eec8-496d-9ec1-28d7239bf0b8","cell_type":"code","outputs":[]}], delta: [{\'insert\': [<pycrdt._map.Map object at 0x7f9a0c5d84a0>]}], path: []}'
# target='meta'
# str(event)='{target: {"metadata":{"language_info":{"name":""},"kernelspec":{"name":"","display_name":""}},"nbformat":4.0,"nbformat_minor":5.0}, keys: {\'metadata\': {\'action\': \'add\', \'newValue\': <pycrdt._map.Map object at 0x7f9a0c5d92b0>}, \'nbformat_minor\': {\'action\': \'add\', \'newValue\': 5.0}, \'nbformat\': {\'action\': \'add\', \'newValue\': 4.0}}, path: []}'
krassowski commented 6 months ago

I think the problem is not that observing does not work when a change is made via set, but that the test fixture assumes that an event is emitted to the observer after establishing an initial connection. I think there might be a difference in behaviour between files and notebooks here.

krassowski commented 2 months ago

It looks like the problem is possibly not with the fixture, but with the logic itself. I see that the content is loaded for the notebook:

WARNING  ServerApp:manager.py:741 Notebook test.ipynb is not trusted
INFO     ServerApp:rooms.py:158 Content in room json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd loaded from file test.ipynb
DEBUG    ServerApp:yutils.py:111 Received SYNC_STEP1 message from endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:yutils.py:120 Sending SYNC_STEP2 message to endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:yutils.py:111 Received SYNC_STEP2 message from endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:yutils.py:111 Received SYNC_UPDATE message from endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:fileio.py:267 Reading path from disk: test.ipynb
DEBUG    ServerApp:fileio.py:267 Reading path from disk: test.ipynb

In code it hits this path:

https://github.com/jupyterlab/jupyter-collaboration/blob/bcc36e1a451aaa5dbfbe020db72070149bd934bd/projects/jupyter-server-ydoc/jupyter_server_ydoc/rooms.py#L155-L165

I am a bit confused as to why encode_state_as_update is called and the update is not applied later.

I think this might be the error - the state is not set with .set() but by overriding self._document.source directly. This means that the observer callback is never fired.

krassowski commented 2 months ago

It looks like this logic traces back to https://github.com/jupyterlab/jupyter-collaboration/pull/2.

It looks like in ytstore implementation this method does apply the update:

https://github.com/jupyter-server/pycrdt-websocket/blob/20c2f388e27badc7eaf6369381be9bec6b0b5c28/pycrdt_websocket/ystore.py#L138-L145

Anyways, that looks like a wrong avenue to pursue. The bug is in the different between YNotebook and YUnicode...

krassowski commented 2 months ago

I think I got it. There is no .emit() because pycrdt ignores empty updates:

https://github.com/jupyter-server/pycrdt/blob/32a73975edac0a571d7b63512d8e2df88720f136/python/pycrdt/_sync.py#L107-L114

We can trigger the same issue with a text file by passing an empty string (equal to default content):

async def test_get_document_file2(rtc_create_file, jp_serverapp):
    path, content = await rtc_create_file("test.txt", "", store=True)  # will timeout
    collaboration = jp_serverapp.web_app.settings["jupyter_server_ydoc"]
    document = await collaboration.get_document(
        path=path, content_type="file", file_format="text"
    )
    assert document.get() == content == ""
    await collaboration.stop_extension()
krassowski commented 2 months ago

In addition, the target which fires for notebooks is not source but state, meta, cells.

krassowski commented 2 months ago

Changing the fixture to:

@pytest.fixture
def rtc_add_doc_to_store(rtc_connect_doc_client):
    event = Event()

    async def _inner(format: str, type: str, path: str) -> None:

        def _on_document_change(target: str, e: Any) -> None:
            expected_target = "cells" if type == "notebook" else "source"
            if target == expected_target:
                event.set()

        if type == "notebook":
            doc = YNotebook()
        else:
            doc = YUnicode()

        doc.observe(_on_document_change)

        async with await rtc_connect_doc_client(
            format, type, path
        ) as ws, WebsocketProvider(doc.ydoc, ws):
            await event.wait()
            await sleep(0.1)

    return _inner

Appears sufficient.