mwouts / jupytext

Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
https://jupytext.readthedocs.io
MIT License
6.65k stars 386 forks source link

v1.16.3 breaks *.py notebooks #1260

Closed st-bender closed 4 months ago

st-bender commented 4 months ago

Hi,

It looks like v1.16.3 has trouble opening text (.py) notebooks, failing with "Unreadable notebook". Downgrading to v1.16.2 solves it for me.

Edit: More details, all packages installed via conda/mamba, and opening in jupyterlab fails. Looks like the content manager ist not activated.

Some console output that might be relevant:

[W 2024-07-11 12:21:54.327 ServerApp] 400 GET /api/contents/notebooks/notebook.py?type=notebook&content=1&hash=1&1720693314262 (127.0.0.1): Unreadable Notebook: /home/user/notebooks/notebook.py NotJSONError("Notebook does not appear to be JSON: '# ---\\n# jupyter:\\n#   jupytext:\\n#    ...")                                                                    
[W 2024-07-11 12:21:54.327 ServerApp] wrote error: 'Unreadable Notebook: /home/user/notebooks/notebook.py NotJSONError("Notebook does not appear to be JSON: \'# ---\\\\n# jupyter:\\\\n#   jupytext:\\\\n#    ...")'
    Traceback (most recent call last):                                                                                          
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_server/services/contents/handlers.py", line 155, in get                                                                                                           
        self.contents_manager.get(                                                                                              
    TypeError: build_jupytext_contents_manager_class.<locals>.JupytextContentsManager.get() got an unexpected keyword argument 'require_hash'

    During handling of the above exception, another exception occurred:                                                        

    Traceback (most recent call last):
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/tornado/web.py", line 1790, in _execute
        result = await result
                 ^^^^^^^^^^^^
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_server/auth/decorator.py", line 73, in inner
        return await out
               ^^^^^^^^^
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_server/services/contents/handlers.py", line 167, in get
        model = await ensure_async(
                ^^^^^^^^^^^^^^^^^^^
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_core/utils/__init__.py", line 198, in ensure_async
        result = await obj
                 ^^^^^^^^^
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_server/services/contents/filemanager.py", line 922, in get
        model = await self._notebook_model(path, content=content, require_hash=require_hash)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_server/services/contents/filemanager.py", line 868, in _notebook_model
        nb, bytes_content = await self._read_notebook(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/miniconda3/envs/20211123_py39/lib/python3.11/site-packages/jupyter_server/services/contents/fileio.py", line 471, in _read_notebook
        raise HTTPError(
    tornado.web.HTTPError: HTTP 400: Bad Request (Unreadable Notebook: /home/user/notebooks/notebook.py NotJSONError("Notebook does not appear to be JSON: '# ---\\n# jupyter:\\n#   jupytext:\\n#    ..."))
mwouts commented 4 months ago

Hi @st-bender, thank you for reporting this!

In jupytext==1.16.3 we changed how the async contents managers are detected, and possibly this does not work in your environment. Could you show me how the Jupyter starting logs look like? Especially these lines?

[W 2024-07-11 20:52:54.677 ServerApp] [Jupytext Server Extension] Async contents managers like AsyncContentsManager are not supported at the moment (https://github.com/mwouts/jupytext/issues/1020). We will derive a contents manager from LargeFileManager instead.
[I 2024-07-11 20:52:54.677 ServerApp] [Jupytext Server Extension] Deriving a JupytextContentsManager from LargeFileManager

Also could I see what Jupyter package you have in your env? Mine are

$ conda list | grep jupyter
hatch-jupyter-builder     0.9.1                    pypi_0    pypi
jupyter-lsp               2.2.0           py311h06a4308_0  
jupyter_client            8.6.0           py311h06a4308_0  
jupyter_core              5.5.0           py311h06a4308_0  
jupyter_events            0.8.0           py311h06a4308_0  
jupyter_server            2.10.0          py311h06a4308_0  
jupyter_server_terminals  0.4.4           py311h06a4308_1  
jupyterlab                4.0.8           py311h06a4308_0  
jupyterlab_pygments       0.1.2                      py_0  
jupyterlab_server         2.25.1          py311h06a4308_0 

(If you don't mind, can you keep this env untouched so that we can test an update in your env?)

A last thing that would help, would be to print the value of parent_classes here: https://github.com/mwouts/jupytext/blob/c757eac964b09ee12acf3f8da4823d07217f134b/jupyterlab/jupyterlab_jupytext/__init__.py#L29

Thanks!

mwouts commented 4 months ago

@mahendrapaipuri , thinking of this, I am wondering if instead of checking the parent class, we should test whether the .get method of the current contents manager is async, like here: image

Let me know what you think of #1261 !

mwouts commented 4 months ago

@st-bender would you mind giving a try at the associated PR? It's either

HATCH_BUILD_HOOKS_ENABLE=true pip install git+https://github.com/mwouts/jupytext.git@fix_async_contents_manager_detection

or, if you don't have nodejs

pip install git+https://github.com/mwouts/jupytext.git@fix_async_contents_manager_detection

(but the second one won't give you the Jupytext Menu)

mahendrapaipuri commented 4 months ago

@mahendrapaipuri , thinking of this, I am wondering if instead of checking the parent class, we should test whether the .get method of the current contents manager is async, like here: image

Let me know what you think of #1261 !

Yes, your suggestion is more robust and simple. I think it we should figure out the cause of this error. Hopefully @st-bender can post more details about his environment and logs.

st-bender commented 4 months ago

Hi @st-bender, thank you for reporting this!

In jupytext==1.16.3 we changed how the async contents managers are detected, and possibly this does not work in your environment. Could you show me how the Jupyter starting logs look like? Especially these lines?

[W 2024-07-11 20:52:54.677 ServerApp] [Jupytext Server Extension] Async contents managers like AsyncContentsManager are not supported at the moment (https://github.com/mwouts/jupytext/issues/1020). We will derive a contents manager from LargeFileManager instead.
[I 2024-07-11 20:52:54.677 ServerApp] [Jupytext Server Extension] Deriving a JupytextContentsManager from LargeFileManager

Looks like those lines are absent with v1.16.3, they are there with v1.16.2.

Also could I see what Jupyter package you have in your env? Mine are

$ conda list | grep jupyter
hatch-jupyter-builder     0.9.1                    pypi_0    pypi
jupyter-lsp               2.2.0           py311h06a4308_0  
jupyter_client            8.6.0           py311h06a4308_0  
jupyter_core              5.5.0           py311h06a4308_0  
jupyter_events            0.8.0           py311h06a4308_0  
jupyter_server            2.10.0          py311h06a4308_0  
jupyter_server_terminals  0.4.4           py311h06a4308_1  
jupyterlab                4.0.8           py311h06a4308_0  
jupyterlab_pygments       0.1.2                      py_0  
jupyterlab_server         2.25.1          py311h06a4308_0 

Sure:

jupyter-lsp               2.2.5              pyhd8ed1ab_0    conda-forge
jupyter-server-mathjax    0.2.6              pyh5bfe37b_1    conda-forge
jupyter-server-proxy      4.3.0              pyhd8ed1ab_0    conda-forge
jupyter_client            8.6.2              pyhd8ed1ab_0    conda-forge
jupyter_console           6.6.3              pyhd8ed1ab_0    conda-forge
jupyter_contrib_core      0.4.0              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2           py311h38be061_0    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.1             pyhd8ed1ab_1    conda-forge
jupyter_server_fileid     0.9.2              pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyter_server_ydoc       0.8.0              pyhd8ed1ab_0    conda-forge
jupyter_ydoc              0.3.4              pyhd8ed1ab_0    conda-forge
jupyterlab                4.2.3              pyhd8ed1ab_0    conda-forge
jupyterlab-git            0.50.1             pyhd8ed1ab_1    conda-forge
jupyterlab_code_formatter 2.2.1              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_server         2.27.2             pyhd8ed1ab_0    conda-forge
jupyterlab_vim            4.1.3              pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        3.0.11             pyhd8ed1ab_0    conda-forge

A last thing that would help, would be to print the value of parent_classes here:

https://github.com/mwouts/jupytext/blob/c757eac964b09ee12acf3f8da4823d07217f134b/jupyterlab/jupyterlab_jupytext/__init__.py#L29

That I can try early next week, hope the environment is good enough for your testing for now.

Cheers.

st-bender commented 4 months ago

@st-bender would you mind giving a try at the associated PR? It's either

HATCH_BUILD_HOOKS_ENABLE=true pip install git+https://github.com/mwouts/jupytext.git@fix_async_contents_manager_detection

or, if you don't have nodejs

pip install git+https://github.com/mwouts/jupytext.git@fix_async_contents_manager_detection

(but the second one won't give you the Jupytext Menu)

Thanks for the options, I'll get around trying them out early next week. In the meantime I hope you can replicate the issue with the environment I posted. I noticed that some packages I have are on a newer version.

Cheers.

mahendrapaipuri commented 4 months ago

Cheers @st-bender for the details.

@mwouts I think the issue comes from the outdated JupytextContentsManager. Seems like in jupyter_server==2.11, a new keyword argument require_hash has been added to get method of ContentsManager. Current JupytextContentsManager.get do not take this argument and hence, the error TypeError: build_jupytext_contents_manager_class.<locals>.JupytextContentsManager.get() got an unexpected keyword argument 'require_hash'.

I think we need to update the method signatures and add tests to test the overloaded methods from base ContentsManager.

st-bender commented 4 months ago

Hi @mahendrapaipuri

Thanks for digging. This sounds indeed like the cause. In that case could you simply use **kwargs and pass that on to the calls to self.super.get(path, content, type, format)? Could make it more future-proof.

st-bender commented 4 months ago

Hi @mwouts @mahendrapaipuri

I can confirm that #1261 fixes it for me.

Hope this helps, cheers.

mwouts commented 4 months ago

Hi @st-bender , @mahendrapaipuri , thank you both for your feedback!

Since the PR appears to solve that problem, I've merged it and released jupytext==1.16.4.

Re the require_hash argument, that was a problem (for async contents manager) only with jupyter_server==2.11.0. In jupyter_server==2.11.1 the developers were kind enough to detect whether the argument is supported (but possibly they did that only for sync contents managers). Still I agree it would be great to support it. There is an open issue (https://github.com/mwouts/jupytext/issues/1165), where if I remember correctly I need to figure out the proper way to combine two or more hashes.

Possibly we see the error with jupytext==1.16.3 in @st-bender 's environment as we are providing a (faulty) sync contents manager.

Also the above did not confirm it, but possibly the detection of the sync contents manager failed here because the parent class was something else than jupyter_server.services.contents.manager.AsyncContentsManager. I remember that at some point the contents manager were coming from another package like jupyter_notebook (but I agree I didn't see that one in @st-bender 's env)

mwouts commented 4 months ago

Hi @mahendrapaipuri , I see that the publish action failed. It's the second time I see this, so I was thinking of going back to a simpler publish action (i.e. just publish, don't run any tests), would that be fine with you?

mahendrapaipuri commented 4 months ago

Also the above did not confirm it, but possibly the detection of the sync contents manager failed here because the parent class was something else than jupyter_server.services.contents.manager.AsyncContentsManager. I remember that at some point the contents manager were coming from another package like jupyter_notebook (but I agree I didn't see that one in @st-bender 's env)

Yeah, I think that was some sort of edge case that trigged the error. Anyways #1261 is more reliable way to detect the contents manager and I think we should be ok with that.

Hi @mahendrapaipuri , I see that the publish action failed. It's the second time I see this, so I was thinking of going back to a simpler publish action (i.e. just publish, don't run any tests), would that be fine with you?

Yes, we can do that. Anyways, we need to add retries config for playwright tests so that failed tests will be retried in the CI. Most of time they are just transient failures.

st-bender commented 4 months ago

Hi @mwouts

Thanks a lot, much appreciated.

OT: Wrt the hash stuff, there are some methods how to combine hashes "properly". If you find the time, maybe check the responses in https://stackoverflow.com/questions/5889238/why-is-xor-the-default-way-to-combine-hashes

Cheers.