jupyterlite / jupyterlite

Wasm powered Jupyter running in the browser 💡
https://jupyterlite.rtfd.io/en/stable/try/lab
BSD 3-Clause "New" or "Revised" License

Pressing "Restart and run all cells" randomly causes silent pyodide kernel crash #1464

Open ogrisel opened 3 weeks ago

ogrisel commented 3 weeks ago

Description

Pressing "Restart and run all" randomly causes silent pyodide kernel crashes as seen on the following screen recording:

jupyterlite_pyodide_crash.webm

Reproduce

  1. Go to https://jupyter.org/try-jupyter/lab/ in Chrome or Firefox
  2. Create a new notebook with a single cell that imports a library bundled with Pyodide, such as six
  3. Execute the cell; in general, there is no problem at this point
  4. Press the "Restart and run all cells" button a few times
  5. At some point, the Pyodide kernel crashes
  6. The browser dev console displays a message such as:
Trying to send message on removed socket for kernel a7355f7a-283a-4cdb-a37c-ee379fb1bfc7

Expected behavior

Context

Browser Output
Kernel: restarting (a7355f7a-283a-4cdb-a37c-ee379fb1bfc7) [default.js:1370:24](webpack://_JUPYTERLAB.CORE_OUTPUT/node_modules/@jupyterlab/services/lib/kernel/default.js)
Pyodide contents will be synced with Jupyter Contents [index.js:60:28](webpack://jupyterlite/pyodide-kernel-extension/lib/index.js)
Connection lost, reconnecting in 0 seconds. [default.js:1325:20](webpack://_JUPYTERLAB.CORE_OUTPUT/node_modules/@jupyterlab/services/lib/kernel/default.js)
Starting WebSocket: wss://jupyter.org/try-jupyter/api/kernels/a7355f7a-283a-4cdb-a37c-ee379fb1bfc7 [default.js:69:20](webpack://_JUPYTERLAB.CORE_OUTPUT/node_modules/@jupyterlab/services/lib/kernel/default.js)
TypeError: this is undefined
    isReady notebooklspadapter.js:187
    y utils.js:22
    y utils.js:20
    onKernelChanged notebooklspadapter.js:149
    c index.es6.js:555
    emit index.es6.js:513
    emit index.es6.js:112
    restartKernel sessioncontext.js:366
    restart sessioncontext.js:882
    execute index.js:1801
    execute index.es6.js:365
    onClick toolbar.js:1043
    o toolbar.js:667
    React 11
[notebooklspadapter.js:163:20](webpack://_JUPYTERLAB.CORE_OUTPUT/node_modules/@jupyterlab/notebook/lib/notebooklspadapter.js)
Loading micropip, packaging [pyodide.asm.js:10:93500](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loaded micropip, packaging [pyodide.asm.js:10:93796](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
History was unable to be retrieved [history.js:260:20](webpack://_JUPYTERLAB.CORE_OUTPUT/node_modules/@jupyterlab/notebook/lib/history.js)
Loading openssl, ssl [pyodide.asm.js:10:93500](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loaded openssl, ssl [pyodide.asm.js:10:93796](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loading sqlite3 [pyodide.asm.js:10:93500](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loaded sqlite3 [pyodide.asm.js:10:93796](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loading traitlets [pyodide.asm.js:10:93500](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loaded traitlets [pyodide.asm.js:10:93796](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
traitlets already loaded from default channel [pyodide.asm.js:10:93166](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
sqlite3 already loaded from default channel [pyodide.asm.js:10:93166](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loading Pygments, asttokens, decorator, executing, ipython, matplotlib-inline, prompt_toolkit, pure_eval, six, stack_data, wcwidth [pyodide.asm.js:10:93500](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Loaded Pygments, asttokens, decorator, executing, ipython, matplotlib-inline, prompt_toolkit, pure_eval, six, stack_data, wcwidth [pyodide.asm.js:10:93796](https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.asm.js)
Failed to fetch ipywidgets through the "jupyter.widget.control" comm channel, fallback to fetching individual model state. Reason: Control comm did not respond in time [327.68dbf8491690b3aff1e7.js:1:5783](https://jupyter.org/try-jupyter/extensions/@jupyter-widgets/jupyterlab-manager/static/327.68dbf8491690b3aff1e7.js?v=68dbf8491690b3aff1e7)
Trying to send message on removed socket for kernel a7355f7a-283a-4cdb-a37c-ee379fb1bfc7 3 [kernels.js:108:24](webpack://_JUPYTERLAB.CORE_OUTPUT/packages/kernel/lib/kernels.js)
ogrisel commented 3 weeks ago

I think I also triggered the same problem by chaining a manual restart (e.g. using the 0, 0 keyboard shortcut) followed by a cell execution with the same import statements (import sklearn, import six, or any other package bundled with Pyodide), for instance via the Shift+Enter keyboard shortcut.

It seems that this is caused by a race condition: if you wait long enough after a manual restart and the first cell execution, then there is no problem (I think).
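To make the suspected race concrete, here is a toy asyncio model of it. This is only a sketch under stated assumptions: `ToyKernel`, its `restart`/`send` methods, the `ready` event, and the 50 ms startup delay are all hypothetical stand-ins and are not JupyterLite, JupyterLab, or Pyodide code. It only shows how sending an execute request before the restarted kernel is ready could end up on a removed socket, and how waiting for readiness would avoid that in this model.

```python
import asyncio


class ToyKernel:
    """Hypothetical stand-in for an in-browser kernel (not a real API)."""

    def __init__(self, startup_delay: float = 0.05) -> None:
        self.startup_delay = startup_delay  # assumed boot time, in seconds
        self.socket_open = True
        self.ready = asyncio.Event()
        self.ready.set()

    def restart(self) -> "asyncio.Task[None]":
        # Tear-down is immediate; start-up finishes later in the background.
        self.socket_open = False
        self.ready.clear()
        return asyncio.create_task(self._boot())

    async def _boot(self) -> None:
        # Simulates slow start-up work such as loading packages.
        await asyncio.sleep(self.startup_delay)
        self.socket_open = True
        self.ready.set()

    def send(self, code: str) -> str:
        if not self.socket_open:
            return "dropped: socket removed"
        return f"executed: {code}"


async def restart_and_run_racy(kernel: ToyKernel, code: str) -> str:
    boot = kernel.restart()
    result = kernel.send(code)  # sent before the kernel is back up
    await boot
    return result


async def restart_and_run_guarded(kernel: ToyKernel, code: str) -> str:
    boot = kernel.restart()
    await kernel.ready.wait()  # wait until the new kernel is ready
    result = kernel.send(code)
    await boot
    return result


def demo() -> tuple:
    async def main() -> tuple:
        racy = await restart_and_run_racy(ToyKernel(), "import six")
        guarded = await restart_and_run_guarded(ToyKernel(), "import six")
        return racy, guarded

    return asyncio.run(main())


if __name__ == "__main__":
    print(demo())
    # prints ('dropped: socket removed', 'executed: import six')
```

In this model the failure disappears as soon as the caller waits for readiness, which matches the observation that waiting long enough avoids the problem. Whether the real kernel exposes a comparable readiness signal at the right point is not established here.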

jtpio commented 2 days ago

Thanks @ogrisel for opening the issue :+1:

> It seems that this is caused by a race condition: if you wait long enough after a manual restart and the first cell execution, then there is no problem (I think).

Right, it looks like a race condition that would need to be fixed at some point.