jupyter / jupyter_client

Jupyter protocol client APIs
https://jupyter-client.readthedocs.io
BSD 3-Clause "New" or "Revised" License

An exception occurred when the kernel restarted, causing it to keep restarting #734

Open icankeep opened 2 years ago

icankeep commented 2 years ago

kernel is restarting

[E 2022-01-10 17:45:38.522 MtSingleUserLabApp handlers:82] Exception restarting kernel
    Traceback (most recent call last):
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_server/services/kernels/handlers.py", line 80, in post
        await km.restart_kernel(kernel_id)
      File "/conda/envs/notebook/lib/python3.6/site-packages/mtjupyter_singleuser/kernelmanager.py", line 92, in restart_kernel
        await super(AsyncMappingKernelManager, self).restart_kernel(kernel_id, now)
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_server/services/kernels/kernelmanager.py", line 384, in restart_kernel
        await ensure_async(self.pinned_superclass.restart_kernel(self, kernel_id, now=now))
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_server/utils.py", line 189, in ensure_async
        result = await obj
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_client/manager.py", line 478, in _async_restart_kernel
        await ensure_async(self.shutdown_kernel(now=now, restart=True))
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_client/utils.py", line 34, in ensure_async
        return await obj
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_client/manager.py", line 434, in _async_shutdown_kernel
        await ensure_async(self.interrupt_kernel())
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_client/utils.py", line 34, in ensure_async
        return await obj
      File "/conda/envs/notebook/lib/python3.6/site-packages/jupyter_client/manager.py", line 542, in _async_interrupt_kernel
        raise RuntimeError("Cannot interrupt kernel. No kernel is running!")
    RuntimeError: Cannot interrupt kernel. No kernel is running!
icankeep commented 2 years ago

I don't know how to reproduce this situation.

I think if the kernel is not found, it should not throw an exception but just print the log; otherwise it will affect the startup process.

kevin-bates commented 2 years ago

> I don't know how to reproduce this situation.

Might this be the result of a subsequent restart request, while the first was in the process of completing?

> I think if the kernel is not found, it should not throw an exception but just print the log; otherwise it will affect the startup process.

This seems reasonable, although I'd be inclined to check self.has_kernel in the shutdown method, before it runs its sequence of interrupt, request-shutdown, etc. When shutdown_kernel finds has_kernel to be false, we could simply log a warning and return, since there really isn't much else we can do (although I haven't looked closely at that last part).
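A minimal sketch of that guard, for illustration only. SketchKernelManager is a hypothetical stand-in, not jupyter_client's KernelManager; only the names has_kernel, interrupt_kernel, and shutdown_kernel and the error message come from this thread, and the boolean return value is an assumption added to make the behavior easy to check.

```python
import logging

log = logging.getLogger("kernel_manager_sketch")


class SketchKernelManager:
    """Hypothetical stand-in for a kernel manager (not jupyter_client's class)."""

    def __init__(self):
        self.kernel = None  # no kernel process has been started yet
        self.calls = []     # records which shutdown steps actually ran

    @property
    def has_kernel(self):
        return self.kernel is not None

    def interrupt_kernel(self):
        # Mirrors the error seen in the traceback above.
        if not self.has_kernel:
            raise RuntimeError("Cannot interrupt kernel. No kernel is running!")
        self.calls.append("interrupt")

    def shutdown_kernel(self, now=False, restart=False):
        # Proposed guard: with no kernel, log a warning and return
        # instead of letting interrupt_kernel raise.
        if not self.has_kernel:
            log.warning("shutdown_kernel called but no kernel is running; skipping")
            return False
        self.interrupt_kernel()
        self.calls.append("request_shutdown")
        self.kernel = None
        return True


km = SketchKernelManager()
km.shutdown_kernel(restart=True)  # logs a warning, does not raise
```

With the guard in place, a restart that reaches shutdown_kernel before any kernel exists falls through to the start step instead of failing with the RuntimeError.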

@Zsailer is also applying "pending-state" support to the shutdown operation so he may have some insights here as well.

Zsailer commented 2 years ago

I've seen this error with slow-starting kernels. If you try to restart a kernel that takes a long time to start (as is often the case with remote kernels), you'll get this error.

I agree, we can likely log this issue without raising an exception. In a "pending kernels" world, as proposed by #732, you won't run into this issue: restarting a slow-starting kernel will be blocked.
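To illustrate the idea of blocking a restart while startup is still pending, here is a rough sketch using asyncio (Python 3.7+). PendingKernelSketch, its ready future, and startup_delay are all hypothetical names, and this is not the implementation from #732. Whether a blocked restart should be rejected or should wait is not stated in this thread; the sketch rejects it.

```python
import asyncio


class PendingKernelSketch:
    """Hypothetical model of a kernel whose startup is tracked as pending."""

    def __init__(self, startup_delay=0.05):
        self.startup_delay = startup_delay
        self.ready = None  # asyncio.Future created when start() begins
        self.starts = 0    # number of completed starts

    async def start(self):
        self.ready = asyncio.get_running_loop().create_future()
        await asyncio.sleep(self.startup_delay)  # simulate a slow-starting kernel
        self.starts += 1
        self.ready.set_result(None)

    async def restart(self):
        # Block the restart while a start is still in flight.
        if self.ready is not None and not self.ready.done():
            raise RuntimeError("Kernel is still starting; restart is blocked")
        await self.start()


async def demo():
    kernel = PendingKernelSketch()
    task = asyncio.create_task(kernel.start())
    await asyncio.sleep(0)  # yield so start() begins and creates the future
    try:
        await kernel.restart()
        blocked = False
    except RuntimeError:
        blocked = True
    await task              # startup finishes
    await kernel.restart()  # allowed now that the kernel is ready
    return blocked, kernel.starts
```

Running asyncio.run(demo()) shows the first restart being refused while the start is pending and the second succeeding once the kernel is ready.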