denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License

(jupyter) Deno crashes jupyter kernel when pressing stop button in Notebook #24491

Open zph opened 1 month ago

zph commented 1 month ago

Version: Deno 1.43.5 (but I can check later this week for other versions)

Reproduction Steps:

  1. Create a Jupyter notebook and open it in VS Code or the web interface
  2. Select the Deno kernel
  3. Run a cell that sets a variable's value: var server = "foo"
  4. Start a long-running command in a code cell (e.g. an await sleep(10000) type of function)
  5. Press the "Stop" button
  6. The Deno kernel crashes and loses its internal state, including var server
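
The cells from steps 3 and 4 might look roughly like the sketch below. The `sleep` helper and the `longRunningCell` wrapper are hypothetical names (the report only mentions "a await sleep(10000) type fn"). In a notebook the body of `longRunningCell` would be typed directly into a cell, and it is wrapped in a function here only so the snippet also runs outside a notebook.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Cell 1 (step 3): state that should survive an interrupted cell.
var server = "foo";

// Cell 2 (step 4): a long-running cell. In the notebook this would be
// just `await sleep(10000)`; press "Stop" while it is pending.
async function longRunningCell(ms: number = 10_000): Promise<string> {
  await sleep(ms);
  // Cell 3 would read `server` again; with the Python kernel the
  // equivalent variable is still defined after pressing "Stop".
  return server;
}
```

After step 5, evaluating `server` in a new cell is the check: per the report, the Deno kernel has crashed at that point and the value is gone.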

Behavior in python jupyter kernel:

  1. Steps 1 through 5 are the same
  2. Step 6 will stop the cell without killing the kernel
  3. server variable's value will still remain available

I'm treating python as the reference behavior here and expecting that deno will generally treat that as the correct behavior.

Summary

Deno behaves differently, and undesirably, when executing a Jupyter notebook with the Deno kernel: it doesn't tolerate and recover from a "stop" execution command without a kernel restart.

Severity: mild/moderate usability issue. Deno Jupyter notebooks are usable but fragile when a cell execution needs to be stopped; in that case it requires replaying the notebook up to that same point to rebuild the internal state. I still prefer Deno-based notebooks to Python for the improvements in typing and dependency management, so I'm interested in helping make the ecosystem better over time.

I'm not sure where to start debugging this, so I'm starting by reporting it and if I can find out why the behavior happens I'll put up a PR.

Thanks!

bartlomieju commented 1 month ago

Honestly, I'm not sure how we can solve this. We can definitely terminate JS execution gracefully without causing an error, but then I see no way we could resume execution with the state intact. (CC @rgbkrk do you happen to know the details of how this works in Python?)

That said, a similar situation occurs when you "interrupt the kernel": Deno stops the server, instead of only interrupting the running REPL session while maintaining the ZeroMQ connection to the kernel. The latter seems to be an easier fix.
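
To illustrate the distinction between interrupting only the running work and tearing everything down, here is a minimal user-level sketch. It is not how the Deno kernel handles "Stop" or "interrupt the kernel", and nothing in this thread says the kernel exposes an `AbortSignal` to cells. The names `abortableSleep` and `runCell` are hypothetical. The sketch only shows that a pending operation can be cancelled cooperatively while surrounding state (here `server`) stays intact.

```typescript
// State that should survive the interruption.
const server = "foo";

// Hypothetical helper: a sleep that rejects as soon as `signal` is aborted.
function abortableSleep(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("interrupted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer); // stop the pending timer so nothing lingers
      reject(new Error("interrupted"));
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Hypothetical stand-in for "run one cell": reports how the cell ended
// instead of letting the interruption propagate and kill the process.
async function runCell(
  ms: number,
  signal: AbortSignal,
): Promise<"completed" | "interrupted"> {
  try {
    await abortableSleep(ms, signal);
    return "completed";
  } catch {
    return "interrupted";
  }
}
```

Calling `controller.abort()` on the `AbortController` that owns the signal plays the role of the "Stop" button in this sketch; `server` remains readable afterwards because only the pending promise was cancelled.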

zph commented 1 month ago

@bartlomieju Thank you for the quick reply :). I thought abstractly about how to do this last night and only came up with complicated approaches that would involve holding a cloned copy of the REPL state from before each cell is executed... but it sounds messy.

eg

As for feasibility, I haven't looked at whether it is viable to copy internal REPL state around.
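
As a rough illustration of the "clone the state before each cell" idea, here is a toy model. It is not Deno's REPL internals and not a proposed patch: the `ReplState` type and `runCellWithSnapshot` function are hypothetical, and the state is assumed to be a plain object of cloneable values.

```typescript
// Toy model: REPL state as a plain object of cloneable values.
type ReplState = Record<string, unknown>;

// Hypothetical runner: snapshot the state, run the cell, and fall back
// to the snapshot if the cell is interrupted (modelled here as a throw).
function runCellWithSnapshot(
  state: ReplState,
  cell: (state: ReplState) => void,
): ReplState {
  const snapshot = structuredClone(state); // deep copy taken before the cell runs
  try {
    cell(state);
    return state; // cell finished: keep its changes
  } catch {
    return snapshot; // cell was interrupted: discard partial changes
  }
}

// Usage: an interrupted cell leaves the earlier value of `server` intact.
let state: ReplState = { server: "foo" };
state = runCellWithSnapshot(state, (s) => {
  s.server = "half-updated";
  throw new Error("interrupted");
});
console.log(state.server);
```

This also hints at why the approach sounds messy: `structuredClone` cannot copy functions, and real REPL state typically includes things like open sockets or timers that have no meaningful copy.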