microsoft / vscode-jupyter

VS Code Jupyter extension
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
MIT License

jupyter + ssh-remote: persistent kernels #15723

Closed nick-youngblut closed 3 months ago

nick-youngblut commented 4 months ago

Using vscode-jupyter on a remote machine via ssh-remote has a problem: all running kernels are killed if the ssh connection is lost. This is unlike running Jupyter Lab (or Notebook) directly on the remote machine and connecting via port forwarding, where running kernels persist even if the ssh connection drops. Because of this "killed-kernels" issue, large-scale data analysis is not practical with VS Code + Jupyter: one must re-run their notebooks after losing the ssh connection, and one cannot have long-running code in their notebooks (e.g., running overnight).

Feature request: allow kernel jobs to persist even if the ssh connection is lost.

vishu-tyagi commented 3 months ago

I usually start a jupyter server in a tmux session for this

nick-youngblut commented 3 months ago

> I usually start a jupyter server in a tmux session for this

@vishu-tyagi does tmux allow for persistent kernels (see above), while screen does not? That would be surprising.

vishu-tyagi commented 3 months ago

> @vishu-tyagi does tmux allow for persistent kernels (see above), while screen does not? That would be surprising.

Sorry, I haven't used screen, but tmux works fine for me. I don't usually use notebooks where I have to keep the kernel alive for days, so I haven't tested that. But for the issue you described, tmux will keep the Jupyter kernel alive even if your ssh connection drops (whether or not you have running code). You can give it a try.
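For anyone who wants to try this, here is a minimal sketch of the idea in Python: it starts a Jupyter server inside a detached tmux session, so the server does not depend on the ssh session that launched it. The session name `jupyter`, port 8888, and the helper names `build_tmux_command` and `start_server` are assumptions for illustration, not something from this thread. It also assumes `tmux` and `jupyter` are on the PATH of the remote machine (for example, because the virtual env is already activated).

```python
import shlex
import subprocess


def build_tmux_command(session="jupyter", port=8888):
    """Return the argv that starts a Jupyter server in a detached tmux session.

    `session` and `port` are illustrative defaults (assumptions).
    """
    # Command that runs inside tmux; --no-browser because the remote host is headless.
    server_cmd = ["jupyter", "notebook", "--no-browser", f"--port={port}"]
    # `tmux new-session -d -s NAME CMD` creates the session without attaching to it.
    return ["tmux", "new-session", "-d", "-s", session, shlex.join(server_cmd)]


def start_server(session="jupyter", port=8888):
    """Launch the server; raises CalledProcessError if tmux reports a failure."""
    subprocess.run(build_tmux_command(session, port), check=True)


if __name__ == "__main__":
    start_server()
    print("Started. Attach with `tmux attach -t jupyter` to see the server URL.")
```

The tmux session ends when the Jupyter server exits, so a server that fails to start leaves no session behind to attach to.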

nick-youngblut commented 3 months ago

You are running the jupyter kernels on a remote server? In my case, I'm using a slurm cluster, so the workflow is as follows:

If VS Code loses the ssh connection to the particular node, all of the Jupyter kernels die, and then one must re-run the notebooks once they re-establish the ssh connection.

vishu-tyagi commented 3 months ago

When you ssh into the node, how are you starting the jupyter server? This is my workflow (might help).

1. In the VS Code command palette, run Remote-SSH: Connect to Host...
2. cd into the directory where my_notebook.ipynb is saved
3. $ tmux
4. $ source .venv/bin/activate (activate the virtual env)
5. $ jupyter notebook
6. Copy the provided URL
7. $ tmux detach
8. $ code my_notebook.ipynb (opens the notebook in VS Code)
9. Command palette >> Notebook: Select Notebook Kernel >> Existing Jupyter Server >> enter the URL

Then you can do some work in your notebook and save it, disconnect, and ssh in again. You should see your saved work. I haven't tried this on a cluster, but I don't think it should be any different.
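One way to check whether kernels actually survived a disconnect is to ask the Jupyter server itself after reconnecting. The sketch below is a rough illustration, not part of the workflow above: it splits the copied server URL into a base URL and token, then queries the server's `/api/kernels` REST endpoint. The helper names and the example URL are assumptions, and it assumes the server is served from the root path and uses token authentication.

```python
import json
import urllib.request
from urllib.parse import parse_qs, urlsplit


def split_server_url(url):
    """Split a copied URL like http://localhost:8888/tree?token=... into (base, token)."""
    parts = urlsplit(url)
    token = parse_qs(parts.query).get("token", [""])[0]
    # Assumption: the server lives at the root path, so scheme + host:port is the base.
    return f"{parts.scheme}://{parts.netloc}", token


def list_kernels(base_url, token, timeout=10):
    """Return the list of kernel records the server reports at /api/kernels."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/kernels",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(kernels):
    """One line per kernel: id and execution state (e.g. idle or busy)."""
    return [f"{k['id']}: {k.get('execution_state', 'unknown')}" for k in kernels]


if __name__ == "__main__":
    # Placeholder URL; paste the one printed by `jupyter notebook`.
    base, token = split_server_url("http://localhost:8888/tree?token=REPLACE_ME")
    for line in summarize(list_kernels(base, token)):
        print(line)
```

If a kernel that was running before the disconnect still shows up here, the server kept it alive; an empty list means there are no running kernels on that server.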

nick-youngblut commented 3 months ago

So you are not starting up the Jupyter notebooks directly in VS Code. With VS Code, you don't have to run jupyter notebook as a separate process. It appears that you are running the Jupyter server outside of VS Code and then just interacting with the notebooks via VS Code.

DonJayamanne commented 3 months ago

@nick-youngblut @vishu-tyagi Please upvote this issue: https://github.com/microsoft/vscode-jupyter/issues/3998. Closing as a duplicate.

nick-youngblut commented 3 months ago

@DonJayamanne sorry for not seeing that other issue. It has quite a few upvotes. Using Jupyter through VS Code is great (e.g., easily enabling Copilot), but this issue with non-persisting kernels really limits what one can do with VS Code + Jupyter.

DonJayamanne commented 3 months ago

> sorry for not seeing that other issue. It has quite a few upvotes

@nick-youngblut Absolutely no need to apologize