microsoft / vscode-jupyter

VS Code Jupyter extension
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
MIT License

Syncing local python project files with remote Jupyter server #1601

Open brunocous opened 4 years ago

brunocous commented 4 years ago

Feature: Notebook Editor, Interactive Window, Python Editor cells

Description

Microsoft Data Science for VS Code Engineering Team: @rchiodo, @IanMatthewHuff, @DavidKutu, @DonJayamanne, @greazer

Context

VSCode allows users to connect with a running remote Jupyter server (https://code.visualstudio.com/docs/python/jupyter-support#_connect-to-a-remote-jupyter-server). Using the Jupyter API it is able to start a kernel and execute cells of a locally saved notebook remotely.

Use case/problem

If you want to, for example, call a function from another Python file (foo.py) from that notebook (local-notebook.ipynb), the remote kernel can't access the local file. It can only access files saved on the remote notebook server. The problem is that local (Python) files are not synced to the remote Jupyter server instance, so the remote Python interpreter (kernel) cannot access them.

Existing solutions

The standard way of achieving this is rsync over SSH. However, this requires managing an SSH connection and SSH keys (which large enterprise servers do not necessarily allow). There are workarounds (manually uploading files through the notebook UI, or using git), but these slow down development and iteration.

Proposal

Extend the VSCode Python extension to allow users to sync files with a remote running Jupyter notebook server. Under the hood, the Jupyter contents API can be used for this:

Authentication and authorization are handled through the API token that you need anyway to connect.

No SSH, git or manual hassle required.

Additionally, you can execute your local code (by calling it through the notebook) remotely without having to manage a remote Python SSH interpreter, or docker images. All you need is a running jupyter notebook.
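As a rough illustration of the proposal above, here is a minimal sketch of how a single local text file could be pushed through the Jupyter Server contents API (a PUT to /api/contents/<path> with the token in the Authorization header). The server URL, token and file names are placeholders, and build_upload_request is a hypothetical helper written for this example, not something the extension provides.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path


def build_upload_request(base_url, token, local_path, remote_path):
    """Build (but do not send) a contents-API request that uploads a text file.

    Hypothetical helper: base_url, token and both paths are supplied by the caller.
    """
    content = Path(local_path).read_text(encoding="utf-8")
    # File model the contents API accepts for a plain-text file.
    model = {"type": "file", "format": "text", "content": content}
    url = f"{base_url.rstrip('/')}/api/contents/{urllib.parse.quote(remote_path)}"
    return urllib.request.Request(
        url,
        data=json.dumps(model).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


# Usage (placeholders; needs a running Jupyter server to actually send):
# req = build_upload_request("http://localhost:8888", "<token>", "foo.py", "foo.py")
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

A real sync would also need to walk the project folder, create missing remote directories, and skip unchanged files. None of that is shown here.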

IanMatthewHuff commented 4 years ago

@brunocous Thanks for the detailed suggestion. I do feel that this would be an interesting suggestion to consider. We'll discuss it at our triage meeting.

brunocous commented 4 years ago

Any news on this, or how can I help?

brunocous commented 4 years ago

Btw, there are other workarounds to this when you use a cloud provider anyway in your project (S3, data storage, blob,...).

For example, using AWS S3 you could run aws s3 sync to sync local files to an S3 bucket, and then use a Jupyter plugin (like https://github.com/uktrade/jupyters3) to sync that S3 bucket with Jupyter. Something similar probably exists for other cloud providers.

CmdQ commented 3 years ago

I'd find this feature highly useful. The other ways are always cumbersome.

fra-luc commented 3 years ago

I'd like this very much as well.

GF-Huang commented 3 years ago

Any progress?

rchiodo commented 3 years ago

This is something we're investigating. Not sure if or when we'll release it though.

205g0 commented 3 years ago

Just found this issue, and indeed, this feature would make the entire experience more well-rounded.

On an abstract level, the Jupyter remote server is just a dumb number cruncher I use because:

The current separation is jarring (one file local but rest remote) and breaks an unmatched feature of VS Code.

Would love to see an update on this

brunocous commented 3 years ago

I'm already pleased that the VS Code team is even considering this feature. I filed the same feature request for PyCharm some time ago, but there was not a single response or action (https://youtrack.jetbrains.com/issue/PY-42649). So good job, VS Code team!
Even if this feature gets a green light, it will take some non-trivial dev effort. For now, there are some suboptimal workarounds that give you the same result (rsync, scripts that do something with git, syncing with any cloud blob or object store, etc).

tsuga commented 2 years ago

Is there any update for this long-wanted feature request? Or is it dead in the water?

I find @brunocous's proposal in https://youtrack.jetbrains.com/issue/PY-42649 (copied below for your convenience) promising.

Proposal Extend the Pycharm Jupyter extension to allow users to sync files with a remote running Jupyter notebook server. Under the hood, the Jupyter contents API can be used for this:

rchiodo commented 2 years ago

@tsuga you can track our iteration plans here: https://github.com/microsoft/vscode-jupyter/issues/7008

Every month our plans will show up as a pinned item at the top of our issues.

Additionally things we plan on working on in the next month or so will have a milestone appended.

This item is on neither, so it's not on the radar at the moment. It would likely need more upvotes to push up the queue of stuff we're looking at.

tsuga commented 2 years ago

@rchiodo Thank you for your follow up! Where can we upvote this?

rchiodo commented 2 years ago

> @rchiodo Thank you for your follow up! Where can we upvote this?

At the top. The upvotes under the main description are tracked as 'votes' for an item.

alfredodeza commented 2 years ago

This feature is critical for considering an Azure remote compute instance a viable option. I understand that it is not in any current plans, but I would love to see this one move along.

In the meantime, would it be possible to have some sort of recommendation on how to sync local files to a remote instance? That would help alleviate the problem of not having something built-in. I'm happy to contribute documentation on it if that needs to happen.

rchiodo commented 2 years ago

@alfredodeza thanks for the upvote. There's no recommended way to sync files other than for you to put them in the same folder where you started the remote server (that will make the relative paths work correctly).

jcnelson30 commented 1 year ago

Does anyone have a workaround, for example how they set up rsync to accommodate this issue?

I enjoy the data interaction of Jupyter notebooks but not being able to import some of my shared python code makes development extremely tedious.

I have a lot of floating "old-function-versions" due to having to paste each function directly into the Jupyter notebook to execute my long running tasks on my server w/ a beefy gpu

nttoan26 commented 9 months ago

this feature will be very helpful

DonJayamanne commented 9 months ago

@nttoan26 would it be useful if you could just edit the remote files? I.e. assume we displayed a file explorer that showed all of the remote files and you could edit them in VS Code. However, this means that a local file would not get uploaded/synced. Instead, you would create the file directly in the remote file explorer.

Would that work?

Would this address your needs? https://github.com/microsoft/vscode-jupyter/issues/1366

AngeValli commented 8 months ago

I think this idea would work. The need here is to have the file explorer in line with the remote Jupyter kernel used in VSCode. For the moment, you have to open a separate SSH connection to see the remote server's files in the file explorer, in addition to connecting to the remote Jupyter kernel (for example using JupyterHub's REST API token). Using the API token to access remote files in the file explorer would solve the issue.
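To make the token-based file explorer idea concrete, here is a small sketch, assuming the remote server exposes the standard Jupyter Server contents API, where a GET on /api/contents/<dir> returns a directory model whose "content" field lists the entries. The helper names, base URL and token are hypothetical. On JupyterHub the base URL would typically also include the per-user prefix.

```python
import urllib.parse
import urllib.request


def build_list_request(base_url, token, remote_dir=""):
    """Build (but do not send) a GET request for a remote directory listing."""
    url = f"{base_url.rstrip('/')}/api/contents/{urllib.parse.quote(remote_dir)}"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def entries_from_model(model):
    """Return sorted (path, type) pairs from a decoded directory model."""
    if model.get("type") != "directory":
        raise ValueError("expected a directory model")
    return sorted((item["path"], item["type"]) for item in model["content"])


# Usage (placeholders; needs a running server to actually send):
# import json
# req = build_list_request("http://localhost:8888", "<token>", "src")
# with urllib.request.urlopen(req) as resp:
#     print(entries_from_model(json.load(resp)))
```

The same token used for the kernel connection authorizes the listing, so no separate SSH session is involved in this sketch.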