databrickslabs / jupyterlab-integration

DEPRECATED: Integrating Jupyter with Databricks via SSH

Read local files from notebooks connecting to Databricks via SSH #17

Closed · tsuting closed this 3 years ago

tsuting commented 4 years ago

Hello, I was wondering if there is any way for notebooks connected to Databricks via SSH to read files on the local machine.

I have a YAML file and a notebook on my local machine. I opened the notebook from JupyterLab connected to Databricks via SSH, and the notebook tried to read the YAML file, but it failed because the YAML file does not exist on Databricks. The workaround I could figure out is uploading the YAML file to DBFS so that the notebook can read it from there. Is there a better way to do this?

Thanks.

bernhard-42 commented 4 years ago

Sorry for the late answer, I've been on vacation. Agreed, this is one of the annoyances of working on a remote cluster with Jupyter: the notebooks themselves are nicely local, but every file read happens on the remote Python kernel, which therefore cannot see local files. I don't have a nice solution for that, unfortunately. I also use the Databricks CLI to copy this type of file to DBFS.
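For reference, a minimal sketch of that workflow (the file names and DBFS paths here are placeholders, not anything specific to this repo): copy the file up with the Databricks CLI, then read it through the `/dbfs` FUSE mount from the remote kernel.

```bash
# on the local machine: upload the local YAML file to DBFS
databricks fs cp ./config.yaml dbfs:/FileStore/config.yaml --overwrite
```

```python
# in the notebook (runs on the remote Python kernel):
# read the uploaded file through the /dbfs FUSE mount
import yaml

with open("/dbfs/FileStore/config.yaml") as f:
    config = yaml.safe_load(f)
```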

obar1 commented 3 years ago

Hi @tsuting,
If you are using Databricks on Azure, you can sync the local files to blob storage (naively, just run `azcopy sync ...` in a bash loop every N seconds), so every local change will end up there.

You can then mount the blob container in DBFS and read the files from the mount.
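A rough sketch of that setup, assuming placeholder account, container, and mount names, and that you have a SAS token locally and a storage key in a Databricks secret scope:

```bash
# on the local machine: naively re-sync the working directory every 30 s
# (myaccount, mycontainer, and $SAS_TOKEN are placeholders)
while true; do
  azcopy sync "./workdir" \
    "https://myaccount.blob.core.windows.net/mycontainer?${SAS_TOKEN}" --recursive
  sleep 30
done
```

```python
# on the Databricks side (one-time): mount the container into DBFS
# (the secret scope and key names are placeholders; use your own)
dbutils.fs.mount(
    source="wasbs://mycontainer@myaccount.blob.core.windows.net",
    mount_point="/mnt/localsync",
    extra_configs={
        "fs.azure.account.key.myaccount.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-key")
    },
)

# any notebook on the cluster can then read the synced files, e.g.
with open("/dbfs/mnt/localsync/config.yaml") as f:
    print(f.read())
```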