jupyter / help


Serious security issue for using Jupyter notebook server on a multi-user cluster #138

Open lsuhpchelp opened 7 years ago

lsuhpchelp commented 7 years ago

Dear Jupyter Developers:

We manage a multi-user HPC cluster. Several users have asked to run a Jupyter notebook server on the cluster and access it from a browser on their laptops. Connecting the browser to the server through an ssh tunnel is relatively simple, following the procedure described at this link (http://ipyrad.readthedocs.io/HPC_Tunnel.html).

However, this procedure is not secure. To use a Jupyter notebook server on an HPC cluster, the user first logs in to a login node (hpc_login_node) via ssh, then gets onto a compute node by submitting an interactive job. Once on a compute node (e.g. computenode001), the user starts the Jupyter notebook server with the following commands:

using port 8181 for ssh tunneling

computenode001$ NOTEBOOKPORT=8181

open a tunnel between compute and login nodes on port 8181, $PBS_O_HOST is the login node loginnode1

computenode001$ ssh -N -X -f -R $NOTEBOOKPORT:localhost:$NOTEBOOKPORT $PBS_O_HOST

launch notebook

computenode001$ jupyter notebook --no-browser --port=$NOTEBOOKPORT --notebook-dir=$PWD

On the laptop, the user opens a second ssh tunnel that forwards port 8181 on hpc_login_node to local port 8181:

user@local$ ssh -N -L 8181:localhost:8181 user@hpc_login_node.edu

After this ssh tunnel is created, the user can then open a browser to http://localhost:8181 to access the notebook server on the cluster.

This procedure works. However, any other user logged in to the cluster can reach port 8181 on hpc_login_node by creating the same ssh tunnel, and can then access that user's notebook server. Through the notebook's terminal (http://localhost:8181/terminals/1) they can also access that user's folders. This creates a serious security problem within the cluster, as users can read, change, and delete other users' files with no restrictions.
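One way to check whether a tunneled server is exposed like this is to request a REST endpoint without any credentials. The sketch below is an illustration and not part of the original report. It assumes the server is reachable through the tunnel, that the notebook serves `/api/contents`, and that an unprotected server answers 200 while a protected one answers 401 or 403. The function names `classify_status` and `probe` are hypothetical.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an unauthenticated request to a verdict."""
    if status == 200:
        return "open"        # content was served without any credentials
    if status in (401, 403):
        return "protected"   # a token or password is required
    return "unknown"


def probe(base_url, timeout=5.0):
    """Request /api/contents with no token or cookie and classify the reply."""
    url = base_url.rstrip("/") + "/api/contents"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # Error statuses (e.g. 403) arrive as exceptions from urlopen.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

With the tunnel from the report in place, `probe("http://localhost:8181")` would be the call to try. A result of "open" suggests that anyone who can reach the port can use the server.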

We could advise users running a notebook server to protect it with a password, as described in http://jupyter-notebook.readthedocs.io/en/latest/public_server.html#securing-a-notebook-server. However, the password is not enabled by default: the server currently starts with no password, so new users are likely to run a password-less notebook server, which leaves their accounts extremely vulnerable. As administrators, we would therefore have to remove Jupyter notebook from the Anaconda installation.

Is there any way to enforce a password, so that we can allow users to run the server on our cluster?

Thanks a lot for your help!

takluyver commented 7 years ago

This was actually something that we changed in notebook version 4.3. By default, when you launch the notebook server, it generates a random token which the browser needs to authenticate. Users starting these new versions of the server will see a message like this in the terminal:

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8889/?token=91d34f6a52fd15b003adaf6ff156b50d359d930e3d61be89

The cross-user access issue you describe is one of the key reasons we implemented this. It can be something of an inconvenience, however; we try to minimise this by passing the token to the browser we launch, but that doesn't help in your case, where the notebook server is running on a machine elsewhere.
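When the server runs remotely and is reached through an ssh tunnel, the printed URL has to be carried over to the local browser by hand. If the local end of the tunnel uses a different port from the one in the printed URL, the port needs swapping as well. The following is a minimal sketch of that rewrite, assuming the URL has the `?token=...` form shown above; the helper name `tunneled_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def tunneled_url(printed_url, local_port):
    """Return the printed login URL with its port replaced by the tunnel's local port."""
    parts = urlsplit(printed_url)
    if "token" not in parse_qs(parts.query):
        raise ValueError("no token found in URL")
    host = parts.hostname or "localhost"
    netloc = f"{host}:{local_port}"
    # Path, query (including the token), and fragment are kept unchanged.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For the tunnel described at the top of this issue, where both ends use port 8181, the printed URL can be pasted unchanged.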

If users manually set up a password, this replaces the token authentication.

lsuhpchelp commented 7 years ago

Hi, takluyver:

Thank you for your response. I installed the new version of Anaconda from here:

https://repo.continuum.io/archive/Anaconda3-4.3.0-Linux-x86_64.sh

Then I updated jupyter-notebook with

python -m pip install notebook --pre --upgrade

so that Jupyter notebook is now at version 4.4.1:

    [fc@node002 fc]$ jupyter-notebook --version
    4.4.1
    [fc@node002 fc]$ jupyter-notebook --no-browser
    [I 14:26:08.629 NotebookApp] Serving notebooks from local directory: /project/fc
    [I 14:26:08.629 NotebookApp] 0 active kernels
    [I 14:26:08.629 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
    [I 14:26:08.629 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

It seems the random token is not used in version 4.4.1? Am I missing a step here?

Thanks again for your help!

Feng

lsuhpchelp commented 7 years ago

In addition, if the token is enabled by default in Anaconda, is it possible for a 'smart and lazy' user to get around it, which would bring back the cross-user access threat?

Feng

takluyver commented 7 years ago

I'm not sure what's going on there. There's no special procedure to use token auth. Do you have any config files that might be affecting it? jupyter --paths will show you where config files might be. I've just tried with 4.4.1, and it shows me:

[I 20:54:15.960 NotebookApp] The Jupyter Notebook is running at: http://localhost:8889/?token=6210d50a32b5b73afaac12b2d7a0c3280efa2083ee95802e

It is possible to disable the token authentication mechanism; we endeavour to make it clear that this is a bad idea. Even if there was no official way to do it, a sufficiently smart & lazy user could patch out the code that checks tokens.
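To follow up on the config-file question, one quick check is to scan the config directories for lines that set the token or the password. This is a rough sketch, not an official tool. It assumes the directories are supplied by hand (for example, copied from the `config` section of `jupyter --paths`), it only looks at `jupyter_notebook_config.py`, and the helper name `find_auth_overrides` is hypothetical.

```python
import re
from pathlib import Path

# Matches uncommented assignments such as: c.NotebookApp.password = '...'
AUTH_SETTING = re.compile(r"^\s*c\.NotebookApp\.(token|password)\s*=")


def find_auth_overrides(config_dirs):
    """Return (file, line number, line) for each token/password assignment found."""
    hits = []
    for directory in config_dirs:
        path = Path(directory) / "jupyter_notebook_config.py"
        if not path.is_file():
            continue  # this directory has no notebook config file
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if AUTH_SETTING.match(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

For example, `find_auth_overrides([Path.home() / ".jupyter"])` would list any such assignments in the per-user config file. Settings passed on the command line or stored in other config formats are not covered by this sketch.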

lsuhpchelp commented 7 years ago

Hi, takluyver:

Thanks for your response. Sorry, I checked my "~/.jupyter/jupyter_notebook_config.py" and found that I had added a password myself last night, which disabled the token :-O

When you say patch out the code that checks the tokens, do you mean changing the code in the Anaconda Python installation?

Thanks,

takluyver commented 7 years ago

Yep, I did mean editing that code. As it's written in Python, it's not hard to edit the code, and you don't need to compile it or anything. But maybe if you're providing Anaconda for all users then it's not in a user-editable location?

However, to be clear, it is also possible to disable token authentication from a config file. We try to funnel people who want to get rid of the token to using a password instead, but if they really want to turn off all security, they can.

lsuhpchelp commented 7 years ago

Hi, takluyver:

Yes, the Anaconda installation on our cluster is not editable, so this is not our primary concern at the moment. I did find that an easy way around the password is to set an empty one in c.NotebookApp.password = 'xxx', though that may not be something we can control anyway.

Overall, as system administrators we would like Jupyter notebook to be single-user only, since cross-user access will certainly create security issues in a multi-user environment. I saw the JupyterHub project (https://github.com/jupyterhub/jupyterhub); is it aimed at multi-user environments? Going forward, will Jupyter notebook be designed for single-user use only, with JupyterHub for multiple users?

takluyver commented 7 years ago

Yes, the notebook package is intended for single user use, and jupyterhub is for multi-user servers. The latter is a server that runs and handles user authentication, starting individual notebook servers for each authenticated user, and connecting them with a proxy.

One way to discourage your users from disabling the token security would be for you to run Jupyterhub. It can be integrated with your organisation's single-sign-on system if you have one, or with e.g. Github accounts using Oauth. If that's the easiest way for users to run a notebook on the cluster, they're unlikely to try to run insecure notebook servers themselves. It would, however, require some setup and maintenance work from you. If you're interested, @minrk can provide info on that.

prakritigupta commented 7 years ago

I found a similar issue. I accidentally stumbled upon another user's notebook server and was able to access their files. After reading this post, I immediately upgraded my Anaconda, and I can now access my running notebook server using the token generated by Jupyter. Running a JupyterHub would be really helpful for users like me, who spent almost two hours trying to figure out how to bridge the gap from my local machine to the login node to the compute node.

Ronaldomata34 commented 5 years ago

I have set up Jupyter on EC2 and can access it through a URL, but when the process stops on my machine, it no longer responds. How do I keep Jupyter running all the time, so that I can always access it through an EC2 URL?