jupyter / docker-stacks

Ready-to-run Docker images containing Jupyter applications
https://jupyter-docker-stacks.readthedocs.io

Multi users #1897

Closed lelejill closed 1 year ago

lelejill commented 1 year ago

What docker image(s) is this feature applicable to?

all-spark-notebook, datascience-notebook

What changes are you proposing?

I am running the datascience-notebook image to host a JupyterLab server for multiple users. Say user Adam is working on notebook A; when user Bob then opens JupyterLab, he sees Adam's notebook A (and Adam's session). Is there a way for each user to see only their own sessions and notebooks when they access JupyterLab, or for each user to see and access only their own user directory?

Please feel free to change the label if it's not a bug, rather a feature instead. Thanks!

How does this affect the user?

User Bob can accidentally delete or modify the notebook that user Adam is working on.

Anything else?

No response

mathbunnyru commented 1 year ago

I think it depends on how your spawner works. For example, if you use DockerSpawner, it will spawn single-user Jupyter Notebook servers in separate Docker containers on the same host and each one of them will have a notebook directory as a Docker volume on the host.

You can take a look at the simple example here: https://github.com/jupyterhub/jupyterhub-deploy-docker

So, it's definitely possible to not share data between users.

lelejill commented 1 year ago

Thanks for the feedback.

I actually deployed the Docker image on a Kubernetes cluster, so I wonder how this would apply in a K8s environment.

How do I specify the Notebook server image to spawn for users?

In this deployment, JupyterHub uses DockerSpawner to spawn single-user Notebook servers. You set the desired Notebook server image in a DOCKER_NOTEBOOK_IMAGE environment variable.

JupyterHub reads the Notebook image name from jupyterhub_config.py, which reads the Notebook image name from the DOCKER_NOTEBOOK_IMAGE environment variable:

# DockerSpawner setting in jupyterhub_config.py
import os
c.DockerSpawner.image = os.environ['DOCKER_NOTEBOOK_IMAGE']

mathbunnyru commented 1 year ago

I don't think DOCKER_NOTEBOOK_IMAGE is the thing you want. DOCKER_NOTEBOOK_IMAGE is used for specifying an image - most of the time users will use the same image, for example jupyter/datascience-notebook. At the same time, each user will run their own container.
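The distinction above (one shared image, one container per user) can be sketched in plain Python. This is only an illustrative model, not DockerSpawner's implementation: the helper names `notebook_image` and `plan_containers`, the fallback image, and the `jupyter-<username>` container naming are assumptions made for the example.

```python
import os

# Fallback image for this sketch; the thread mentions this image by name.
DEFAULT_IMAGE = "jupyter/datascience-notebook"


def notebook_image(env=None):
    """Return the image name from DOCKER_NOTEBOOK_IMAGE, or a default."""
    env = os.environ if env is None else env
    return env.get("DOCKER_NOTEBOOK_IMAGE", DEFAULT_IMAGE)


def plan_containers(usernames, image):
    """Map each user to their own container, all created from one image.

    The "jupyter-<username>" naming is a hypothetical scheme for this sketch.
    """
    return {
        name: {"image": image, "container": f"jupyter-{name}"}
        for name in usernames
    }


if __name__ == "__main__":
    plan = plan_containers(["adam", "bob"], notebook_image())
    for user, spec in plan.items():
        print(user, spec)
```

Both users get the same `image` value but different `container` values, which is why the image setting alone says nothing about whether their files are shared.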

What you're talking about is sharing the data. I think you should take a look at the volumes property of DockerSpawner instead: https://jupyterhub-dockerspawner.readthedocs.io/en/latest/data-persistence.html

Note: I'm not an expert in DockerSpawner.

mathbunnyru commented 1 year ago

I also suppose that, by default, DockerSpawner doesn't share data between users and the data is not persistent at all. So I advise you to start with a simple DockerSpawner configuration and then figure out how to make the data persistent.

mathbunnyru commented 1 year ago

Also, if you're mounting /home/jovyan or /home/jovyan/work from some shared location in k8s, this is the problem.
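For the Kubernetes case, the idea of giving each user a separate mount instead of a shared one might look like the sketch below. It only builds dictionaries in the shape of Kubernetes volume and volumeMount entries. The helper name `per_user_storage` and the `claim-<username>` / `volume-<username>` names are hypothetical, and it assumes usernames are already valid Kubernetes names (lowercase letters, digits and hyphens). Whether this deployment uses JupyterHub with KubeSpawner is not stated in the thread; if it does, entries of this shape are what KubeSpawner's `volumes` and `volume_mounts` options accept.

```python
def per_user_storage(username, mount_path="/home/jovyan/work"):
    """Build one volume and one volume mount that belong to a single user.

    Assumes `username` is already a valid Kubernetes name; real deployments
    usually need to sanitize it first.
    """
    claim_name = f"claim-{username}"    # hypothetical per-user PVC name
    volume_name = f"volume-{username}"  # hypothetical per-user volume name
    volume = {
        "name": volume_name,
        "persistentVolumeClaim": {"claimName": claim_name},
    }
    mount = {"name": volume_name, "mountPath": mount_path}
    return volume, mount


if __name__ == "__main__":
    for user in ["adam", "bob"]:
        volume, mount = per_user_storage(user)
        print(user, volume, mount)
```

Every user mounts storage at the same path inside their own pod, but each mount is backed by a different claim, so Adam's notebooks are not visible in Bob's session.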

lelejill commented 1 year ago

> Also, if you're mounting /home/jovyan or /home/jovyan/work from some shared location in k8s, this is the problem.

Exactly! Might the following setting be useful, so that each user has their own directory?

# Mount the real user's Docker volume on the host to the notebook user's
# notebook directory in the container
c.DockerSpawner.volumes = { 'jupyterhub-user-{username}': notebook_dir }

mathbunnyru commented 1 year ago

Yes, I think this is exactly the setting you need.
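To see why a `{username}` template in the volume name separates users, here is a minimal sketch of the expansion. It is a simplified stand-in, not DockerSpawner's code: the helper `expand_volumes` is hypothetical, the `notebook_dir` value is an assumption, and the real spawner also escapes characters in usernames that are unsafe in volume names, which this sketch skips.

```python
def expand_volumes(volumes, username):
    """Fill the {username} placeholder in each volume name for one user."""
    return {
        name.format(username=username): path
        for name, path in volumes.items()
    }


# Assumed notebook directory inside the container for this example.
notebook_dir = "/home/jovyan/work"
volume_template = {"jupyterhub-user-{username}": notebook_dir}

if __name__ == "__main__":
    for user in ["adam", "bob"]:
        print(user, expand_volumes(volume_template, user))
```

Each user ends up with a distinct named volume mounted at the same path inside their own container, so files saved by one user do not appear in another user's notebook directory.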

mathbunnyru commented 1 year ago

@lelejill were you able to solve your problem?

mathbunnyru commented 1 year ago

I think this is solved.