twiecki / pydata_docker_jupyterhub

Docker container with a PyData stack and JupyterHub server
https://registry.hub.docker.com/u/twiecki/pydata-docker-jupyterhub/
Apache License 2.0

data persistence and permissions issue #2

Open fdeheeger opened 9 years ago

fdeheeger commented 9 years ago

Thanks a lot for sharing this setup, nice thing to play with to start the new year.

I have permission issues when using the container with multiple users. I imagine the shared_nbs folder is there to share notebooks between users, but once I create a notebook as user A, I cannot even read it as user B. Another issue is that the first user creates the .ipynb_checkpoints folder under their own name, so B can create a new notebook in that folder but cannot save it.

You may have solved this in a different way, by setting up a data volume or a data-only container, which brings me to the second part of the question: how do you deal with data persistence?

twiecki commented 9 years ago

Thanks for checking it out!

The permissions thing is an oversight. We should change it to world readable and maybe change the umask of each user.
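To make the umask suggestion concrete, here is a minimal sketch of how the mask decides what other users can do with a freshly created notebook. The helper name `mode_of_new_file` and the file name `notebook.ipynb` are made up for illustration; nothing here is taken from the repository.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask: int) -> str:
    """Create a file under the given umask and return its permission string."""
    old = os.umask(umask)  # os.umask returns the previous mask
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "notebook.ipynb")  # hypothetical file name
            # Request 0o666, as a plain open() does; the umask clears bits from it.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.filemode(os.stat(path).st_mode)
    finally:
        os.umask(old)  # restore the original mask


for mask in (0o077, 0o022, 0o002):
    print(f"umask {mask:03o} -> {mode_of_new_file(mask)}")
```

With a mask of 077 the file is private to its owner, with 022 others can read but not write, and with 002 members of the file's group can also write. The directory's own permissions matter as well, which is probably what bites with .ipynb_checkpoints.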

Persistence is tricky. I have a solution that gives users access to GitHub from inside the NB, so users just commit and push to get persistence that way. But that's not ideal for every case. Would love ideas on this.

fdeheeger commented 9 years ago

Just to keep you informed about my investigations, which are still not working so far (sorry if you already know these Docker tricks). My idea is to get persistence on /opt/shared_nbs (at least; once this is working, I may want to store the users' folders too!)

To use a persistent shared_nbs folder, I found 2 ways (which ended up being very similar).

The difficulty lies in the permission management...

My first idea was to create a jupyter group (as you can see in the Dockerfile), assign new users to that group (in the add_user script), and use chmod g+s to force the jupyter group onto every file created in /opt/shared_nbs. It seems that the +s option doesn't propagate to the container (so if you want to try it, you don't actually need all those lines from the Dockerfile!). I looked into ACLs, but without much success (for now!)
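For reference, a small sketch of what chmod g+s on a directory is expected to do, and a way to check whether the bit actually survives. It runs against a throwaway temporary directory rather than /opt/shared_nbs, and the file name `demo.ipynb` is made up; it assumes a Unix filesystem and that the current user belongs to the directory's group.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as shared:
    # rwxrwsr-x: group-writable plus the setgid bit (the bit `chmod g+s` adds).
    os.chmod(shared, 0o2775)
    dir_stat = os.stat(shared)
    has_setgid = bool(dir_stat.st_mode & stat.S_ISGID)

    # A file created inside a setgid directory takes the directory's group.
    path = os.path.join(shared, "demo.ipynb")  # hypothetical file name
    with open(path, "w") as f:
        f.write("{}")
    file_gid = os.stat(path).st_gid

    print("setgid bit kept on directory:", has_setgid)
    print("file group matches directory group:", file_gid == dir_stat.st_gid)
```

Running the same check against the real folder inside the container might show whether the bit is really lost there. Note that setgid only controls which group owns new files; whether that group can write to them still depends on the umask.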

The other way is to do what you proposed: open the folder to the world and change the umask of the users. But I was not able to find what sets the default umask (we cannot put it in .bashrc as it is not loaded, nor in .profile; changing /etc/login.defs didn't help either, etc.). The problem may be in the container config, or in the way the environment is passed to the single-user Jupyter server (or in something I did wrong).
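One general fact that may help narrow this down: a process inherits the umask of its parent, independently of any shell startup files. The sketch below only demonstrates that inheritance with a plain child process. Whether the hub starts the single-user servers in a way that keeps the parent's mask is an assumption I have not verified.

```python
import os
import subprocess
import sys

# Set a group-friendly mask in the parent, remembering the old one.
previous = os.umask(0o002)
try:
    # The child reads its own mask: os.umask(0) returns the current value,
    # which the child then puts back before printing it.
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; m = os.umask(0); os.umask(m); print(oct(m))"],
        capture_output=True, text=True, check=True,
    )
    child_umask = result.stdout.strip()
finally:
    os.umask(previous)  # restore the parent's original mask

print("umask seen by child:", child_umask)
```

If that assumption holds, setting the mask in whatever process launches the notebook servers could be an alternative to .bashrc, .profile, or /etc/login.defs, none of which are read when no login shell is involved.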

ariddell commented 9 years ago

I think the permissions problem might be solved if you use the same image for the data container as described in: http://docs.docker.com/userguide/dockervolumes/#volume-def

fdeheeger commented 9 years ago

I have not tried it yet, but you are right, it might help keep the right uid/gid between both containers. But if you look at the first comment of this issue, I had permission problems even inside the single container (i.e. without trying to set up persistence). Any idea?

twiecki commented 9 years ago

@fdeheeger I think the VOLUME is overwriting the directory and its permissions. Can you try flipping the order?

I like the group permission idea. We can alter the add_user.sh script to assign new users to that group.