What if we were able to use the same docker image for both notebook and worker? This would simplify several things. The images are already similar in size. Presumably there would just need to be some environment variable to tell the pod whether to run the notebook startup script or the worker startup script.
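As a rough illustration of that idea, here is a minimal sketch of such a dispatching entrypoint, written in Python for illustration. The `PANGEO_ROLE` variable name and both startup commands are hypothetical placeholders, not anything the Pangeo images actually define.

```python
# entrypoint.py -- hypothetical sketch, not the actual Pangeo startup script.
# The PANGEO_ROLE variable and both commands are placeholders.
import os
import sys

def main():
    role = os.environ.get("PANGEO_ROLE", "worker")
    if role == "notebook":
        cmd = ["start-notebook.sh"]              # e.g. the Jupyter startup script in the image
    else:
        cmd = ["dask-worker", "--nthreads", "1"]  # e.g. the worker startup command
    # Replace this process with the chosen command, passing through any extra args.
    os.execvp(cmd[0], cmd + sys.argv[1:])

if __name__ == "__main__":
    main()
```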
I spent a long while trying to get this going on Azure, because "Azure File", which is SMB, allows ReadWriteMany access. I pretty much had it working, but because Azure File is SMB it doesn't do Unix permissions properly, and I gave up. And as far as I can tell, it is not possible to easily mount a drive with one pod as read-write, while allowing other pods read-only access. I believe it can likely be done using an NFS volume, but from what I can tell this might be fairly complicated. @yuvipanda is the expert's expert on this, and has many issues/threads discussing the use of NFS persistent volumes. If you want to have the workers share the same home directory as the notebook, I think getting an NFS solution figured out is probably the best route.
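For readers unfamiliar with the access-mode distinction being described, here is a minimal sketch (using the official `kubernetes` Python client) of the kind of `ReadWriteMany` claim an NFS-backed setup would need. The claim name, namespace, storage class, and size below are placeholders, not values from any Pangeo deployment.

```python
# Sketch only: create a ReadWriteMany PersistentVolumeClaim so that notebook
# and worker pods could mount the same (e.g. NFS-backed) volume together.
from kubernetes import client, config

config.load_kube_config()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="home-nfs"),          # placeholder name
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],                      # mountable by many pods at once
        storage_class_name="nfs-client",                     # placeholder storage class
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)
client.CoreV1Api().create_namespaced_persistent_volume_claim(namespace="pangeo", body=pvc)
```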
I support moving to a system where the notebook and worker use the same image.
To be clear, using the same image only gets us part of the way there. Their environment will be the same when they first start their first session, but any changes they make after that won't be preserved. In this issue I'm actually suggesting that, if we can get them to share a file system, we remove the conda environment from the docker image completely. I want us to get out of the game of determining what versions of software users run.
Isn't the Docker volume mechanism a good solution for this? See https://docs.docker.com/storage/volumes/#share-data-among-machines.
I don't know if there is a Google Cloud volume driver though, even though they talk about AWS and Azure in the doc.
We currently use the same image for the notebooks and workers.
Allowing access to the users home directories from the workers is on our roadmap, however our current volume type can only be mounted on one host so we need to switch to a different type and migrate existing home directories first.
I would like to move forward on this issue. @jacobtomlinson: can you share the Dockerfiles and related scripts which allow you to use a single image for both notebook and worker? Currently we have the following:
We are currently using this image https://github.com/informatics-lab/singleuser-notebook for both.
It is a very bloated image (3GB compressed and 10GB unpacked) but that isn't because we are using it for both, it's because we have lots of extra stuff in it. A task on my todo list is to take the pangeo notebook image and add our extra stuff to it.
Looking at the two dockerfiles and preparation scripts I can't actually see that many differences. They are based on different images, but I imagine the notebook one is based on the miniconda one upstream so it probably just has some extra conda packages to add notebooks.
Each Dockerfile has an `apt-get` bit, a `conda` bit and a `pip` bit to install packages; these could be merged.
The notebook one then has some extra steps for populating the home directory with the example notebooks. This could be made optional via an environment variable, e.g. for the workers you set `PANGEO_UPDATE_EXAMPLE_NOTEBOOKS=False` in the dask-kubernetes worker template and add a check to the prepare script to skip over those steps if it is false.
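A minimal sketch of that guard, written here in Python for illustration (the actual prepare script may well be a shell script, and the clone target shown is only an example):

```python
# Sketch of the env-var guard described above.  The variable name comes from
# the comment; the repository URL and destination are illustrative only.
import os
import subprocess

def maybe_update_example_notebooks():
    # Worker pods set PANGEO_UPDATE_EXAMPLE_NOTEBOOKS=False, so skip the step there.
    if os.environ.get("PANGEO_UPDATE_EXAMPLE_NOTEBOOKS", "True").lower() == "false":
        return
    # Illustrative: pull the example notebooks into the user's home directory.
    subprocess.run(
        ["git", "clone", "--depth", "1",
         "https://github.com/pangeo-data/pangeo-example-notebooks",
         os.path.expanduser("~/examples")],
        check=False,
    )

if __name__ == "__main__":
    maybe_update_example_notebooks()
```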
I think that our notebook image is based off of Jupyter's base-notebook image.
Yeah, looks like it. Ours is based off the scipy one, which I think is based off the base-notebook and adds scipy and some other packages.
I assumed that all the jupyter images eventually tree up to miniconda, but it turns out they are starting from ubuntu and installing miniconda (which is basically what the miniconda image does anyway).
https://github.com/jupyter/docker-stacks/blob/master/base-notebook/Dockerfile
Either way I would just add any packages to the notebook image which are in the worker image and then try using the notebook image in place of the worker image. I expect it should Just Work™.
It will cause the example notebooks to be cloned onto all the workers at run time. But that can be removed later.
There are two ways to do this in the long run:
I think (1) is the easier option. AWS has EFS, Google has Cloud Filestore.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.
For those that come along this issue later, we are currently using the same docker image for workers and jupyter pods BUT they are NOT on a shared file system. Instead, the notebook image just tells dask-kubernetes to use the same image. This solves the problems related to maintaining two docker images but does not solve the problem of user updates to the runtime environment.
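For anyone curious what "tells dask-kubernetes to use the same image" can look like in practice, here is a hedged sketch using the classic dask-kubernetes `KubeCluster`/`make_pod_spec` API; the exact mechanism in the Pangeo images may differ. `JUPYTER_IMAGE_SPEC` is assumed here to be the environment variable KubeSpawner sets to the image the notebook pod itself is running, and the fallback image name is just an example.

```python
# Hedged sketch, classic dask-kubernetes API; the real Pangeo config may differ.
import os
from dask_kubernetes import KubeCluster, make_pod_spec

# Reuse the notebook pod's own image for the workers.
image = os.environ.get("JUPYTER_IMAGE_SPEC", "pangeo/pangeo-notebook:latest")

pod_spec = make_pod_spec(
    image=image,
    memory_limit="4G",
    memory_request="4G",
    cpu_limit=1,
    cpu_request=1,
)

cluster = KubeCluster(pod_spec)
cluster.scale(4)  # four workers, each running the same image as the notebook
```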
A year and a half into the future, how are you guys doing it now? Do you have new insight into the runtime environment problem?
I think the user-environment issue is largely unchanged.
For binder, we have a pretty decent setup: There's a draft blogpost on this (background and our solution) at https://docs.google.com/document/d/14m-TNi2R4VaTI0g2vy15LBRGDkur2B21wiAdrTt6nBg/edit?usp=sharing that will be published on the pangeo blog once we find time to finish it off.
That is really helpful, thanks
Many of the challenges we have today are about how best to update user software environments, support multiple software environments, etc.
On various video calls we have also discussed removing this concern by giving users more control over their environment, which would offload the burden from core maintainers.
There are a few challenges to this:
cc @yuvipanda and @jacobtomlinson about point 1