RhodiumGroup / docker_images

Docker images for Rhodium's jupyterlab deployments

more deployment updates #108

Closed bolliger32 closed 4 years ago

bolliger32 commented 4 years ago

Summary

Trying to pick up where @delgadom left off in #102, and also integrate a few things from #91.

#102 left us with the following problems. The checked ones should be fixed by this PR. The unchecked ones are not, b/c I'm not quite sure how to do them and/or I'm not sure if we need to do them before these changes get merged and deployed to compute.rhg:

Supersedes #101 (where we just changed the name of the google credentials environment variable). Closes #95

Features

In general, I did a few things:

  1. Rearranged the structure of the build process to remove some duplication between the notebook and worker dockerfiles
  2. Bypassed the intermediate netcdf images for both notebook and worker (as in #91)
  3. Switched to using the same Ubuntu base image for worker and notebook (as in #91; not sure this is 100% necessary, but back when the base images differed we were getting slightly different netcdf versions b/c the package repos for the two distributions were different)
  4. Unpinned every python package, let conda figure out the best spec, and then re-pinned everything
  5. Dropped the worker env on workers. Was there a reason we were using one? It frequently came up as a hassle, though not an insurmountable one if there was a reason we wanted a worker environment that we were always inside of, rather than just using the base env
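
The re-pinning in step 4 could look roughly like the sketch below. This is an illustration, not the script used in this PR: the helper name `pin_specs` is hypothetical, the package versions are made up, and it assumes the solved environment is dumped with `conda list --export`, which prints one `name=version=build` line per package after a few `#` comment lines.

```shell
# Hypothetical helper: turn `conda list --export` lines (name=version=build)
# into name=version pins, dropping the build string and any comment lines.
pin_specs() {
  awk -F= '!/^#/ && NF >= 2 { print $1 "=" $2 }'
}

# Example with canned input instead of a live conda environment
# (the versions below are placeholders for illustration only).
printf '%s\n' \
  '# platform: linux-64' \
  'numpy=1.17.3=py37h_0' \
  'xarray=0.14.0=py_0' | pin_specs
```

Against a real environment, something like `conda list --export | pin_specs` would give the list to paste back into the pinned spec after conda has solved the unpinned one.
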

bolliger32 commented 4 years ago

Right now, we're not initializing gcloud in a way that lets users run gsutil commands. We could add a simple line to notebook/prepare.sh like we do in worker/prepare.sh, but I didn't know if we actually want that, so I left it out for now.

I also think we might be able to drop some of the optional environment handling in notebook/prepare.sh. Are we going to pass an environment.yml file to the notebook somehow, or pass extra_conda_packages or extra_pip_packages? If not, we could drop that stuff for clarity, but it's not doing any harm staying in there.
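
For context, optional environment handling of this kind usually follows the pattern sketched below. This is a sketch of the general idea, not the contents of notebook/prepare.sh: the upper-case variable names, the `/opt/app/environment.yml` default path, and the `DRY_RUN` switch are all assumptions made for illustration.

```shell
# Sketch of optional environment handling in a prepare.sh-style entrypoint.
# ENV_FILE, EXTRA_CONDA_PACKAGES, EXTRA_PIP_PACKAGES and DRY_RUN are
# assumed names, not taken from the actual script.
run() {
  # Print the command instead of executing it when DRY_RUN=1.
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

prepare_env() {
  # Update the environment from a mounted environment.yml, if one exists.
  if [ -f "${ENV_FILE:-/opt/app/environment.yml}" ]; then
    run conda env update -f "${ENV_FILE:-/opt/app/environment.yml}"
  fi
  # Install extra packages requested through environment variables.
  # The variables are unquoted on purpose so they split into separate words.
  if [ -n "${EXTRA_CONDA_PACKAGES:-}" ]; then
    run conda install -y $EXTRA_CONDA_PACKAGES
  fi
  if [ -n "${EXTRA_PIP_PACKAGES:-}" ]; then
    run pip install $EXTRA_PIP_PACKAGES
  fi
}
```

If nothing ever sets these variables or mounts an environment file, all three branches are no-ops, which is why removing them would not change behavior.
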

delgadom commented 4 years ago

Back from the honeymoon. This looks awesome! Thanks so much.

Re your last comment - I think we can get rid of all of this in prepare.sh - I don't think there's any way that these would be used so delete away! Also, I'm a fan of not initializing gcloud in prepare.sh, since the user can set up different auth and store it in the home directory (e.g. with gcloud init) and we wouldn't want to manually override that with rhg-data credentials or something.

The helm chart milestones all pertain to simultaneous development tasks in the helm-chart repo, so aren't necessary for this PR - we just need to keep an eye on them.

Have you been able to deploy this? Which server? Excited to give this a spin :)

bolliger32 commented 4 years ago

@delgadom right on! Hope you got a good tan :) This should be up on both compute-test and test2. Give it a whirl! See if you can break something :)

bolliger32 commented 4 years ago

FYI - looks like geopandas is abandoning their cython branch in favor of integrating a new package called pygeos (geopandas/geopandas#1155). It's not quite ready yet, so we'll stick with the dev version for now.