2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

Split the Victor image and add corresponding profiles #2948

Open damianavila opened 1 year ago

damianavila commented 1 year ago

Context

Ref: https://github.com/2i2c-org/infrastructure/issues/2862#issuecomment-1668753442

Proposal

We decided to take the following actions:

  • Split the image into two: one for Python work (possibly based on a Pangeo Docker image) and one for the Linux desktop applications.
  • Offer profile options for both of these, so end users can pick one when they start the project

Updates and actions

No response

volcanocyber commented 1 year ago

Hi,

I want to start moving forward with creating multiple images (3 or 4 options). If there's anything I can do outside of making a few repo2docker repos (or other methods) to get this moving, please let me know.

damianavila commented 1 year ago

@jmunroe, can you help @volcanocyber with some recommendations, thanks!

SamKrasnoff commented 1 year ago

@damianavila / @jmunroe Any recommendations on where to start? Happy to take a non repo2docker approach if it makes things easier.

yuvipanda commented 1 year ago

Hey @SamKrasnoff and @volcanocyber!

What I'd love to do is minimize the number of images being maintained! We've been working on an upstream image with QGIS set up (https://github.com/jupyterhub/jupyter-remote-desktop-proxy/pull/51). So what I'd love for us to start with is adding a profile option that uses the upstream QGIS image (available at https://quay.io/repository/jupyter-remote-desktop-proxy/qgis) and seeing if that's enough for the QGIS stuff. If it is, we no longer have to deal with the Linux desktop in the existing image!

Then the next step would be to see if any of the Pangeo Docker images are 'good enough' as they are, particularly the pangeo-notebook image. You can look at the list of packages in it at https://github.com/pangeo-data/pangeo-docker-images/blob/master/pangeo-notebook/packages.txt. If more packages are needed, then I'd suggest we move the existing image (https://github.com/volcanocyber/VICTOR-notebook) to inherit from the pangeo-notebook image, and just add on whatever packages are needed.

So this way, the profiles would have two options:

  1. An upstream qgis image
  2. Either the upstream pangeo image, or an image that's inheriting from that.

How does that sound?
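
A rough sketch of what those two profile options could look like, written in the shape that KubeSpawner's `profile_list` setting takes. The image tags, display names and descriptions below are placeholders and not taken from this thread, and the hub's real configuration may express the same thing through Helm values instead of Python.

```python
# Hypothetical image references: the repositories are the ones linked above,
# but the "<tag>" parts are placeholders that must be pinned to real tags.
QGIS_IMAGE = "quay.io/jupyter-remote-desktop-proxy/qgis:<tag>"
PANGEO_IMAGE = "pangeo/pangeo-notebook:<tag>"


def build_profile_list(notebook_image: str, desktop_image: str) -> list:
    """Return two profiles in the shape KubeSpawner's `profile_list` expects."""
    return [
        {
            # Display names and descriptions are illustrative only.
            "display_name": "Python (Pangeo-based)",
            "description": "Upstream pangeo-notebook, or an image inheriting from it",
            "default": True,
            "kubespawner_override": {"image": notebook_image},
        },
        {
            "display_name": "Linux desktop (QGIS)",
            "description": "Upstream QGIS remote desktop image",
            "kubespawner_override": {"image": desktop_image},
        },
    ]


profiles = build_profile_list(PANGEO_IMAGE, QGIS_IMAGE)

# In a jupyterhub_config.py this list would be assigned as:
#   c.KubeSpawner.profile_list = profiles
for profile in profiles:
    print(profile["display_name"], "->", profile["kubespawner_override"]["image"])
```

Swapping the first profile from the upstream Pangeo image to a derived one would only change the `notebook_image` argument.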

volcanocyber commented 1 year ago

Generally, that sounds good, @yuvipanda. I would also like to add ParaView to the remote desktop proxy, as I believe the error was caused by versioning problems rather than by the graphics rendering library, as I had initially said. Is there a way to add ParaView (the conda package should work) to an image that inherits from the QGIS image? Also, should I fork the Pangeo image and make changes so we can test on staging?

Final thought: is it possible to inherit from the ml-notebook image (https://github.com/pangeo-data/pangeo-docker-images/tree/master/ml-notebook) for one of the versions, but remove the CUDA/NVIDIA packages, since we aren't using GPUs at this time?

volcanocyber commented 1 year ago

@damianavila & @yuvipanda, we seem to have a good path forward, but I'd like to clarify the above. The ideal scenario is ParaView + QGIS (in the desktop image), an ML-related notebook image (inherited from Pangeo with some adjustments), and a default image (likely inheriting from the Pangeo base as well), as this allows for compartmentalization without making any one image too large. Should I begin by just forking the Pangeo images I specified?

yuvipanda commented 1 year ago

@volcanocyber sorry, this seems to have fallen through the cracks a little bit.

Yes, I think you should inherit from the pangeo image and go from there. I don't think you should fork the repo, as that makes it extremely difficult to maintain long term. I would suggest just starting from https://github.com/2i2c-org/hub-user-image-template/, but using a Dockerfile that inherits from the image and adds your changes. Since removing packages in an inherited image doesn't actually reduce image size, I'd suggest just leaving them in.
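
To illustrate the point about image size, here is a toy model of layered images, assuming the usual behaviour where each build step adds a layer on top of the base and a deletion is recorded as a small marker in the new layer rather than shrinking the layers beneath it. All sizes are invented for illustration and say nothing about the real Pangeo images.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    description: str
    added_mb: int  # data this layer contributes; a layer can never be negative


def image_size_mb(layers: List[Layer]) -> int:
    """Total size is the sum of all layers, including every inherited one."""
    return sum(layer.added_mb for layer in layers)


# Hypothetical sizes, chosen only to make the arithmetic visible.
base = [Layer("inherited base image layers", 4000)]

# Inheriting and installing extra packages adds a layer on top.
with_additions = base + [Layer("install extra packages", 300)]

# Inheriting and uninstalling packages also adds a layer; it only hides files,
# so the base layers are still stored and pulled in full.
with_removals = base + [Layer("uninstall unused packages (deletion markers)", 1)]

print("base:", image_size_mb(base))
print("with additions:", image_size_mb(with_additions))
print("with removals:", image_size_mb(with_removals))
```

In this model the image with removals is never smaller than its base, which is why leaving unused packages in place costs nothing extra.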

Same for QGIS, although I think there will probably be changes in where that image lives and how it is maintained in the near future. Once you start on those, can you post links here to the repos you are using to build the images, so we don't lose track?

Thanks.

volcanocyber commented 1 year ago

@yuvipanda Ok, good to know. I think it would be best to make the pangeo-based image as a branch of the victor-notebook repo on this account. Would copying all the files from here and then adding apt/conda packages as needed be a valid option? For QGIS/Paraview/other desktop apps, is there any issue/repo I should be looking at to see how that is progressing?

Thanks!

yuvipanda commented 1 year ago

@volcanocyber I think you should instead create a new repo (avoids long-lived branches!), and also use the Dockerfile inheritance mechanism. https://github.com/yuvipanda/example-inherit-from-community-image is an example: it inherits from jupyter/scipy-notebook, but the same approach works just as well with the pangeo base images.

I gave a talk about this at JupyterCon too! https://docs.google.com/presentation/d/16V6ylmirxxBTlVq-tZ9dsFYCf1Abbl83Rv350Rc-DL4/edit#slide=id.p