2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

The configurator can over-ride hub configuration in confusing ways #1015

Open choldgraf opened 2 years ago

choldgraf commented 2 years ago

Background and proposal

Background

Right now we configure hubs via Z2JH YAML files, which are generally the source of truth for the current config state of a hub.

However, we also expose some configuration options via the hub configurator. This is useful to give hub administrators quick UI access to common options, rather than asking them to go via the standard PR-based route. This is particularly useful for updating Docker images for user environments.

Problem

However, the configurator also introduces a point of confusion in two ways:

Example

In the Pangeo case above, looking at their hub configuration makes it seem that they haven't updated their user image since mid last year:

https://github.com/2i2c-org/infrastructure/blob/e0cc30842a4f9f00c32de80eecf78e0ef6497719/config/clusters/pangeo-hubs/staging.values.yaml#L61-L64

However, their configurator page shows it is actually running the October 2021 image:

(image: screenshot of the configurator page)

This means that if there are problems with the Pangeo environment (as there recently were, as reported by @rabernat), then an engineer might be misled if they don't know to look at the configurator UI.

Solutions?

I am not sure what's the right way to address this, and would love feedback and ideas. A few random ideas that come to mind:

rabernat commented 2 years ago

Thanks for raising an important issue.

In order to avoid premature over-generalization, it's useful to recognize the problem in question was related to dask gateway. Dask gateway is somewhat unique in that it requires some synchronization between the client environment (possibly specified by the community representative) and the dask gateway server (specified only by 2i2c admins). I presume the problem here was that the client version was incompatible with the server version. There may be other use cases that require this type of coupling, but I can't think of any.

In general, as a "community representative", I would also be really happy to have a single source of truth for the hub image: the yaml file in github. That is what I was used to before with the Pangeo hubs. I only started using the configurator because that's what the 2i2c docs recommended. In our old Pangeo hub, I think we had automated testing before deployment to ensure dask gateway worked with each image update. I would gladly sacrifice the convenience of the configurator for more robust testing and validation of the configuration.

More generally, I would love to see us moving towards more flexible and configurable user environments. I am still hoping to one day see a drop-down menu of all the different tags from pangeo docker images in my 2i2c hub. 😍

choldgraf commented 2 years ago

@rabernat one challenge with this is that we're centralizing the configuration for many communities in one repository, so we cannot just give merge rights to community representatives, or else they'd be able to alter the configuration of other communities as well :-/ We could ask people to update a YAML file in a particular repository, but they would need to wait for a 2i2c admin to press the merge button... maybe that would still be an OK workflow?

sgibson91 commented 2 years ago

I did develop a GitHub Action that would open pull requests to update the values of the single-user image and also any images defined in profileList (something the configurator can't yet do!) https://github.com/sgibson91/bump-jhub-image-action

I don't think it's exactly what we want, but I think a lot of the code could be reused to get us partway towards this vision, at least to the point where the YAML is the single source of truth. At the minute, it also only works with Docker Hub, though I've been meaning to add a function so it works with quay.io too.
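To make the idea concrete, here is a minimal sketch of the kind of update such a bot would make before opening a PR. It is not the code from bump-jhub-image-action. It assumes the values have already been parsed into a plain dict laid out like Z2JH's `singleuser` section (`image.name`, `image.tag`, and `profileList[].kubespawner_override.image`). The function name `bump_image_tags` and the image names in the usage note are made up for illustration.

```python
import copy


def bump_image_tags(values, image_name, new_tag):
    """Return a copy of `values` in which every reference to `image_name` uses `new_tag`.

    `values` is assumed to be a dict shaped like the `singleuser` part of a
    Z2JH values file; real 2i2c configs nest this section more deeply.
    """
    updated = copy.deepcopy(values)
    singleuser = updated.get("singleuser", {})

    # Default single-user image: name and tag are separate fields.
    image = singleuser.get("image", {})
    if image.get("name") == image_name:
        image["tag"] = new_tag

    # Profile images: one "name:tag" string under kubespawner_override.
    for profile in singleuser.get("profileList", []):
        override = profile.get("kubespawner_override", {})
        name, _, _old_tag = override.get("image", "").rpartition(":")
        if name == image_name:
            override["image"] = f"{image_name}:{new_tag}"

    return updated
```

Calling `bump_image_tags(values, "example/notebook", "2021.10")` leaves `values` untouched and returns the updated copy, which a bot could then write back to the YAML file and propose as a PR. Untagged images in `profileList` are left alone in this sketch.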

abkfenris commented 2 years ago

A variant of that line of thinking, which I've seen in other projects, is a bot that will merge for users that don't otherwise have permissions. Another would be to have the configurator be a PR generator and merger.

Also, could CODEOWNERS be used to limit the scope of the repo that community representatives can merge to?

choldgraf commented 2 years ago

Another confusing part of the configurator that I just discovered: if you modify the image on a production hub via the configurator, then this does nothing to the staging hub. So you end up with a situation where prod and staging are now divergent in implicit ways.

I think that if there were a bot that auto-opened PRs to make sure that changes were consistent, that would help a lot.
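As a rough sketch of the check such a bot might run, the snippet below compares the effective settings of two hubs and reports where they differ. It assumes each hub's settings (YAML values plus any configurator changes) have already been collected into a flat dict. How to collect them is left open, and `diverging_settings` and the keys in the usage note are hypothetical.

```python
def diverging_settings(staging, prod):
    """Return {key: (staging_value, prod_value)} for every setting that differs.

    A key present on only one hub is reported with None for the other side.
    """
    diffs = {}
    for key in sorted(set(staging) | set(prod)):
        staging_value = staging.get(key)
        prod_value = prod.get(key)
        if staging_value != prod_value:
            diffs[key] = (staging_value, prod_value)
    return diffs
```

For example, passing `{"image": "example/notebook:2021.05"}` for staging and `{"image": "example/notebook:2021.10"}` for prod reports the `image` key with both values. A bot could turn a non-empty result into a PR or an issue comment.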

rabernat commented 2 years ago

Reading through this issue again carefully, I can't help but feel that the solution is what I described in https://discourse.pangeo.io/t/future-of-pangeo-cloud-i-binder-for-everything/1574

If we were to convert all our hubs into binderhubs, we would vastly simplify the job of the hub admins and remove the need for this redundancy in specifying the image. I feel like everything else will be an unsatisfactory workaround.

An alternative approach would be to use conda store to factor the environments out of the base hub configuration.

damianavila commented 2 years ago

I think there is an inherent tension here coming from the fact that we are empowering the users, through the configurator, to modify the config in the UI. If we want to keep that versatility without being involved in that loop (and without going all in on the idea @rabernat posted above, which is quite exciting, as I have said before in other places), we will definitely need something on our hubs retrieving the "composited" information and showing it somehow. I guess the configurator could somehow offer a REST API so we can retrieve/reconstruct that info... @consideRatio was talking about a REST API to do other interesting stuff here: https://github.com/yuvipanda/jupyterhub-configurator/issues/10.
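One way to picture the "composited" view is a merge that records where each value came from. The sketch below is only an illustration: it assumes both the YAML settings and the configurator overrides are available as flat dicts. It does not use any existing jupyterhub-configurator API, and the name `composite_config` is invented here.

```python
def composite_config(yaml_settings, configurator_overrides):
    """Merge configurator overrides on top of YAML settings, keeping provenance.

    Each entry records the effective value and its source. Overridden entries
    also keep the YAML value (None if the key was not set in YAML) so the
    shadowed setting stays visible.
    """
    composite = {
        key: {"value": value, "source": "yaml"}
        for key, value in yaml_settings.items()
    }
    for key, value in configurator_overrides.items():
        composite[key] = {
            "value": value,
            "source": "configurator",
            "yaml_value": yaml_settings.get(key),
        }
    return composite
```

An engineer debugging a hub could then scan for entries whose `source` is `"configurator"` to see at a glance which YAML values are being shadowed.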