pangeo-data / helm-chart

Pangeo helm charts
https://pangeo-data.github.io/helm-chart/

Archive and stop updating pangeo helm chart? #129

Closed scottyhq closed 3 years ago

scottyhq commented 4 years ago

In the interest of consolidating pangeo infrastructure and configuration, should we archive this repo and update pangeo-cloud-federation to no longer use the pangeo helm chart? This question has come up before on calls. I haven't looked through all existing issues, but if I understand correctly, once we fully embrace dask-gateway and drop dask-kubernetes, we can just rely on the upstream helm charts.

Pangeo-binder does not require the pangeo helm chart (https://github.com/pangeo-data/pangeo-binder/tree/staging/pangeo-binder). I've always found it confusing that the default config for the persistent hubs lives in two places: https://github.com/pangeo-data/helm-chart/blob/master/pangeo/values.yaml and https://github.com/pangeo-data/pangeo-cloud-federation/blob/staging/pangeo-deploy/values.yaml.
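To make the "two places" confusion concrete, here is a minimal Python sketch of how a setting can end up depending on both files. It is a simplified model of layered values (nested mappings merged, later layer wins), not Helm's actual implementation, and every key and value below is hypothetical rather than copied from either values.yaml.

```python
from copy import deepcopy


def merge_values(base, override):
    """Return a new dict with `override` layered on top of `base`.

    Nested dicts are merged key by key; any other value in `override`
    replaces the one in `base`. Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults, standing in for pangeo/values.yaml in this repo.
chart_defaults = {
    "jupyterhub": {
        "singleuser": {"memory": {"limit": "4G"}, "cpu": {"limit": 2}},
    },
}

# Hypothetical overrides, standing in for pangeo-deploy/values.yaml.
deploy_overrides = {
    "jupyterhub": {
        "singleuser": {"memory": {"limit": "8G"}},
    },
}

# The effective config mixes both layers: the memory limit comes from the
# deploy file, the CPU limit from the chart defaults.
effective = merge_values(chart_defaults, deploy_overrides)
print(effective)
```

To know the final value of any one setting, a reader has to check both layers, which is the confusion described above.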

Pros:

Cons:

@jhamman, @TomAugspurger, @rabernat, @tjcrone, @consideRatio, @yuvipanda, please chime in in case I'm overlooking something related to the helm chart history or configuration needs.

TomAugspurger commented 4 years ago

It'd be good to hear from other groups deploying pangeo on kubernetes. I think there's some value in having a helm chart that combines these two for you, for the simple case of "I want jupyterhub & dask".

That said, I wonder if all of the config settings in pangeo-deploy/values.yaml could be moved to this helm chart? If those are settings that we think are appropriate for all our hubs, then perhaps we can just set them here?

jhamman commented 4 years ago

This has come up in one form or another a few times in the past. We've generally landed on there being both social and technical benefits to keeping this around. The social benefit comes from being able to say, "I installed pangeo" (i.e. helm install pangeo). Having a thing, thin as it may be at the moment, is something that we can market and build on. The technical benefit is that we can, over time, build on the integration of dask-gateway and zero-to-jupyterhub.

jacobtomlinson commented 4 years ago

I agree with @jhamman about the social benefit of "I installed Pangeo". That said, Pangeo often feels more like the LAMP stack: it isn't an actual tool, but rather a collection of tools that together allow you to meet a certain goal.

This helm chart has always been a pretty thin layer on top of the Zero to JupyterHub chart. To continue the LAMP example, this is similar to the lamp-server^ metapackage that you can install on Ubuntu.

It's also interesting to compare this with the Dask chart. That chart gives you a single notebook server, Dask scheduler and some Dask workers which you can scale with a k8s deployment. It is aimed at a single user with access to a k8s cluster who wants to install a Jupyter/Dask environment for themselves.

This chart gives you JupyterHub with Dask Gateway, which feels like a more grown-up version of the Dask chart and is aimed at organisations and admins who want to provide a Jupyter/Dask environment for others.

I wonder whether we should provide a JupyterHub/DaskGateway meta chart in the Dask Helm repo. That part of Pangeo is pretty generic and would benefit other communities too.

Then the technical secret sauce of Pangeo becomes the conda environment stacks, which are the actual geoscience-specific part.

TomAugspurger commented 3 years ago

https://github.com/dask/helm-chart/tree/master/daskhub is available now, so we'll need to decide what to do here. I think deprecating the chart is probably best. We'll point most users to dask/daskhub.