dask / helm-chart

Helm charts for Dask
https://helm.dask.org/
91 stars 91 forks source link

Upstream Pangeo Helm Chart here #68

Closed jacobtomlinson closed 4 years ago

jacobtomlinson commented 4 years ago

The Pangeo community have a Helm Chart which installs both the zero2jupyterhub and dask-gateway Helm Charts as dependencies and configures them to work together.

This enables folks to set up a multi-user deployment of Jupyter and Dask for their group or organisation. It is effectively the next stage on from the single-user experience of the current chart in this repo.

There has been a discussion in that chart around consolidating it into the main Pangeo configuration to remove maintenance burden (pangeo-data/helm-chart#129).

My view is that the Pangeo chart as it stands does not have much, if any, Pangeo specific content in it. It is just a wrapper around Jupyter and Dask gateway's existing charts. Therefore I propose we upstream the chart here so that this repository contains both a single-user and multi-user chart.

Moving it here would remove the maintenance burden from the Pangeo folks and only add a small burden here as we already maintain this chart with automated PRs to update dependencies. It would also expose the chart to a wider audience.

I would be keen to get feedback from both the Dask and Pangeo communities on this proposal.

jhamman commented 4 years ago

Thanks @jacobtomlinson for opening this discussion. Just chiming in to say that I would support this such a move, if it finds support here and that, if a Dask JupyterHub/Gateway chart existed, we would gladly focus our development and maintenance energies here.

jacobtomlinson commented 4 years ago

ping @dask/maintenance

martindurant commented 4 years ago

I am generally in favour, but have to trust your assessment on how much of a maintenance burden this would be - I assume more than updating dask/distributed versions.

jacobtomlinson commented 4 years ago

@martindurant it is effectively a meta chart so will require updating the versions of the two charts it depends on, one of them being Dask Gateway so we maintain that anyway. I'm happy that this can also be automated.

martindurant commented 4 years ago

I am convinced

TomAugspurger commented 4 years ago

I think I'm also onboard with the general idea of us maintaining a multi-user Dask + Jupyter helm chart, powered by Dask-Gateway & Jupyterhub.

The helm chart itself probably needs some updating. Over time it's become fairly opinionated and tailored to pangeo's needs & deployments. For example, I think the tolerations at https://github.com/pangeo-data/helm-chart/blob/b1a230a88d2587713eb440c9de402770f6ae32e6/pangeo/values.yaml#L66-L87 effectively require a kubernetes cluster with node pools that are able to handle those.

scottyhq commented 4 years ago

thanks for opening this @jacobtomlinson. Just want to say i'm in favor of this and would be happy to dedicate some time for generalizing and testing to give the chart a new home! For starters, would this essentially be defined under a subfolder in this repository and installed as helm install --name my-release dask/dask-jupyterhub ?

jhamman commented 4 years ago

@TomAugspurger - small detail but this:

For example, I think the tolerations at https://github.com/pangeo-data/helm-chart/blob/b1a230a88d2587713eb440c9de402770f6ae32e6/pangeo/values.yaml#L66-L87 effectively require a kubernetes cluster with node pools that are able to handle those.

is not quite right. This just set's the worker/scheduler pod tolerations to match those of dask-kubernetes (ref). We may actually want to discuss this as a fixture in dask-gateway but it doesn't require cluster nodes to have a matching taint.

TomAugspurger commented 4 years ago

Thanks for the clarification.

On Mon, Jun 15, 2020 at 10:40 PM Joe Hamman notifications@github.com wrote:

@TomAugspurger https://github.com/TomAugspurger - small detail but this:

For example, I think the tolerations at https://github.com/pangeo-data/helm-chart/blob/b1a230a88d2587713eb440c9de402770f6ae32e6/pangeo/values.yaml#L66-L87 effectively require a kubernetes cluster with node pools that are able to handle those.

is not quite right. This just set's the worker/scheduler pod tolerations to match those of dask-kubernetes (ref https://github.com/dask/dask-kubernetes/blob/4a88095c4698eb1d0bbc93dee003a4a709c90df2/dask_kubernetes/objects.py#L216-L231). We may actually want to discuss this as a fixture in dask-gateway but it doesn't require cluster nodes to have a matching taint.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dask/helm-chart/issues/68#issuecomment-644512460, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKAOIVXTVB2D5HCZSRJVF3RW3SRHANCNFSM4NYSLUOQ .

TomAugspurger commented 4 years ago

I looked into this a bit yesterday. I think the biggest things to decide on are

  1. How to structure the repository (one chart, with flags to enable / disable jupyterhub / dask-gateway, or two separate charts)
  2. If we have two charts, what do we name them? helm install dask gives you the same thing as today (Dask with dask-kubernetes and a single jupyterlab / notebook server). helm install ... gives you JupyterHub + Dask Gateway for multiple users and clusters. Should we continue to call that "pangeo"? "dask-multiuser"? "daskhub"?
TomAugspurger commented 4 years ago

A couple additional things I'm struggling with, to make a helm install dask-multiuser thing work out of the box.

  1. Assuming we want to piggy back on JupyterHub auth, it's difficult to get this working out of the box. We need to set the value jupyterhub.hub.services.dask-gateway.url to the rendered value of http://traefik-{{ .Release.Name }}-dask-gateway.{{ .Release.Namespace }}. However, helm doesn't evaluate values in the chart's values.yaml. Does anyone know if there's an easy way to set a rendered value in a subchart? Everything I've tried has come up empty.

Alternatively, we don't proxy through a JupyterHub service, and instead expose the gateway publicly? Pangeo might be moving to that to open up some non-JupyterHub-based workflows. Then things are fine.

  1. (Less critical than 1): it'd be nice to configure the singleuser environment to have the right address & proxy address. I've had trouble with that for the same reason, they depend on values that I don't think are known the first time we do a helm install.
jacobtomlinson commented 4 years ago

My preference would be to not call it pangeo. One of the reasons we are moving it here is because it's pretty generic and will be useful to other communities. I like dask-multiuser and daskhub or dask-hub.

I would aim for one chart with configurable options for gateway or no gateway.

I think one way to do things would be to create a config map which gets rendered with the same template and then mount the config map as environment variables in order to configure things. Happy to help/pair on it.

TomAugspurger commented 4 years ago

create a config map which gets rendered with the same template and then mount the config map as environment variables in order to configure things.

I think I tried this earlier and failed. The rendering does work, but then I had trouble with mounting the configmap as a volume. I'll give it one more shot today and reach out if I have issues.

jacobtomlinson commented 4 years ago

You may have a better experience mounting the config options as environment variables rather than a volume.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#define-container-environment-variables-using-configmap-data

TomAugspurger commented 4 years ago

Indeed, I was just looking at that page and this looks much better. Trying it out now. JupyterHub didn't seem to like my first attempt:

      File "/usr/local/lib/python3.6/dist-packages/traitlets/traitlets.py", line 1524, in error
        raise TraitError(e)
    traitlets.traitlets.TraitError: The 'environment' trait of a PatchedKubeSpawner instance must be a dict, but a value of class 'list' (i.e. [{'name': 'DASK_GATEWAY__AUTH__TYPE', 'valueFrom': {'configMapKeyRef': {'key': 'DASK_GATEWAY__AUTH__TYPE', 'name': 'daskhub-config'}}}, {'name': 'DASK_GATEWAY__ADDRESS', 'valueFrom': {'configMapKeyRef': {'key': 'DASK_GATEWAY__ADDRESS', 'name': 'daskhub-config'}}}]) was specified.
jupyterhub:
  singleuser:
    extraEnv:
      - name: DASK_GATEWAY__AUTH__TYPE
        valueFrom:
          configMapKeyRef:
            name: daskhub-config
            key: DASK_GATEWAY__AUTH__TYPE
      - name: DASK_GATEWAY__ADDRESS
        valueFrom:
          configMapKeyRef:
            name: daskhub-config
            key: DASK_GATEWAY__ADDRESS
jacobtomlinson commented 4 years ago

Hmm that may be a limitation of KubeSpawner.

TomAugspurger commented 4 years ago

hub.extraEnv does note that it can take a list of EnvVar objects https://zero-to-jupyterhub.readthedocs.io/en/latest/reference/reference.html#hub-extraenv. I think that's at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/08c13609c1d0c6cb07d45d49d0a876100cf941eb/jupyterhub/templates/hub/deployment.yaml#L208-L210.

singleuser.extraEnv doesn't have any similar note, and I can't find where it would be handled.

TomAugspurger commented 4 years ago

@consideRatio does https://github.com/dask/helm-chart/issues/68#issuecomment-674861377 sound right to you? That there isn't a simple way to pass singleuser environment variables as a list of EnvVar objects?

I think one possible solution would be for jupyterhub to run the user-provided extraEnv through the tpl function to render templates, as discussed in https://hackernoon.com/the-art-of-the-helm-chart-patterns-from-the-official-kubernetes-charts-8a7cafa86d12. I think that would allow us to provide a default values.yaml like

jupyterhub:
  singleuser:
    extraEnv:
      DASK_GATEWAY__ADDRESS: "traefik-{{ .Release.Name }}-dask-gateway"

That would ensure that the singleuser pod gets the rendered form, rather than the template placeholders.

consideRatio commented 4 years ago

@TomAugspurger ah, I see nice that it was possible to use at all! There is an issue with using lists as they will get overridden rather then merged when you have two different configuration files specifying hub.extraEnv. And, since this is a Helm chart to be distributed for use by others, its very likely that users will end up thinking that they can configure hub.extraEnv themselves, but end up overriding your configuration.

I suggest making a inline comment of that issue for now, and awaiting an improvement in z2jh that allows hub.extraEnv to be a dict that can hold k8s EnvVar objects.

TomAugspurger commented 4 years ago

Ah, I didn't realize that it would be overwritten, but that does make sense.

I'll give an alternative method one more shot, and if that doesn't work then I'll add notes saying how to add additional hub.extraEnv values.

consideRatio commented 4 years ago

I read through the article and considered various options, and I think the way to go for a chart to be distributed for other users etc is to rely on a extraSomething being an object which keys represents the extraSomething elements to be added. While z2jh supports this, it assumes that you pass a string as a value rather than an entire EnvVar object which should also be made supported.

To conclude, fixing https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/1103 is what I would recommend. Hmm.. I'll go ahead and spend time doing that now.

TomAugspurger commented 4 years ago

Thanks for looking into https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/1103. I do think fixing that would generally be useful, but we may have a workaround in https://github.com/dask/helm-chart/pull/94 in this case :)

jhamman commented 4 years ago

Thanks @TomAugspurger for pushing this forward!