Helm chart: allow JUPYTERHUB_API_KEY to be mounted from specified k8s secret and key

consideRatio commented 2 years ago

Feature proposal summarized

I propose allowing the JUPYTERHUB_API_TOKEN environment variable for the gateway pod to be set not by passing an API token directly, but by passing the name of a k8s Secret resource and a key to read the value from within it. This is helpful because the JupyterHub helm chart, for example, supports automatic generation of JupyterHub API tokens that are stored in a k8s Secret. One could then simply reference that Secret without asking the helm chart user to manage the sensitive API token at any point in time.

In practice, the proposal is something like the following change to the deployment template, which adds optional configuration while keeping the current behavior as the default:

    {{- if (eq .Values.gateway.auth.type "jupyterhub") }} 
    - name: JUPYTERHUB_API_TOKEN 
      valueFrom: 
        secretKeyRef: 
+         name: {{ .Values.gateway.auth.jupyterhub.apiTokenFromSecretName | default (include "dask-gateway.apiName" .) }} 
+         key: {{ .Values.gateway.auth.jupyterhub.apiTokenFromSecretKey | default "jupyterhub-api-token" }} 
-         name: {{ include "dask-gateway.apiName" . }} 
-         key: jupyterhub-api-token 
    {{- end }}

A technical background

jupyterhub_api_token and jupyterhub_api_url are configuration options for the JupyterHubAuthenticator class. The dask-gateway Helm chart sets the JUPYTERHUB_API_TOKEN environment variable via a k8s Secret, which in turn gets its value from the helm chart config.

In Helm charts like daskhub that deploy jupyterhub and dask-gateway next to each other in the same k8s namespace, you are supposed to declare the same api token in both the dask-gateway helm chart config and the jupyterhub helm chart config, like this:

# a values.yaml file passed when using daskhub that
# depends on the jupyterhub and dask-gateway helm chart
jupyterhub:
  hub:
    services:
      dask-gateway:
        apiToken: "<secret-token>"

dask-gateway:
  gateway:
    auth:
      jupyterhub:
        apiToken: "<secret-token>"

The JupyterHub Helm chart can now automatically generate an API token for any service under hub.services and store it in a k8s Secret. If the dask-gateway helm chart could be made to reference that k8s Secret, it could mount the token from there directly, instead of first creating its own k8s Secret from the passed api token and then mounting that.

Currently though, there is no mechanism to provide a custom environment variable to the dask-gateway-server pod, or configure it to mount the JUPYTERHUB_API_TOKEN environment variable from another k8s Secret.

https://github.com/dask/dask-gateway/blob/bee9255e5ea0d77f456985cd91b2622bb3776dbb/resources/helm/dask-gateway/templates/gateway/deployment.yaml#L56-L63

I think a clean solution would be to allow this environment variable to be mounted from a manually specified k8s secret and key. That way, the daskhub helm chart for example, could simply specify this instead...

# possible future default values for a future version of daskhub that
# depends on the jupyterhub and dask-gateway helm chart, where
# the dask-gateway helm chart allows configuration of what k8s Secret
# and key it uses when mounting a JUPYTERHUB_API_TOKEN
# environment variable to the dask-gateway-server pod
#
# like this, users of the helm chart won't need to generate or maintain
# a jupyterhub api token that must be kept secret
jupyterhub:
  hub:
    services:
      # By having an entry for dask-gateway, the z2jh helm chart will
      # generate a jupyterhub api token and persist it in the hub k8s Secret
      dask-gateway: {}

dask-gateway:
  gateway:
    auth:
      jupyterhub:
        apiTokenFromSecretName: hub
        apiTokenFromSecretKey: hub.services.dask-gateway.apiToken

Implementation idea

Allow a k8s Secret name and key to be configured, so that the JupyterHub API token can be mounted from a custom k8s Secret under a given key.
Make the creation of the dask-gateway managed k8s Secret conditional on no custom k8s Secret name / key being provided.
Update the values schema with these settings
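
A minimal sketch of the second point, assuming the chart's Secret template looks roughly like the one below. The metadata and layout here are a guess for illustration and are not copied from the chart; apiTokenFromSecretName is the option name proposed in this issue, and the jupyterhub-api-token key matches the one referenced in the deployment template.

```yaml
{{- /*
  Hypothetical sketch: only create the chart-managed Secret when the
  jupyterhub authenticator is used, an apiToken was passed, and no
  external Secret name was configured.
*/ -}}
{{- $jh := .Values.gateway.auth.jupyterhub }}
{{- if and (eq .Values.gateway.auth.type "jupyterhub") $jh.apiToken (not $jh.apiTokenFromSecretName) }}
kind: Secret
apiVersion: v1
metadata:
  name: {{ include "dask-gateway.apiName" . }}
type: Opaque
data:
  # same key the deployment template falls back to by default
  jupyterhub-api-token: {{ $jh.apiToken | b64enc | quote }}
{{- end }}
```

Guarding on apiToken as well avoids passing an empty value to b64enc when the user relies on an external Secret and leaves apiToken unset.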

jcrist commented 2 years ago

Thanks for opening this @consideRatio! That all makes sense to me.

martindurant commented 2 years ago

I agree. Do you have a kube dev setup to test on? I've gone through this process recently.

consideRatio commented 2 years ago

@jcrist @martindurant thanks for the lightning fast feedback :D

I figure I'll test this in production on a smaller research hub, and by doing so, also test if the most recent version of the Helm chart works with freshly built images etc - something I said I'd do a while back but never did =/

martindurant commented 2 years ago

I have a locally built image (same image controller/api and scheduler/workers) that seems to work, I can push it somewhere

consideRatio commented 2 years ago

@martindurant oh thanks - yes please!

martindurant commented 2 years ago

on dockerhub mdurant/gateway:v1.0.0 based on edited version of "example" Dockerfile in this repo. Same image for api/controller and for scheduler/workers.

Note that this uses root user, good enough for testing. Runs in conda base env.

Versions:

python 3.9.5
dask/distr 2022.1.1
msgpack-python 1.0.2
gateway/server local tree (shows up as 0.9.0, but isn't)
tornado 6.1

consideRatio commented 2 years ago

PR opened in #612 with implementation to close this issue!
