dask / dask-kubernetes

Native Kubernetes integration for Dask
https://kubernetes.dask.org
BSD 3-Clause "New" or "Revised" License

Cannot Overwrite DASK_SCHEDULER_ADDRESS in Worker env #873

Open efeboyaci opened 8 months ago

efeboyaci commented 8 months ago

Describe the issue: To use a different port or a non-default FQDN for the scheduler, we set DASK_SCHEDULER_ADDRESS in the DaskCluster's .spec.worker.spec.containers[0].env. However, the operator adds its own DASK_SCHEDULER_ADDRESS entry, so the worker deployment env ends up with two items of that name. The last one overrides the first in the pod's env, so DASK_SCHEDULER_ADDRESS cannot be changed through the dask-kubernetes objects.

Minimal Complete Verifiable Example:

from dask_kubernetes.operator import KubeCluster
cluster = KubeCluster(name="test", n_workers=1, namespace="dask-operator", env={"DASK_SCHEDULER_ADDRESS": "tcp://test-scheduler:8786"})

DaskWorkerGroup

env:
- name: DASK_SCHEDULER_ADDRESS
  value: tcp://dev-cluster-scheduler:8786

Worker Pod

env:
- name: DASK_SCHEDULER_ADDRESS
  value: tcp://dev-cluster-scheduler:8786
- name: DASK_WORKER_NAME
  value: dev-cluster-default-worker-5f4e5766de
- name: DASK_SCHEDULER_ADDRESS
  value: tcp://dev-cluster-scheduler.dask-operator.svc.cluster.local:8786
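
To make the override concrete, here is a minimal sketch that models the "last entry wins" behaviour described above, using the Worker Pod env list from this report. The helper `effective_env` is hypothetical and only simulates the resolution; it is not part of dask-kubernetes or Kubernetes.

```python
def effective_env(env_list):
    """Collapse a container env list into name -> value.

    Assumption (taken from the report above): when a name appears
    more than once, the later entry overrides the earlier one.
    """
    resolved = {}
    for item in env_list:
        resolved[item["name"]] = item["value"]
    return resolved


# Env list copied from the Worker Pod shown above.
worker_pod_env = [
    {"name": "DASK_SCHEDULER_ADDRESS", "value": "tcp://dev-cluster-scheduler:8786"},
    {"name": "DASK_WORKER_NAME", "value": "dev-cluster-default-worker-5f4e5766de"},
    {
        "name": "DASK_SCHEDULER_ADDRESS",
        "value": "tcp://dev-cluster-scheduler.dask-operator.svc.cluster.local:8786",
    },
]

# The operator-provided address, not the user-provided one, is what survives.
print(effective_env(worker_pod_env)["DASK_SCHEDULER_ADDRESS"])
```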

Anything else we need to know?:

Items in env whose names already appear in container["env"] should be removed from env before it is added to the container configuration.
Current:
https://github.com/dask/dask-kubernetes/blob/b668cc62944dc3eac225dc4c4ff01f2b3743c9da/dask_kubernetes/operator/controller/controller.py#L166-L170
https://github.com/dask/dask-kubernetes/blob/b668cc62944dc3eac225dc4c4ff01f2b3743c9da/dask_kubernetes/operator/controller/controller.py#L202-L206

Fix:

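    for container in deployment_spec["spec"]["template"]["spec"]["containers"]:
        if "env" in container:
            # Names already set on the container take precedence
            container_env_names = {env_item["name"] for env_item in container["env"]}
            # Filter into a new sequence instead of calling env.remove() while
            # iterating over env, which skips items and mutates the shared list
            container["env"].extend(
                env_item for env_item in env
                if env_item["name"] not in container_env_names
            )
        else:
            # Copy so containers do not share one list object
            container["env"] = list(env)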
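
The deduplication idea can also be tried in isolation. The following is a self-contained sketch, not the operator's actual code: `merge_env` and the sample `deployment_spec` are hypothetical, and only the nesting `spec.template.spec.containers` is taken from the snippet above. It builds a filtered list rather than removing items from `env` during iteration, so the operator's list stays unchanged between containers.

```python
import copy


def merge_env(deployment_spec, env):
    """Append operator env vars to every container, skipping any name
    the container already defines (hypothetical helper for illustration)."""
    for container in deployment_spec["spec"]["template"]["spec"]["containers"]:
        existing = {item["name"] for item in container.get("env", [])}
        # Deep-copy so containers never share the same dict objects.
        extra = [copy.deepcopy(item) for item in env if item["name"] not in existing]
        container.setdefault("env", []).extend(extra)
    return deployment_spec


# Hypothetical sample data modelled on the example in this issue.
deployment_spec = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "worker",
                        "env": [
                            {
                                "name": "DASK_SCHEDULER_ADDRESS",
                                "value": "tcp://test-scheduler:8786",
                            }
                        ],
                    },
                    {"name": "sidecar"},  # no env key at all
                ]
            }
        }
    }
}
operator_env = [
    {"name": "DASK_WORKER_NAME", "value": "test-default-worker-0"},
    {
        "name": "DASK_SCHEDULER_ADDRESS",
        "value": "tcp://test-scheduler.dask-operator.svc.cluster.local:8786",
    },
]

merged = merge_env(deployment_spec, operator_env)
containers = merged["spec"]["template"]["spec"]["containers"]
for c in containers:
    print(c["name"], [(e["name"], e["value"]) for e in c["env"]])
```

With this approach the first container keeps the user-supplied address and only gains DASK_WORKER_NAME, while the second container receives both operator entries.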

Environment: 2024.3.0 dask operator

jacobtomlinson commented 8 months ago

@jonded94 given you recently fixed a related problem in #869, do you have any interest in looking into what is going on here?