I'm using Dask to coordinate some ML jobs, which have internal parallelism. I'd like to restrict my cluster so that only one job can run on a node at a time, even though nodes have multiple CPUs. However, this is currently impossible, since the Kubernetes worker resource limit is used to customize `--nthreads`:
https://github.com/dask/helm-chart/blob/master/dask/templates/dask-worker-deployment.yaml#L37-L38
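For context, the linked template couples the worker's CPU limit directly to `--nthreads`. Roughly, it looks like this (a paraphrased sketch, not a verbatim copy of the chart at that commit):

```yaml
# dask/templates/dask-worker-deployment.yaml (paraphrased sketch)
args:
  - dask-worker
  - {{ template "dask.fullname" . }}-scheduler:{{ .Values.scheduler.servicePort }}
  {{- if .Values.worker.resources.limits }}
  - --nthreads
  # the CPU *limit* is reused verbatim as the worker's thread count
  - {{ .Values.worker.resources.limits.cpu | quote }}
  - --memory-limit
  - {{ .Values.worker.resources.limits.memory | quote }}
  {{- end }}
```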
This also prevents giving workers e.g. 0.5 or 1.5 CPUs: even though Kubernetes allows fractional CPU limits, Dask gets confused and crashes (`--nthreads` fails to parse the value as an integer).
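Concretely, a values file like the following (illustrative only) renders a worker command that dask-worker rejects on startup:

```yaml
# values.yaml (illustrative)
worker:
  resources:
    limits:
      cpu: 1.5     # legal for Kubernetes...
      memory: 6G
# ...but the template renders "--nthreads 1.5", and dask-worker
# exits because --nthreads must be an integer
```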
This is moderately esoteric, so if the answer is that I should just fork the chart and use that, I don't mind.
I don't think this is unreasonable. Could you do it with the existing behaviour as the fallback if you don't explicitly supply both limits? We would welcome a PR like that.
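A sketch of what such a PR might look like, assuming a hypothetical `worker.threads_per_worker` values key (the name and semantics are illustrative, not something the chart currently defines): prefer an explicit thread count when given, otherwise fall back to the existing limit-derived behaviour.

```yaml
# dask-worker-deployment.yaml (sketch): explicit thread count wins,
# existing behaviour is the fallback
{{- if .Values.worker.threads_per_worker }}
  - --nthreads
  - {{ .Values.worker.threads_per_worker | quote }}
{{- else if .Values.worker.resources.limits }}
  - --nthreads
  - {{ .Values.worker.resources.limits.cpu | quote }}
{{- end }}
```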