Kostusas opened this issue 3 years ago
I think we should change the heuristic for determining how many workers we have available by checking the client configuration and scaling strategies.
You are running into this error: https://github.com/python-adaptive/adaptive/blob/f28bab073fed8723b0569fcfb6886fccc2133ecd/adaptive/runner.py#L403-L404
because you start with 0 cores.
If you change your argument from `minimum=0` to `minimum=1`, Adaptive does detect the scaling correctly.
Would this be good enough for you?
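To make the failure mode concrete, here is a simplified sketch of the guard linked above (the function name and error message are illustrative; the real code in `adaptive/runner.py` inspects the executor):

```python
# Simplified sketch of the check in adaptive/runner.py: if the executor
# reports zero cores, the runner raises instead of starting.
def get_ncores(ncores_reported: int) -> int:
    if ncores_reported == 0:
        raise RuntimeError("executor reports 0 cores")
    return ncores_reported

# With cluster.adapt(minimum=0) the client starts with no workers,
# so the detected core count is 0 and the runner raises:
try:
    get_ncores(0)
except RuntimeError as e:
    print("raised:", e)

# With minimum=1 at least one worker exists from the start:
print(get_ncores(1))  # 1
```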
This seems to be a workaround, but I think actually detecting the configuration would be more reliable. Unfortunately I can't quite find the correct API in distributed.
I've asked whether there's a better way on Stack Overflow (AFAIR that's the preferred channel for dask): https://stackoverflow.com/q/69326568/2217463
Why would the maximum number of cores matter instead of the currently available cores?
It's a chicken-and-egg problem otherwise: dask's adaptive scaling won't request new workers if there are no tasks in the queue, and no tasks are submitted while there are no workers.
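The deadlock can be seen in a toy model (the function names and the one-task-per-worker rule here are illustrative, not dask's actual API):

```python
# Toy model of the chicken-and-egg problem: dask's adaptive scaling
# grants workers based on queued tasks, while the runner submits tasks
# based on available workers.
def scale(queued_tasks: int, minimum: int) -> int:
    """Workers granted by adaptive scaling: at least `minimum`,
    more if there is demand."""
    return max(minimum, queued_tasks)

def submit(nworkers: int) -> int:
    """Tasks the runner submits: one point per available worker."""
    return nworkers

# minimum=0: zero workers -> zero tasks -> zero workers, forever.
print(submit(scale(queued_tasks=0, minimum=0)))  # 0

# minimum=1: a single worker bootstraps the loop; once tasks queue up,
# adaptive scaling can grow the pool further.
print(submit(scale(queued_tasks=0, minimum=1)))  # 1
```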
Hmm, but then we would already query some points that won't be computed yet.
Why not change the following

to

```python
elif with_distributed and isinstance(ex, distributed.cfexecutor.ClientExecutor):
    ncores = sum(ex._client.ncores().values())
    return max(1, ncores)
```
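A self-contained sketch of that change, with a stand-in for the dask client (`detect_ncores` and `FakeClient` are hypothetical names for illustration; the real code lives in `adaptive/runner.py` and calls `distributed.Client.ncores()`):

```python
def detect_ncores(client) -> int:
    """Sum the cores of all workers currently known to the client,
    falling back to 1 so the runner never sees 0 workers."""
    # Client.ncores() maps worker address -> core count, e.g.
    # {"tcp://10.0.0.1:1234": 4, "tcp://10.0.0.2:1234": 2}
    return max(1, sum(client.ncores().values()))

class FakeClient:
    """Stand-in for distributed.Client, for illustration only."""
    def __init__(self, workers):
        self._workers = workers

    def ncores(self):
        return dict(self._workers)

print(detect_ncores(FakeClient({"tcp://w1": 4, "tcp://w2": 2})))  # 6
print(detect_ncores(FakeClient({})))  # 1 (no workers connected yet)
```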
Minimal code to reproduce the error in a local Jupyter notebook:
returns the error:
The same thing happens when running on a cluster with manual scaling, if the workers aren't given enough time to connect. It seems Adaptive does not see any workers and terminates the process.