Open cromefire opened 5 years ago
No disagreement from me. The only thing I see as a potential issue is that every new thing we add like this (resources, capabilities, plugins, extensions) do make the system more complex. Adding in things like this has to be done carefully and with a mind to future maintenance costs. With that in mind any contribution that added something like this would be great to see.
Similar to:
I like capabilities
as it's general enough to include any type of capability/feature. Another option would be tags
but I think capabilities
is a bit more descriptive
The only thing I see as a potential issue is that every new thing we add like this (resources, capabilities, plugins, extensions) do make the system more complex.
Maybe this can be partially prevented, because you can already filter by worker, so maybe there could just be some kind of preprocessing that automatically filters workers. Writing that filter code should really easy, I can do that If you wish, it shouldn't be more than 10-20 lines
There's work going on in https://github.com/dask/distributed/pull/2675 which may help enable this use-case.
I'm very interested in having this capability (pun intended! 😆 ) so you (@cromefire) might be interested to see if the new design will work to enable this use-case
@guillaumeeb - you might be interested in this issue as you seemed to like my idea of worker "pools"
If capabilities
were added you might not need the concept of worker pools as you could filter the workers by capability
directly rather than pool membership - e.g.
>>> pool_specs = {
... 'gpu': {
... 'worker': {
... "cls": Worker, "options": {"ncores": 1}},
... "capabilities": {"gpu": True}
... },
... 'cpu': {
... 'worker': {"cls": Worker, "options": {"ncores": 1}},
... },
... }
>>> worker_specs = {
... 'worker1': {
... "pool": "gpu",
... "capabilities": { # extra capabilities
... "memory": "8GB",
... },
... },
... 'worker2': {
... "pool": "cpu",
... "capabilities": { # extra capabilities
... "memory": "16GB",
... },
... },
... }
>>> cluster = SpecCluster(workers=worker_specs, pools=pool_specs)
e.g. the below two calls would run on the same set of workers:
>>> res = cluster.submit(func, *args, capabilities={"memory": "16GB"})
>>> res = cluster.submit(func, *args, pool="cpu")
...so the concept of a "pool" is just to specify a certain set of (minimum?) capabilities. As such I think it adds value (usability) to the underlying concept of capabilities
.
In the workser_spec
above I've allow for the possibility for a worker to belong to a specific pool but to also provide capabilities over and above those required by the pool.
>>> worker_specs = { ... 'worker1': { ... "pool": "gpu", ... "capabilities": { # extra capabilities ... "memory": "8GB", ... }, ... }, ... 'worker2': { ... "pool": "cpu", ... "capabilities": { # extra capabilities ... "memory": "16GB", ... }, ... }, ... }
It seems like it's determined by the scheduler in that case, am I right? In my case the worker determines the capabilities by it self
But the concept is good
It would be really nice if you could add something like capabilities/labels (however you want to call them) to
dask
It would be something like:
Where you could schedule a task like:
So that certain tasks only get scheduled on certain workers, which support that kind of task
Additionally another feature to consider would be enum-like capabilities:
With a python equivalent of:
The current equivalent of capabilities would be something like: