Thanks @alisterburt for this proposal.
I went through the discussion in https://github.com/dask-contrib/dask-ctl/issues/61, and I'm not sure of the outcome. @jacobtomlinson do you think the changes here are necessary, or should we wait for a PR in dask-ctl?
Thank you for taking the time to go through the PR/discussion @guillaumeeb!
I think the correct answer here depends on how you think dask-jobqueue users should ideally configure their clusters moving forwards. Is that through a jobqueue config or a dask-ctl config? There is a valid argument against having both, but...
I see the two PRs (this and the future one in dask-ctl) as orthogonal and complementary, unless there is an effort to move to using dask-ctl as the only way to create/manage clusters in the dask ecosystem.
Fixed the silly test failure, but the workflow requires re-approval from a maintainer 🙂
@guillaumeeb @jacobtomlinson thank you both for the reviews!
@guillaumeeb as there currently is no consensus around whether this PR is wanted, I will hold off on updating for now.
@jacobtomlinson I am completely on board with the centralisation of cluster lifecycle management in dask-ctl; one of the main reasons for this PR was to simplify the implementation of dask-ctl discovery in dask-jobqueue in a followup PR 🙂
The problem this PR solves is "How can admins preconfigure user environments with some cluster templates?". The discussion in dask-ctl solves the same problem but for everyone.
The goal of dask-ctl is to be a control plane for all Dask clusters. It won't be the only way to do things, but it will be the most consistent and convenient way to do cluster lifecycle things.
I expect in the future that users can still use the individual libraries directly to "pull the cables and move the flaps" if they want to.
I think the main problem this PR solves is actually "how can users preconfigure multiple clusters of the same type without leaving jobqueue?". This is currently not possible using the jobqueue config and adding dask-ctl into the mix is a good (i.e. not ad-hoc) solution. I think not being able to do this is a deficiency of the current jobqueue config that should be solved in jobqueue.
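For context, a minimal sketch of how the existing jobqueue config is laid out (the values here are only illustrative): each cluster type gets exactly one block of defaults, so there is nowhere to put a second cluster of the same type.

# ~/.config/dask/jobqueue.yaml
jobqueue:
  pbs:               # one block of default kwargs per cluster type
    queue: regular
    cores: 24
    memory: 100GB
# there is no way to add a second PBS entry with, say, queue: gpu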
I also think cluster creation will always be simplest by using the library directly. The value in dask-ctl is value add things like discovery, listing, aggregating, templating and single-pane-of-glass views.
My goal is to try and push as much cluster lifecycle logic as possible into dask-ctl so that it is all in one place and is easily reusable between subprojects. I dislike the idea of individual deployment projects implementing their own solutions to this problem, and other problems like it, rather than collaborating on a shared tool like dask-ctl.
We are on a journey with dask-ctl where hopefully one day it will be upstreamed to distributed or made directly available via the dask namespace like dask.distributed is today. But to get there we need folks to use it rather than working independently.
This is exciting! It feels like clearly delineating what dask-ctl and dask-foo should be responsible for (in the ideal case) moving forwards would be useful here. I'm still a bit unsure
My question to @alisterburt is what can we do to make dask-ctl better and reduce friction there? For example I totally sympathise with your point about the use of the keywords args and kwargs in the cluster spec. Maybe we want to change that to something easier to understand in a v2 spec? (this is why I versioned the spec, so we can iterate nicely)
For my own personal use dask-ctl is working wonderfully, I'm happy creating clusters from yaml paths for now and will be happier creating from names once we get dask-contrib/dask-ctl#61 going. As a developer making things for non-python folks most likely working in HPC, I really like the idea of developing against dask-ctl and creating/discovering dask-foo clusters from there.
This is how I imagine the top of programs I will provide to people
from dask_ctl import get_cluster, create_cluster
# cluster_name comes in from the outside world
try:
    cluster = get_cluster(cluster_name)
except NameError:
    cluster = create_cluster(cluster_name)
The friction for my use case comes in when asking users to configure a cluster and they need jobqueue and ctl rather than just jobqueue - I'm optimising to keep the set of things users interact with as minimal as possible
Overall I am happy to go either way in this repo and will PR to dask-ctl soon regardless, but I do think there is merit to including this PR in jobqueue 🙂
I think not being able to do this is a deficiency of the current jobqueue config that should be solved in jobqueue.
I think the core of my comments is that creating cluster templates is out of scope for dask-{jobqueue,kubernetes,cloudprovider,yarn,etc} and we should be doing this in dask-ctl. To quote the Zen of Python: "There should be one– and preferably only one –obvious way to do it."
one of the main reasons for this PR was to simplify the implementation of dask-ctl discovery in dask-jobqueue
I haven't reviewed this PR line by line yet as I wanted to have the high level discussion first. However, at a quick glance I don't think this satisfies the dask-ctl expectations for from_name. Two (or more) processes should both be able to call dask_jobqueue.SLURMCluster.from_name("awesomecluster") and all get a pointer to exactly the same cluster. My reading of this PR would be that a new cluster would be spawned each time.
In this PR, name seems to be a template name. In a dask-ctl context, name is the unique name of a cluster instance.
This is how I imagine the top of programs I will provide to people
That looks like a nice example. In dask-kubernetes we do a check for an existing cluster with that name before trying to create a new one, so the code example is further simplified. It would be nice to do this in more places.
from dask_ctl import create_cluster
cluster = create_cluster("/path/to/spec.yaml") # If 'foo' exists it connects to it, if not it creates it
# /path/to/spec.yaml
version: 1
module: "dask_kubernetes.operator"
class: "KubeCluster"
kwargs:
  name: "foo"
Most people only want a client anyway, so a one-liner would be
client = create_cluster("/path/to/spec.yaml").get_client()
# They can still access the cluster object at 'client.cluster' if they want to call 'client.cluster.scale(10)' or something
Here are some examples of how you can use KubeCluster.
from dask_kubernetes.operator import KubeCluster
cluster = KubeCluster(name="foo") # If 'foo' exists it connects to it, if not it creates it (most users want this)
# Users who don't want this can control creation/connection
cluster = KubeCluster.from_name("foo") # If 'foo' exists it connects to it, if not it raises an exception (this meets dask-ctl expectations)
cluster = KubeCluster(name="foo", create_mode="CONNECT_ONLY") # Equivalent to 'from_name'
cluster = KubeCluster(name="foo", create_mode="CREATE_ONLY") # If 'foo' exists it raises an exception, if not it creates it
The friction for my use case comes in when asking users to configure a cluster and they need jobqueue and ctl rather than just jobqueue
I totally understand, ultimately these are just all bits of Dask but we avoid installing them all by default to reduce bloat in the core package. But I understand that separate namespaces can add friction. Do you see this problem being more of an install-time issue or import-time issue? Would making dask-ctl a dependency of dask-jobqueue resolve the install-time friction?
I think the core of my comments are that I think creating cluster templates is out of scope for dask-{jobqueue,kubernetes,cloudprovider,yarn,etc} and we should be doing this in dask-ctl.
To quote the Zen of Python: "There should be one– and preferably only one –obvious way to do it."
Cool! Thank you for being patient, this is clear guidance I can get behind 🙂 let's close this PR and push for realising the ideal ecosystem then.
Two (or more) processes should both be able to call dask_jobqueue.SLURMCluster.from_name("awesomecluster") and all get a pointer to exactly the same cluster. My reading of this PR would be that a new cluster would be spawned each time.
Your reading is correct, and I think my understanding of Dask here is lacking. If I had multiple users in an HPC environment, what are the advantages/disadvantages of having them connect to the same Client/Cluster vs creating their own? Naively I thought we would want users to run their own for resiliency (other users can't crash my thing) and to avoid overwhelming the task graph if there are multiple heavy users.
Thanks for the dask-kubernetes example - that's clean and flexible!
I understand that separate namespaces can add friction. Do you see this problem being more of an install-time issue or import-time issue? Would making dask-ctl a dependency of dask-jobqueue resolve the install-time friction?
Sure - I think ~all friction would be removed if dask-ctl were bundled with jobqueue/dask itself at install time and the jobqueue documentation used dask-ctl for cluster templates rather than its own config. @guillaumeeb would you/the other maintainers here accept PRs in this direction?
what are the advantages/disadvantages of having them connect to the same Client/Cluster vs creating their own
It's less about multiple users sharing a single cluster and more about multiple processes sharing a cluster (often sequentially, not at the same time). Workflow managers like Airflow have multiple stages in a pipeline which are each launched as a separate process or allocation with various levels of fan-in and fan-out. It's nice to have the first stage in your pipeline create a Dask cluster and each stage after that reuse the Dask resource and the final stage delete the Dask cluster.
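Sketching that pattern concretely: the cluster name "pipeline" and the spec path below are made up, and this assumes the spec file gives the cluster that name and that the dask_ctl helpers behave as in the example earlier in this thread.

from dask_ctl import create_cluster, get_cluster

# First pipeline stage: create the cluster from a spec file
cluster = create_cluster("/path/to/spec.yaml")

# Later stages (separate processes): reconnect to the same cluster by name and reuse it
cluster = get_cluster("pipeline")
client = cluster.get_client()

# Final stage: tear the shared cluster down
get_cluster("pipeline").close()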
It's also very helpful in some failure modes that are not common on HPC. HPC allocations have a wall time, so if you use dask-jobqueue to create a cluster and the whole thing blows up (and the Python process dies without calling the finalizers via a SIGKILL), the Dask cluster will eventually be reaped by the resource manager, not much harm done. Kubernetes and Cloud don't have any concept of wall time, so if you lose the Python process that created your cluster the cleanup is very manual and error-prone. Failures there can be costly if you can't quickly identify left-over Dask clusters and quickly delete them.
With the new KubeCluster implementation I mentioned before, you can quickly create a new object that points to your cluster and then call cluster.close() to clean up. This is also a core purpose of dask-ctl: being able to run dask cluster list and dask cluster delete foo is very powerful. Both of these commands depend on being able to recreate a cluster manager object to interact with the cluster.
if dask-ctl were bundled with jobqueue/dask itself at install time and the jobqueue documentation used dask-ctl for cluster templates rather than its own config
I would love to see this. What do you think @guillaumeeb?
You're a star @jacobtomlinson - I learned a bunch and the motivation behind centralising this management is now crystal clear, thanks for writing this up!
if dask-ctl were bundled with jobqueue/dask itself at install time and the jobqueue documentation used dask-ctl for cluster templates rather than its own config
I would love to see this. What do you think @guillaumeeb?
Well, wow, that's a pretty complete and detailed discussion here. I have to admit I didn't spend a lot of time looking at dask-ctl... so my vision here is not yet broad enough. As first thoughts:

- dask-ctl bundled with dask-jobqueue: I think you really well justified the ultimate goal there.
- Using dask-ctl for dask-jobqueue cluster management here, and why it can be useful.
- What about the dask-jobqueue yaml configuration? Should we drop (part of?) the underlying code at some point?

Should we drop (part of?) the underlying code at some point?
I'm not suggesting removing any of the existing configuration in dask-jobqueue. All the configuration does today is set the default kwargs for the HPCCluster objects, which totally makes sense.
My comment was more that the config here shouldn't be expanded to include lifecycle things like cluster templates. Instead we should work together to do this in dask-ctl so that all dask-foo projects can benefit from shared functionality.
Ok, perfectly fine for me!
Discussed in #543
This PR allows custom clusters of any type to be specified in the jobqueue yaml.
Below is an example config file with a second PBS cluster which uses a different queue called 'gpu'.
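The example file itself was not captured above; the following is only a guessed reconstruction of what such a config might look like, where the entry name 'pbs-gpu' and the 'job-scheduling-system' key are assumptions for illustration rather than taken from the PR.

# jobqueue.yaml (hypothetical sketch)
jobqueue:
  pbs-gpu:                      # custom, named cluster entry
    job-scheduling-system: pbs  # which cluster type to build (key name assumed)
    queue: gpu                  # the different queue mentioned above
    cores: 24
    memory: 100GB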
This can be instantiated with JobQueueCluster.from_name(). This also works directly with the standard job scheduling systems.
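A usage sketch, assuming the hypothetical 'pbs-gpu' entry from the reconstruction above:

from dask_jobqueue import JobQueueCluster

# Build the cluster described by the named config entry
cluster = JobQueueCluster.from_name("pbs-gpu")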
The main reasons for this PR are:

1. #543 was closed because the features provided here can be made available through a dask-ctl specific config yaml. I think this PR is still necessary because implementing autodiscovery of dask-jobqueue clusters in dask-ctl requires some way of 'discovering' possible clusters. With this PR, we can simply iterate over the keys in the jobqueue config and instantiate the specified clusters using JobQueueCluster.from_name() (a rough sketch follows this list).
2. Additionally, the config in dask-jobqueue is less Python specific than the dask-ctl yaml spec - this may be less of a burden for configuration by HPC admins who are not experts in Python.
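A rough sketch of that discovery idea; the function name and the exact shape dask-ctl expects from discovery hooks are assumptions here, not taken from this PR.

import dask.config

async def discover():
    # Yield a (cluster name, cluster manager class) pair for every named
    # entry in the jobqueue config, so dask-ctl can enumerate them.
    from dask_jobqueue import JobQueueCluster
    for name in dask.config.get("jobqueue", default={}):
        yield name, JobQueueCluster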
#95 is similar functionality but implemented at the cluster class level, so it does not solve point 1 above.
cc @guillaumeeb because you were involved in earlier discussions
Thank you in advance for any time spent reviewing!