**Open** · jacobtomlinson opened 3 months ago
As a single data point, I prefer option 2. Internally, we have a `DaskClusterConfig` class with a `spec` method that just returns a dictionary in the format that `custom_cluster_spec` expects.

This is nice because it lets you configure your cluster in an object-oriented way instead of manipulating a complex, nested dictionary. It provides sensible defaults, but you can easily override them. Everything is a property, so you can instantiate first and then set specific properties later. It pretty-prints your config and even creates an interactive ipywidget when executed within a Jupyter notebook. It was inspired by Dask Gateway's `Options` class.
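That internal class isn't public, but a minimal sketch of the idea might look like this (the names, defaults, and spec layout below are illustrative assumptions, not the real implementation):

```python
from dataclasses import dataclass


@dataclass
class DaskClusterConfig:
    # Illustrative defaults; the real internal class is not public.
    name: str = "my-dask-cluster"
    image: str = "ghcr.io/dask/dask:latest"
    n_workers: int = 3
    worker_memory_limit: str = "4GB"

    def spec(self) -> dict:
        # Return a dict in the shape that custom_cluster_spec expects
        # (trimmed here to just the fields this example touches).
        return {
            "apiVersion": "kubernetes.dask.org/v1",
            "kind": "DaskCluster",
            "metadata": {"name": self.name},
            "spec": {
                "worker": {
                    "replicas": self.n_workers,
                    "spec": {
                        "containers": [
                            {
                                "name": "worker",
                                "image": self.image,
                                "args": [
                                    "dask-worker",
                                    "--memory-limit",
                                    self.worker_memory_limit,
                                ],
                            }
                        ]
                    },
                }
            },
        }


# Instantiate first, set properties later, then hand the dict to KubeCluster:
config = DaskClusterConfig()
config.n_workers = 5
# KubeCluster(custom_cluster_spec=config.spec())
```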
Yeah I think I'm leaning that way too.
## Background
When creating a new `KubeCluster` the `__init__()`/`_start()` process happens in a couple of steps:

1. Generate the spec dictionary with `make_cluster_spec()`
2. Create a `DaskCluster` object (a class created by `kr8s.asyncio.objects.make_class()`) from that dictionary
3. Call `DaskCluster.create()`
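Sketched out, the flow and its customization hook look roughly like this; the nested layout of the generated dict is an assumption, trimmed down for illustration:

```python
# Step 1: make_cluster_spec() generates a nested dict, roughly shaped like
# this (a trimmed, assumed example; the real spec has many more fields):
spec = {
    "apiVersion": "kubernetes.dask.org/v1",
    "kind": "DaskCluster",
    "metadata": {"name": "custom"},
    "spec": {
        "worker": {
            "replicas": 2,
            "spec": {
                "containers": [
                    {
                        "name": "worker",
                        "image": "ghcr.io/dask/dask:latest",
                        "args": ["dask-worker"],
                    }
                ]
            },
        }
    },
}

# Customizing it today means digging through the nesting by hand:
spec["spec"]["worker"]["spec"]["containers"][0]["args"] += [
    "--memory-limit",
    "4GB",
]

# Steps 2 and 3 then happen inside KubeCluster:
# cluster = KubeCluster(custom_cluster_spec=spec)
```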
If users want to customize their cluster spec before creating it, they can call `make_cluster_spec()` directly, which takes the same kwargs as `KubeCluster` and returns the dictionary generated in step 1. They can then modify the dict before passing it into step 2 via `KubeCluster(custom_cluster_spec=...)`.

## Problem statement
When it comes to adding new kwargs to `make_cluster_spec()` we are very conservative, because it is easy for that method to become extremely overwhelming. However, as highlighted in #898, there are customizations we may want to make regularly, like setting the memory limit in the `dask worker` CLI args, which are not very ergonomic to do by modifying the dict directly. It would be nice if the object returned from `make_cluster_spec()` were easier to customize than a simple dictionary. For example:
## Option 1
Instead of returning a dict from `make_cluster_spec()` we could return the `DaskCluster` object from step 2. This object is dict-like, so any existing code that modifies the dict should work as expected (although this will need to be verified).

### Pros
The existing `DaskCluster` class is a convenient place to add methods to simplify some of the common customizations. It will be created anyway; we would just be changing the step at which the user can intervene from 1 to 2.

### Cons
The `DaskCluster` object that would be returned by `make_cluster_spec()` would be an async `kr8s` object, while `KubeCluster` is potentially being used in a sync context, and we would then be extending it with sync methods to modify things.

Returning the `DaskCluster` object leaks the implementation detail that we are using `kr8s` to interact with Kubernetes.

It also means that, in theory, a user could call `await cluster_spec.create()`, which would create the Dask cluster, but it wouldn't handle any of the port forwarding, log retrieval, etc., and couldn't be passed to `dask.distributed.Client`.

Having `KubeCluster` and `DaskCluster`, which serve different purposes but have similar names, may cause confusion for users.

## Option 2
We could introduce some new intermediate object with a name like `KubeClusterSpec`, which would effectively be a subclass of `dict` but with whatever convenience methods we want. We would return this from `make_cluster_spec()`.

We would need to ensure that `DaskCluster` can be passed a `KubeClusterSpec` instead of a `dict`. There is already support in `kr8s` for passing a few different things, so we should be able to just conform to one of those.

### Pros
Adding a new class would avoid any user confusion, as users would continue to never interact with `kr8s` resources directly.

The purpose of the new class is well scoped: it is a dictionary of the spec with some utility methods to simplify modifying the spec.
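A minimal sketch of what such a class could look like (`KubeClusterSpec` does not exist yet; the method names and spec layout here are hypothetical, not an agreed design):

```python
class KubeClusterSpec(dict):
    # Hypothetical dict subclass; method names and spec layout are
    # illustrative assumptions only.

    @property
    def _worker_container(self) -> dict:
        return self["spec"]["worker"]["spec"]["containers"][0]

    def set_worker_memory_limit(self, limit: str) -> None:
        # Convenience wrapper around the awkward nested-dict edit.
        self._worker_container["args"] += ["--memory-limit", limit]

    def add_worker_env(self, name: str, value: str) -> None:
        self._worker_container.setdefault("env", []).append(
            {"name": name, "value": value}
        )


# Because it subclasses dict, existing dict-based code keeps working:
spec = KubeClusterSpec(
    {
        "spec": {
            "worker": {
                "spec": {
                    "containers": [{"name": "worker", "args": ["dask-worker"]}]
                }
            }
        }
    }
)
spec.set_worker_memory_limit("4GB")
```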
### Cons
Introducing a new class into the chain increases complexity.