kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

ClusterClass and managed topologies #4430

Closed vincepri closed 3 years ago

vincepri commented 3 years ago

In the past few months we've been talking at community meetings about the possibility of adding two new concepts to our APIs: ClusterClass and a more useful Cluster object that lets folks describe a fully functional cluster.

ClusterClass

A ClusterClass CRD would be introduced to provide easy stamping of clusters of similar shapes. The early version of this class would contain an immutable set of references to template objects, for example:

apiVersion: cluster.x-k8s.io/v1alpha4
kind: ClusterClass
metadata:
  name: example-1
spec:
  infrastructure:
    clusterRef:
  controlPlaneRef:
  machines:
    <name of the role>: reference # This might be unnecessary; we'd have to see whether we want to include it or not.

The above is mostly an example; the details of how the reference system should work are left to the proposal.

Cluster

The Cluster object has been used to this day mostly to tie things together in a logical composition of resources. In this proposal we want to improve the day-zero user experience by providing new capabilities as part of the Cluster object. More details are left to the proposal itself, although we'd like to allow a Cluster to be created with a resource that looks as simple as the example below:

kind: Cluster
metadata:
  name: azure-1
  namespace: eng
spec:
  class: "dev-azure-simple-small"
  version: v1.19.1
  managed:
    controlPlane:
      replicas: 3
    workers:
      - role: Worker1
        replicas: 10

In the example above, the Cluster controller is expected to take all the information present in the associated ClusterClass (referenced in spec.class) and create managed resources: a control plane with 3 replicas and one pool of worker nodes with 10 replicas.

The other important bit here is the spec.version field. For the first iteration, this new field manages the Kubernetes version for a managed cluster topology. When the version changes, for example in the event of an upgrade, the controller is in charge of declaring the new version for both the control plane and the worker pools.
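To make the upgrade flow concrete, here is a rough sketch of what an upgrade would look like under this model (a hedged illustration, not part of the proposal text; the target version is made up): the user bumps spec.version and nothing else, and the controller rolls the new version out to the managed control plane and worker pools.

# Hedged sketch: upgrading the managed topology by changing only spec.version.
# The target version v1.20.1 is illustrative; everything else is unchanged from the example above.
kind: Cluster
metadata:
  name: azure-1
  namespace: eng
spec:
  class: "dev-azure-simple-small"
  version: v1.20.1   # was v1.19.1; the controller propagates this to the control plane and worker pools
  managed:
    controlPlane:
      replicas: 3
    workers:
      - role: Worker1
        replicas: 10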

/kind proposal /area api /milestone v0.4

vincepri commented 3 years ago

/cc @CecileRobertMichon @detiber @fabriziopandini @timothysc @srm09

timothysc commented 3 years ago

I'm not hip with the latest on the cluster autoscaler, but it would seem natural to me to declare intent and limits at this level. Feel free to argue with me, I'm just spitballing.

vivgoyal commented 3 years ago

Couple of questions here:

  1. What is meant by clusters of similar shapes? If it is what I think it is, wouldn't it be provider specific?
  2. What value does adding the extra fields to the Cluster CRD add here? It pretty much seems like we are moving a few fields from the Machine CRD to the Cluster CRD, or maybe even just copying them.

Overall this looks like a good thing to do from a consumer's perspective.

srm09 commented 3 years ago

The primary idea behind this proposal is to improve the UX and the reusability aspect of cluster creation. Currently the Cluster CRD carries too much detail about the control plane and infrastructure definition for the cluster, so it cannot be reused to create multiple clusters of the same type (which is basically what is meant by creating clusters of similar shapes). To enable this use case, ClusterClass would be a construct that can be defined once and used multiple times to create similar clusters with different Kubernetes versions and/or different control plane/worker node setups.

On the UX front, the Cluster object can be used to answer the simpler/more frequent questions, such as the size of the cluster, the Kubernetes version, and so on, which are harder to answer at this point in time.
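To illustrate the reuse aspect, here is a minimal sketch (the class name, versions, and replica counts are placeholders) of two Cluster objects stamped from the same ClusterClass, differing only in Kubernetes version and size:

# Hedged sketch: two clusters created from the same ClusterClass.
kind: Cluster
metadata:
  name: azure-1
spec:
  class: "dev-azure-simple-small"
  version: v1.19.1
  managed:
    controlPlane:
      replicas: 1
    workers:
      - role: Worker1
        replicas: 3
---
kind: Cluster
metadata:
  name: azure-2
spec:
  class: "dev-azure-simple-small"
  version: v1.20.0
  managed:
    controlPlane:
      replicas: 3
    workers:
      - role: Worker1
        replicas: 10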

nickperry commented 3 years ago

As an end-user, I think this looks like it could be helpful for us. We would need each role to be able to have its own infrastructure machine template and KubeadmConfigTemplate.

frapposelli commented 3 years ago

I like the idea proposed, looking forward to seeing the proposal 👍🏻

One thing that comes to mind is whether the version: and class: properties in the revised Cluster object should be scoped under the controlPlane and each workers section.

timothysc commented 3 years ago

One thing that comes to mind is whether the version: and class: properties in the revised Cluster object should be scoped under the controlPlane and each workers section.

I'd expect at this level of consumption the user should not care. They simply want to change the version and the cluster just upgrades.

vincepri commented 3 years ago

I'd expect at this level of consumption the user should not care. They simply want to change the version and the cluster just upgrades.

+1 — The expectation is that MachineDeployment classes are already there in the form of infra and bootstrap templates. The version field instead is a way to declare the Kubernetes version for the whole cluster managed topology.

shysank commented 3 years ago

Is ClusterClass itself immutable? As in, if I want to change a parameter in a ClusterClass, should I just clone a new one and update that parameter? nvm, just re-read it: it's immutable.

vincepri commented 3 years ago

That was my initial thinking, at least to start with. As soon as we get into mutable territory, things get much more complicated. Each template reference would have to stay the same, although folks might want to mutate the templates themselves (which could be the next logical step right after the initial implementation).

yastij commented 3 years ago

I think that from a UX perspective there are two things we might want to figure out:

CecileRobertMichon commented 3 years ago

Does this proposal also aim to solve https://github.com/kubernetes-sigs/cluster-api/issues/3203? If so, how would we handle updates to the OS image in the InfraMachine template required for k8s version upgrades in a lot of scenarios?

cc @fiunchinho

elmiko commented 3 years ago

I'm not hip with the latest on the cluster autoscaler, but it would seem natural to me to declare intent and limits at this level. Feel free to argue with me, I'm just spitballing.

assuming you are talking about the total cluster limits (eg mem, cpu, nodes, etc), i agree and i think it pushes us towards the notion of having some sort of controller (or similar) that could automate deployment of the cluster autoscaler.

srm09 commented 3 years ago

Following the discussion we had during office hours today, handling the version directly in the Cluster object would greatly improve the UX around handling cluster Kubernetes version upgrades. To enable that, we would need to pass a template to be used by the KCP (for the control plane nodes) and the MachineDeployments (for the various worker nodes).

cc: @CecileRobertMichon That should solve the issue of updates to OS images during k8s version upgrades

srm09 commented 3 years ago

/assign /lifecycle active

frapposelli commented 3 years ago

I'd expect at this level of consumption the user should not care. They simply want to change the version and the cluster just upgrades.

+1 — The expectation is that MachineDeployment classes are already there in the form of infra and bootstrap templates. The version field instead is a way to declare the Kubernetes version for the whole cluster managed topology.

Makes sense for version, even though you might want to handle patch rollouts in a staggered fashion for each set of workers (like in large clusters). I understand that it's not a common use case and doesn't need to be served by this approach.

With regards to class, I'm particularly thinking about situations where you might have a heterogeneous set of workers (like Linux workers + Windows workers), which will probably require a different set of infra and bootstrap templates. But maybe I'm overthinking it.

srm09 commented 3 years ago

apiVersion: cluster.x-k8s.io/v1alpha4
kind: ClusterClass
metadata:
  name: example-1
  namespace: blah
spec:
  infrastructure:
    clusterRef:
    workerNodeTypes:
      - type: worker-linux
        # optional
        infrastructureRef: linux-template-1
      - type: worker-windows
        # optional
        infrastructureRef: windows-template-1
  controlPlaneRef:
    # optional
    infrastructureRef: linux-template-cp

Cluster would look like:

kind: Cluster
metadata:
  name: azure-1
  namespace: eng
spec:
  class:
    name: "example-1"
    namespace: "blah"
    version: v1.19.1
    template: linux-template-1
  managed:
    controlPlane:
      replicas: 3
    workers:
      - role: worker-linux
        replicas: 10
      - role: worker-windows
        replicas: 10
        template: windows-template-1

The idea here is that spec.class.template will be used as the infra template when nothing is specified.

frapposelli commented 3 years ago

Thanks @srm09 that makes a lot of sense

sbueringer commented 3 years ago

@srm09 When is spec.infrastructure.workerNodeTypes[].infrastructureRef used in your example? It sounded like the template is either explicitly specified in the workers array in the Cluster resource, or spec.class.template is used?

But I think it's okay for now. The idea overall is clear and we can specify the details in a proposal.

enxebre commented 3 years ago

This is interesting. Some questions to better understand the goals and benefits: What's the additional value of a clusterClass over what we have abstracted already with the infraTemplate? What's the additional value of having a new abstraction over having new fields/controllers, e.g. in the existing cluster resource? How is having the proposed semantics for the cluster topology, e.g.

    managed:
      controlPlane:
        replicas: 3
      workers:
        - role: Worker1
          replicas: 10

related to extracting the infra and controlPlane ref into a different CRD?

fiunchinho commented 3 years ago

I think having something like the ClusterClass makes sense, as it helps to manage a fleet of clusters.

We have a somewhat similar approach within our company, based on a custom CRD that holds the component versions (k8s/etcd/OS).

We create a CR for a specific setup (k8s/etcd/OS versions) and link it to a Cluster using an annotation. The annotation holds the name of the CR. A change to the annotation to use a different CR may trigger an upgrade of the Cluster (e.g. if the k8s version needs to change), thanks to a custom controller watching Clusters and reconciling those component versions.
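A rough sketch of that pattern, with an entirely hypothetical annotation key and release CR name, just to show how the linking works:

# Hedged sketch of the annotation-based linking described above; the annotation key and
# release CR name are hypothetical and not part of the Cluster API.
apiVersion: cluster.x-k8s.io/v1alpha4
kind: Cluster
metadata:
  name: azure-1
  namespace: eng
  annotations:
    release.example.com/name: "v1.19.1-patch2"   # name of the CR holding the k8s/etcd/OS versions
spec:
  # ...usual Cluster fields referencing infrastructure and control plane...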

From this proposal I'm not sure about the Cluster.spec.managed or Cluster.spec.version fields. I think it makes sense to either specify a class or use the current normal Cluster fields referencing templates, but I'm not sure about setting both Cluster.spec.version and Cluster.spec.class, where the class references templates that specify the k8s version too.

srm09 commented 3 years ago

@enxebre

What's the additional value of a clusterClass over what we have abstracted already with the infraTemplate?

The idea behind ClusterClass is to provide a single template outlining the bits that would be used to create the cluster. The same template can then be used to create multiple clusters with different numbers of CP and worker nodes.

How is having the proposed semantics for the cluster topology related to extracting the infra and controlPlane ref into a different CRD?

I do not quite understand this question. I am happy to chat more about it on Slack or here.

srm09 commented 3 years ago

@fiunchinho The idea behind adding a k8s version in spec.class is to provide a knob which will override the k8s version specified on the KCP and on the MachineDeployments controlling the worker node pools in the cluster. For each entry in spec.managed (CP and worker nodes alike), spec.class.version would drive the k8s version.
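A rough sketch of the intended effect, assuming the controller writes the class version through to the objects it manages (the object names here are hypothetical; version and replicas are existing fields on KubeadmControlPlane and MachineDeployment):

# Hedged sketch: objects the topology controller would keep in sync with spec.class.version.
apiVersion: controlplane.cluster.x-k8s.io/v1alpha4
kind: KubeadmControlPlane
metadata:
  name: azure-1-control-plane
spec:
  replicas: 3
  version: v1.19.1          # driven by the Cluster's spec.class.version
---
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
  name: azure-1-worker-linux
spec:
  replicas: 10
  template:
    spec:
      version: v1.19.1      # driven by the Cluster's spec.class.version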

detiber commented 3 years ago

I'd really like to see the detailed proposal before I provide too much feedback, but one initial concern I have is that there doesn't seem to be support at the Cluster level for an upgrade similar to what KubeadmControlPlane provides with UpgradeAfter. There are cases where new base images are published under the same default lookup scheme for a given infrastructure provider, providing a newer patched version of the base OS or Kubernetes dependencies without the user explicitly overriding the full image name (this is currently possible with the AWS provider today).

There was a discussion about adding similar support to MachineDeployments/MachineSets/MachinePools, it would be great to see it covered in this proposal as well.
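For reference, a minimal sketch of the existing KubeadmControlPlane knob being discussed (upgradeAfter as in the v1alpha3 API; the timestamp and object name are illustrative, and other required KCP fields are omitted):

# Hedged sketch: forcing a control plane rollout even when the spec is otherwise unchanged,
# e.g. to pick up newly published base images under the default image lookup scheme.
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: azure-1-control-plane
spec:
  replicas: 3
  version: v1.19.1
  upgradeAfter: "2021-04-15T00:00:00Z"   # machines created before this timestamp get rolled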

vincepri commented 3 years ago

What's the additional value of a clusterClass over what we have abstracted already with the infraTemplate?

ClusterClass abstracts away not only Machines, but also control plane configuration and cluster infrastructure. We don't have support for that today.

What's the additional value of having a new abstraction over having new fields/controllers, e.g. in the existing cluster resource?

Apologies if I didn't understand the question. The ClusterClass abstraction by itself doesn't really provide much value. The overall arc of this issue/proposal is to ease getting started / day-0 operations and get to a Kubernetes cluster by creating the smallest amount of configuration possible.

If infrastructure providers can provide custom ClusterClass(es) in an inventory of some sort, clusterctl (or even better, the operator) can prefill some default classes for users to get started. A new user would only have to create the first Cluster object (not counting the prerequisites here, like setting up cloud provider credentials), and the Cluster controller would then take care of creating the rest.

It's a stepping stone which is ultimately going to guide users to learn the more advanced Cluster API objects over time, rather than overwhelming new users from the start.

From this proposal I'm not sure about the Cluster.spec.managed or Cluster.spec.version fields. I think it makes sense to either specify a class or use the current normal Cluster fields referencing to templates, but I'm not sure about both setting Cluster.spec.version and Cluster.spec.class which references templates that specify the k8s version too.

The top-level Kubernetes version would take precedence over all the inner version fields; the objects would be set or updated appropriately.

There was a discussion about adding similar support to MachineDeployments/MachineSets/MachinePools, it would be great to see it covered in this proposal as well.

Huge +1. Potentially we should still add it to each individual object (and maybe call it RolloutAfter rather than UpgradeAfter), but that could be tackled in a separate issue.

smcaine commented 3 years ago

Would this issue also be tackling the lifecycle process mentioned above in #3203? This would be a big step forward for us: being able to manage cluster lifecycle declaratively in git and have an operator manage the order in which KCP/MachineDeployments are upgraded, for example. Also, having some way to update machine templates without having to create new ones and then clean up after they are updated would be a big plus too.

vincepri commented 3 years ago

@smcaine Yes, in some ways it does, but only for the managed topologies (control plane + workers defined as part of the Cluster object). If users create Machine, MachineSet, or MachineDeployment objects outside of that, we won't be upgrading those versions.