Closed: fabriziopandini closed this issue 1 month ago.
This issue is currently awaiting triage.
CAPI contributors will take a look as soon as possible, apply one of the triage/*
labels and provide further guidance.
I have been giving this some thought recently, specifically in the context of CAPO, but also with a view to how it could be implemented more generally. The two principal problems we have with the current implementation are:

1. In OpenStack specifically, a 'failure domain' can in practice be an arbitrarily complex set of configurations spanning at least compute, storage, and network. In order to use `MachineSpec.FailureDomain` we would effectively have to make it a reference to some other data structure, which dramatically increases complexity for both developers and users.
2. Because failure domains are arbitrarily complex configurations, they can change over time. There is currently no component which can recognise that a machine is no longer compliant with its failure domain and perform some remediation.
In OpenShift we have the Control Plane Machine Set operator (CPMS). This works well for us, but only because, being part of OpenShift, it can take a number of liberties which are unlikely to be acceptable in CAPI; specifically, the following are baked directly into the controller:
However, this is the extent of the provider-specific code in CPMS. It's quite a simple interface.
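For illustration only, a minimal provider hook of roughly this shape is what "a simple interface" means here; the package, interface, and method names below are hypothetical and are not the actual CPMS API. The idea is that the provider only has to extract a failure domain from an existing machine spec and inject one into a new spec.

```go
// Hypothetical sketch only: these names and signatures are illustrative and
// are not the real CPMS code. The point is how small the provider-specific
// surface can be kept.
package failuredomain

// FailureDomain is an opaque, comparable description of where a machine runs.
type FailureDomain interface {
	// Equal reports whether two failure domains are equivalent.
	Equal(other FailureDomain) bool
	// String renders the failure domain for logging and events.
	String() string
}

// Provider is the only provider-specific piece: it maps a provider machine
// spec (treated here as raw bytes for simplicity) to and from a failure domain.
type Provider interface {
	ExtractFailureDomain(machineSpec []byte) (FailureDomain, error)
	InjectFailureDomain(machineSpec []byte, fd FailureDomain) ([]byte, error)
}
```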
I had an idea that we might be able to borrow ideas from CPMS and the kube scheduler to implement something relatively simple but very flexible. What follows is very rough. It's intended for discussion rather than as a concrete design proposal.
The high-level overview is that we would add a `FailureDomainPolicyRef` to `MachineSpec`. If a Machine has a `FailureDomainPolicyRef`, the Machine controller will not create an InfrastructureMachine until the `MachineSpec` also has a `FailureDomainRef`.
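A minimal sketch of what that API change could look like; the field names follow this comment, but using `corev1.ObjectReference` as the reference type and the trimmed-down struct are illustrative assumptions, not a concrete proposal.

```go
// Sketch only: the two fields proposed above, shown on a trimmed-down
// MachineSpec.
package v1beta1

import corev1 "k8s.io/api/core/v1"

type MachineSpec struct {
	// ... existing MachineSpec fields elided ...

	// FailureDomainPolicyRef points at a policy object describing how a set
	// of Machines should be spread across failure domains. If this is set,
	// the Machine controller does not create the InfrastructureMachine until
	// FailureDomainRef has also been set.
	// +optional
	FailureDomainPolicyRef *corev1.ObjectReference `json:"failureDomainPolicyRef,omitempty"`

	// FailureDomainRef is set by the failure domain policy controller and
	// points at a provider-specific failure domain object, e.g. an
	// OpenStackFailureDomain.
	// +optional
	FailureDomainRef *corev1.ObjectReference `json:"failureDomainRef,omitempty"`
}
```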
A user might create:
```yaml
MachineTemplate:
  spec:
    template:
      spec:
        ...
        failureDomainPolicyRef:
          apiVersion: ...
          kind: DefaultCAPIFailureDomainPolicy
          name: MyClusterControlPlane

DefaultCAPIFailureDomainPolicy:
  metadata:
    name: MyClusterControlPlane
  spec:
    spreadPolicy: Whatever
    failureDomains:
      apiVersion: ...
      kind: OpenStackFailureDomain
      names:
      - AZ1
      - AZ2
      - AZ3

OpenStackFailureDomain:
  metadata:
    name: AZ1
  spec:
    computeAZ: Foo
    storageAZ: Bar
    networkAZ: Baz
```
If OpenStackFailureDomain is immutable, it can only be 'changed' by creating a new one and updating the failure domain policy.
The failure domain policy controller would watch Machines with a `failureDomainPolicyRef` and assign a failure domain from the policy's list according to the configured spread policy. It would also have the opportunity to notice that a set of Machines is no longer compliant with the policy and remediate by deleting Machines so that new, compliant Machines can replace them.
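As a rough sketch of the assignment step such a controller might perform, assuming a simple "balance across domains" spread policy (the package and function names are hypothetical):

```go
package policy

import "sort"

// assignFailureDomains is a sketch of the assignment step of a hypothetical
// failure domain policy controller. machineFDs maps Machine name to its
// currently assigned failure domain ("" means unassigned); policyFDs is the
// policy's list of failure domain names. Unassigned Machines are placed in
// the least-populated domain.
func assignFailureDomains(machineFDs map[string]string, policyFDs []string) map[string]string {
	assignments := map[string]string{}
	if len(policyFDs) == 0 {
		return assignments
	}

	// Count existing assignments per failure domain in the policy.
	counts := make(map[string]int, len(policyFDs))
	for _, fd := range policyFDs {
		counts[fd] = 0
	}
	for _, fd := range machineFDs {
		if _, ok := counts[fd]; ok {
			counts[fd]++
		}
	}

	// Iterate Machines in a stable order so repeated reconciles are deterministic.
	machines := make([]string, 0, len(machineFDs))
	for name := range machineFDs {
		machines = append(machines, name)
	}
	sort.Strings(machines)

	// Place each unassigned Machine in the currently least-populated domain.
	for _, machine := range machines {
		if machineFDs[machine] != "" {
			continue
		}
		best := policyFDs[0]
		for _, candidate := range policyFDs[1:] {
			if counts[candidate] < counts[best] {
				best = candidate
			}
		}
		assignments[machine] = best
		counts[best]++
	}
	return assignments
}
```

The same bookkeeping is what would let the controller spot non-compliance: any Machine whose assigned domain is no longer in the policy's list becomes a candidate for deletion and replacement.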
Because the failure domain is now a reference to a provider-specific CRD, the infrastructure machine controller can take provider-specific actions to apply the failure domain to an infrastructure machine.
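As an illustration of what those provider-specific actions could mean for CAPO, something like the following; every type and field name here is hypothetical, and only computeAZ/storageAZ/networkAZ mirror the example above.

```go
package openstack

// OpenStackFailureDomainSpec mirrors the example CRD sketched earlier.
type OpenStackFailureDomainSpec struct {
	ComputeAZ string `json:"computeAZ"`
	StorageAZ string `json:"storageAZ"`
	NetworkAZ string `json:"networkAZ"`
}

// ServerCreateOpts stands in for whatever request the provider builds when
// creating the machine's compute instance, root volume, and port.
type ServerCreateOpts struct {
	ComputeAvailabilityZone string
	VolumeAvailabilityZone  string
	NetworkAvailabilityZone string
}

// applyFailureDomain copies the resolved failure domain into the create
// request so that compute, storage, and network placement all follow it.
func applyFailureDomain(opts *ServerCreateOpts, fd OpenStackFailureDomainSpec) {
	opts.ComputeAvailabilityZone = fd.ComputeAZ
	opts.VolumeAvailabilityZone = fd.StorageAZ
	opts.NetworkAvailabilityZone = fd.NetworkAZ
}
```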
For users who don't need this complexity, the infrastructure cluster controller could create a default policy, much as it populates failure domains today, and that default could be applied to a KCP machine template.
A design like this in the MachineSpec would also have the advantage that it could be used without modification for any set of machines. So, for example, users who want to spread a set of workers in an MD across 2 FDs would be able to do that.
I believe something like this would also be effective for vSphere, where failure domains are also complex because a single workload cluster could in theory span multiple vSphere clusters. I'm not sure exactly how this is handled in CAPV today.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Grouping a couple of issues/ideas about failure domains which are not getting attention from the community.
To address this issue we need a proposal that looks into how to handle operations for failure domains (going beyond the initial placement of machines that is currently supported):
https://github.com/kubernetes-sigs/cluster-api/issues/4031
https://github.com/kubernetes-sigs/cluster-api/issues/5667
https://github.com/kubernetes-sigs/cluster-api/issues/7417
/kind feature