Some ideas for 2.:
Assumption: the first step in updating a cluster to v1.22 is changing the version in the KubeadmControlPlane and then later on in the MachineDeployments etc. I think there's nothing we can do about other control plane providers.
Some ideas:
Hand over a client to the KubeadmControlPlane webhook. In the update validation we would block the update if the KubeadmControlPlane refers to the current cluster, i.e. it's self-hosted. Options to detect that: compare .status.nodeRef.name and .status.nodeRef.uid against the Nodes in the current cluster (?)
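Roughly, the validation could look like the sketch below. This is only an illustration under the assumptions discussed here: the `isSelfHosted` callback is a hypothetical stand-in for whatever mechanism ends up comparing nodeRefs with local Nodes, and the function is not wired into the actual v0.3.x webhook interface (which has no client, hence the controller-runtime concern further down).

```go
package webhooks

import (
	"context"
	"errors"

	"github.com/blang/semver"

	controlplanev1 "sigs.k8s.io/cluster-api/controlplane/kubeadm/api/v1alpha3"
)

// validateKCPUpdate sketches the check the webhook would run. isSelfHosted is
// a hypothetical hook standing in for the nodeRef comparison described above.
func validateKCPUpdate(
	ctx context.Context,
	oldKCP, newKCP *controlplanev1.KubeadmControlPlane,
	isSelfHosted func(context.Context, *controlplanev1.KubeadmControlPlane) (bool, error),
) error {
	oldVersion, err := semver.ParseTolerant(oldKCP.Spec.Version)
	if err != nil {
		return err
	}
	newVersion, err := semver.ParseTolerant(newKCP.Spec.Version)
	if err != nil {
		return err
	}

	// Only the transition from < v1.22 to >= v1.22 is relevant here.
	if !(oldVersion.Minor < 22 && newVersion.Minor >= 22) {
		return nil
	}

	selfHosted, err := isSelfHosted(ctx, newKCP)
	if err != nil {
		return err
	}
	if selfHosted {
		return errors.New("cannot upgrade a self-hosted management cluster to v1.22 with Cluster API v0.3.x")
	}
	return nil
}
```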
Open questions:
/cc @fabriziopandini @randomvariable @vincepri @CecileRobertMichon
I'm not sure that blocking in webhooks is a viable option for v0.3.x, because this most probably requires controller-runtime changes, and I don't think we can get them into the version currently used in this branch.
That leaves us with blocking in the controllers.
/milestone v0.3
Regarding: "What do we want to do with management clusters which are already on v1.22?"
I tried to deploy CAPI v0.3.21 on Kubernetes v1.22.0-rc.0 and imho it's impossible to get it to work without any major changes to CAPI. So I think we can assume we don't have any existing v1.22 CAPI v0.3.x management cluster out there.
error registering secret controller: no matches for kind "MutatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
Follow-up: We have to upgrade cert-manager on main: #4983
k create secret generic -n capi-system capi-kubeadm-bootstrap-webhook-service-cert --from-file=tls.key --from-file tls.crt
reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.9/tools/cache/reflector.go:105: Failed to list *v1alpha3.Machine: Internal error occurred: error resolving resource
Note from the CAPI meeting: we should also update the following pages in the book:
/assign
I agree that a management cluster already on v1.22 will probably need manual remediation.
@fabriziopandini @vincepri @randomvariable I would then start implementing the part in the KubeadmControlPlane controller. As we assume we don't have healthy v1.22 management clusters out there, I would implement the following:
For detecting if the cluster is self-hosted, I can think of the following options: check if the Machines' .status.nodeRef.{name,uid} also exist in the current cluster via the managementCluster client (see the sketch below). Maybe I'm missing the one obvious and good solution :)
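A minimal sketch of that nodeRef check, assuming the controller's own client points at the management cluster and the control plane Machines are already listed; the helper name `isSelfHosted` is made up for illustration:

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha3"
)

// isSelfHosted is a hypothetical helper: given a client to the management
// cluster and the control plane Machines, it reports whether any Machine's
// nodeRef matches a Node in this cluster, i.e. whether the KubeadmControlPlane
// manages the cluster the controller itself runs in.
func isSelfHosted(ctx context.Context, mgmtClient client.Client, machines []*clusterv1.Machine) (bool, error) {
	for _, m := range machines {
		if m.Status.NodeRef == nil {
			continue
		}
		node := &corev1.Node{}
		if err := mgmtClient.Get(ctx, client.ObjectKey{Name: m.Status.NodeRef.Name}, node); err != nil {
			if apierrors.IsNotFound(err) {
				continue
			}
			return false, err
		}
		// Compare name and UID so an unrelated Node with the same name
		// does not produce a false positive.
		if node.UID == m.Status.NodeRef.UID {
			return true, nil
		}
	}
	return false, nil
}
```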
Discarded options:
Could we check if the Cluster API CRDs are installed and block the upgrade to v1.22?
> Could we check if the Cluster API CRDs are installed and block the upgrade to v1.22?
We could; this would additionally block the upgrade when we update other management clusters (i.e. not only ourselves, in cases like mgmt cluster => mgmt cluster => workload cluster). But I think this is also a case which would be nice to cover (and it wouldn't be covered by my solutions).
So yup, this seems to be the best solution yet.
We can use the partial object metadata client to find the Cluster CRD and block there. We'd need a remote client to the workload cluster, try to retrieve the CRD with PartialObjectMetadata (see the convert references function as an example) and if the call is successful, assume it's a management cluster.
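Something along these lines, a sketch assuming a controller-runtime client to the workload cluster that supports PartialObjectMetadata; the helper name is hypothetical:

```go
package controllers

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// workloadClusterIsManagementCluster is a hypothetical helper: it fetches only
// the metadata of the Cluster CRD from the workload cluster. If the CRD exists,
// we assume that cluster is itself a management cluster and block the upgrade.
func workloadClusterIsManagementCluster(ctx context.Context, remoteClient client.Client) (bool, error) {
	crd := &metav1.PartialObjectMetadata{}
	crd.SetGroupVersionKind(schema.GroupVersionKind{
		Group:   "apiextensions.k8s.io",
		Version: "v1",
		Kind:    "CustomResourceDefinition",
	})
	if err := remoteClient.Get(ctx, client.ObjectKey{Name: "clusters.cluster.x-k8s.io"}, crd); err != nil {
		if apierrors.IsNotFound(err) {
			return false, nil
		}
		return false, err
	}
	return true, nil
}
```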
/close
@vincepri: Closing this issue.
User Story
As a user I would like to get an error as early as possible when trying to upgrade a management cluster to v1.22 (using CAPI v0.3.x)
Detailed Description
There are different ways in which a Kubernetes v1.22 management cluster could be created or updated:
Anything else you would like to add:
Open questions
/kind feature