kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

v0.3.x: Block creation of / upgrades to Kubernetes v1.22 management clusters #4966

Closed sbueringer closed 3 years ago

sbueringer commented 3 years ago

User Story

As a user, I would like to get an error as early as possible when trying to create a management cluster on, or upgrade one to, Kubernetes v1.22 (using CAPI v0.3.x).

Detailed Description

There are several ways a Kubernetes v1.22 management cluster could be created or upgraded:

  1. create a new v1.22 workload cluster + clusterctl move / init / ..
  2. upgrade a self-hosted <v1.22 management cluster via CAPI to v1.22

Anything else you would like to add:

  1. This should be easy to implement by adding a version check in clusterctl v0.3.x.
  2. There is currently no easy way to detect whether the management cluster is upgrading a workload cluster or itself. Happy for suggestions, otherwise I'll explore some options.
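
A minimal sketch of what such a version check could look like, using only the Go standard library. The function names (`parseMajorMinor`, `checkManagementClusterVersion`) and the error text are hypothetical and not taken from clusterctl; the real implementation may rely on a semver library and hook into clusterctl differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minBlockedMinor is the first Kubernetes 1.x minor version that this
// sketch rejects for CAPI v0.3.x management clusters (assumption: v1.22,
// as discussed in this issue).
const minBlockedMinor = 22

// parseMajorMinor extracts the major and minor numbers from version
// strings such as "v1.22.0" or "1.22.0-rc.0". Patch, pre-release and
// build metadata are ignored.
func parseMajorMinor(version string) (major, minor int, err error) {
	v := strings.TrimPrefix(version, "v")
	// Drop pre-release ("-rc.0") and build ("+abc") suffixes.
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("invalid Kubernetes version %q", version)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("invalid major version in %q: %w", version, err)
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, fmt.Errorf("invalid minor version in %q: %w", version, err)
	}
	return major, minor, nil
}

// checkManagementClusterVersion returns an error if the given Kubernetes
// version is v1.22 or newer, and nil otherwise.
func checkManagementClusterVersion(version string) error {
	major, minor, err := parseMajorMinor(version)
	if err != nil {
		return err
	}
	if major > 1 || (major == 1 && minor >= minBlockedMinor) {
		return fmt.Errorf("Kubernetes %s is not supported as a management cluster with CAPI v0.3.x", version)
	}
	return nil
}
```

With this sketch, `checkManagementClusterVersion("v1.21.3")` returns `nil`, while `"v1.22.0-rc.0"` is rejected because pre-release suffixes are stripped before the comparison.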

Open questions


/kind feature

sbueringer commented 3 years ago

Some ideas for 2.:

Assumption: the first step in updating a cluster to v1.22 is changing the version in KubeadmControlPlane, and only later in MachineDeployment etc. I think there's nothing we can do about other control plane providers.

Some ideas:

Hand over a client to the KubeadmControlPlane webhook. In the update validation we would block the update:

Open questions:

sbueringer commented 3 years ago

/cc @fabriziopandini @randomvariable @vincepri @CecileRobertMichon

fabriziopandini commented 3 years ago

I'm not sure that blocking in webhooks is a viable option for v0.3.x, because this most probably requires controller-runtime changes, and I don't think we can get them in the version currently used in this branch.

This leaves us forced to block in the controllers.

vincepri commented 3 years ago

/milestone v0.3

sbueringer commented 3 years ago

Regarding: "What do we want to do with management clusters which are already on v1.22?"

I tried to deploy CAPI v0.3.21 on Kubernetes v1.22.0-rc.0 and imho it's impossible to get it to work without any major changes to CAPI. So I think we can assume we don't have any existing v1.22 CAPI v0.3.x management cluster out there.

Exploration CAPI v0.3.21 on Kubernetes v1.22-rc.0

sbueringer commented 3 years ago

Note from the CAPI meeting: we should also update the following pages in the book:

sbueringer commented 3 years ago

/assign

randomvariable commented 3 years ago

I agree that a management cluster already on v1.22 is probably going to need manual remediation anyway.

sbueringer commented 3 years ago

@fabriziopandini @vincepri @randomvariable I would then start implementing the part in the KubeadmControlPlane controller. As we assume we don't have healthy v1.22 management clusters out there I would implement the following:

For the "is it self-hosted" check, I can think of the following options:

Maybe I'm missing the one obvious and good solution :)

Discarded options:

vincepri commented 3 years ago

Could we check if the Cluster API CRDs are installed and block the upgrade to v1.22?

sbueringer commented 3 years ago

> Could we check if the Cluster API CRDs are installed and block the upgrade to v1.22?

We could. This would additionally block upgrades of other management clusters (i.e. not only ourselves, in cases like mgmt cluster => mgmt cluster => workload cluster). But I think this is also a case which would be nice to cover (and it wouldn't be covered by my solutions).

So yup, this seems to be the best solution yet.

vincepri commented 3 years ago

We can use the partial object metadata client to find the Cluster CRD and block there. We'd need a remote client to the workload cluster, try to retrieve the CRD with PartialObjectMetadata (see the convert references function as an example) and if the call is successful, assume it's a management cluster.

vincepri commented 3 years ago

/close

k8s-ci-robot commented 3 years ago

@vincepri: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/4966#issuecomment-915415431):

>/close