Closed huxcrux closed 1 month ago
How are you triggering this?
I deliberately removed the defaulting webhooks in #1920 because none of them had an implementation.
Ah... I'll bet I know what happened. Note that #1920 removes the MutatingWebhookConfiguration from the manifests which references the defaulting webhook. However, if you're upgrading and you just apply the new manifests there's going to be nothing to remove it.
I'm not sure how to handle that without an operator, tbh. Definitely needs a release note!
To start with, please can you confirm that deleting CAPO's MutatingWebhookConfiguration fixes the problem. If we can confirm that, we can think about how to automate it, or at least ensure folks know about it.
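For anyone wanting to verify this, a minimal sketch of the suggested check; the configuration name capo-mutating-webhook-configuration is taken from the kubectl output later in this thread:

```shell
# List the mutating webhook configurations to confirm the stale CAPO one exists
kubectl get mutatingwebhookconfigurations
# Delete the orphaned configuration left behind by the pre-#1920 manifests;
# afterwards, applies to OpenStackCluster resources should succeed again
kubectl delete mutatingwebhookconfiguration capo-mutating-webhook-configuration
```

These commands assume access to the management cluster where CAPO is installed.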
If I remove the mutating webhook it solves the problem. Also, installing a new management cluster directly from main "fixes" the problem since the webhook is never created, meaning this is only a problem when upgrading CAPO to main ATM.
It seems that clusterctl upgrade is smart about this and that is why the upgrade test was successful. That means it will "only" be an issue when using kubectl or other tools instead of clusterctl.
$ k get mutatingwebhookconfigurations.admissionregistration.k8s.io
NAME                                                        WEBHOOKS   AGE
capi-kubeadm-bootstrap-mutating-webhook-configuration       2          14s
capi-kubeadm-control-plane-mutating-webhook-configuration   2          14s
capi-mutating-webhook-configuration                         9          14s
capo-mutating-webhook-configuration                         3          13s
cert-manager-webhook                                        1          36s
$ clusterctl upgrade apply --contract=v1beta1 --config=clusterctl.yaml
Checking cert-manager version...
Cert-manager is already up to date
Performing upgrade...
Scaling down Provider="infrastructure-openstack" Version="v0.9.0" Namespace="capo-system"
Deleting Provider="infrastructure-openstack" Version="v0.9.0" Namespace="capo-system"
Installing Provider="infrastructure-openstack" Version="v0.9.99" TargetNamespace="capo-system"
$ k get mutatingwebhookconfigurations.admissionregistration.k8s.io
NAME                                                        WEBHOOKS   AGE
capi-kubeadm-bootstrap-mutating-webhook-configuration       2          73s
capi-kubeadm-control-plane-mutating-webhook-configuration   2          73s
capi-mutating-webhook-configuration                         9          73s
cert-manager-webhook                                        1          95s
This indeed is a problem when just "applying" the new manifests. clusterctl upgrade seems to get around this by removing resources and then adding the new ones. We should probably add something to the release notes since this is breaking for anyone not using clusterctl to upgrade CAPO.
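Until a release note or automation exists, a hedged sketch of a manual workaround for users upgrading with kubectl, mirroring what clusterctl effectively does by deleting the old provider resources before installing the new ones (the manifest filename is illustrative):

```shell
# Remove the stale MutatingWebhookConfiguration before applying the new
# manifests; --ignore-not-found makes this a safe no-op on fresh installs
# where the configuration never existed
kubectl delete mutatingwebhookconfiguration capo-mutating-webhook-configuration --ignore-not-found
# Then apply the new CAPO manifests as usual (path is a placeholder)
kubectl apply -f infrastructure-components.yaml
```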
For reference we have discussed this on slack where more details can be found: https://kubernetes.slack.com/archives/C8TSNPY4T/p1709672361982489
So this is 'not a bug', but it's a serious 'not a bug', so let's keep this open until we've ensured there is a release note about it.
@jichenjc I'm again thinking it would be really nice to have a workflow from GitHub which results in a Release Note. I think CPO has this. Any idea how it works?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
This was done.
/kind bug
What steps did you take and what happened: It seems like https://github.com/kubernetes-sigs/cluster-api-provider-openstack/pull/1920 broke the defaulting webhook for openstackcluster in v1beta1.
Error during apply:
I suspect there might need to be some changes in webhook generation since it seems to be removed completely when checking the following file: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/commit/e6bb34f3bd6fa141c7c094f62b537891f716b4a7#diff-369b61dd1f2f1f60722927fda70e5a51e130cea68336e6d536086b522ffef0c6
Environment:
- Cluster API Provider OpenStack version (git rev-parse HEAD if manually built): main (current)
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):