Closed — mdbooth closed this pull request 6 months ago.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mdbooth
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
| Name | Link |
|---|---|
| Latest commit | 801f5ef2ff819d9f34e9e561d2c8c0044a7688d6 |
| Latest deploy log | https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/65f49cc79b5aab00084bea2a |
| Deploy Preview | https://deploy-preview-1941--kubernetes-sigs-cluster-api-openstack.netlify.app |
/test pull-cluster-api-provider-openstack-e2e-full-test
/lgtm
/hold
Feel free to `/hold cancel` once the upgrade test has passed and you think it's ready to be merged. Thanks for that work Matt, sorry I missed it in the initial iteration.
It looks like the full test failed to clean up a couple of ports on deletion. I'm not convinced that was a flake, but I'm running it again to get another data point:
/test pull-cluster-api-provider-openstack-e2e-full-test
Unfortunately, if this is a bug on deletion, we don't have controller logs from CI because it happens after the logs are collected.
Oh, we do have logs. The controller panicked: https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api-provider-openstack/1941/pull-cluster-api-provider-openstack-e2e-full-test/1768199842766524416/artifacts/clusters/clusterctl-upgrade-y11roz/logs/capo-system/capo-controller-manager/capo-controller-manager-66896777f9-bnz58/manager.log
/test pull-cluster-api-provider-openstack-e2e-full-test
/test pull-cluster-api-provider-openstack-e2e-full-test
This PR broke cluster deletion when the cluster had a bastion, and the only test which caught it was the upgrade test. This is a bit worrying.
/test pull-cluster-api-provider-openstack-e2e-full-test
/test pull-cluster-api-provider-openstack-e2e-full-test
Note that https://github.com/kubernetes-sigs/cluster-api-provider-openstack/pull/1951 fixes this much more comprehensively by pre-resolving everything about the port prior to creation, including its name.
That PR is based on this one, though, so we still need to merge this first.
/hold cancel
Thanks for this work, I'm glad you sorted this out.
/lgtm
CreatePorts was creating a port name based on the index of the port in the current reconcile. This could differ from the absolute index of the port if some ports had already been created in a previous reconcile.

We fix this by passing all current state into CreatePorts so it can compute an absolute index. This also ensures that partially created ports are persisted on failure, so we don't have to rely on adoption in the next reconcile.
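The idea can be sketched as follows. This is a minimal illustration, not the actual CAPO code: the names `portName` and `createPorts` and their signatures are hypothetical, but they show why naming by loop position breaks after a partial reconcile, and why naming by absolute index (resuming from already-persisted state) does not.

```go
package main

import "fmt"

// portName derives a port's name from its absolute index across all desired
// ports, not from its position in the slice of ports still to be created.
// (Hypothetical helper for illustration only.)
func portName(instanceName string, absoluteIndex int) string {
	return fmt.Sprintf("%s-%d", instanceName, absoluteIndex)
}

// createPorts resumes from ports already recorded in state, so a partially
// completed previous reconcile does not shift the names of later ports.
func createPorts(instanceName string, desired int, existing []string) []string {
	ports := append([]string{}, existing...)
	for i := len(existing); i < desired; i++ {
		// i is the absolute index: starting the loop at len(existing)
		// is what keeps names stable across reconciles.
		ports = append(ports, portName(instanceName, i))
	}
	return ports
}

func main() {
	// One port was created in a previous reconcile; the next reconcile must
	// name the second port "bastion-1", not "bastion-0".
	fmt.Println(createPorts("bastion", 2, []string{"bastion-0"}))
	// prints [bastion-0 bastion-1]
}
```

Had the loop counted from 0 over only the ports remaining to create, the second port would have collided with the existing `bastion-0`, which is the class of bug described above.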
Fixes: #1940
This PR includes a second commit which refactors and simplifies how we initialise Bastion resources. The motivation is that the first commit introduced a bug which caused a panic due to an assumption that state had already been initialised. The new commit makes this mistake harder to make. It's worth looking at separately.
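The pattern behind that refactor can be illustrated with a small sketch. The types `bastion` and `bastionStatus` here are hypothetical stand-ins, not the provider's real API: the point is that routing construction through an initialiser guarantees the nested state is non-nil, so code that dereferences it cannot panic the way the original bug did.

```go
package main

import "fmt"

// bastionStatus is a hypothetical stand-in for per-bastion reconcile state.
type bastionStatus struct {
	FloatingIP string
}

// bastion holds a pointer to its status; dereferencing an uninitialised
// status pointer is exactly the kind of panic the refactor guards against.
type bastion struct {
	status *bastionStatus
}

// newBastion is the single construction path: it always initialises status,
// making the "forgot to initialise state" mistake harder to make.
func newBastion() *bastion {
	return &bastion{status: &bastionStatus{}}
}

func main() {
	b := newBastion()
	// Safe: status is guaranteed non-nil by the constructor.
	fmt.Println(b.status.FloatingIP == "")
	// prints true
}
```

Constructing the struct directly with `&bastion{}` would leave `status` nil and reintroduce the panic, which is why funnelling all callers through one initialiser removes the footgun rather than just documenting it.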