vexxhost / magnum-cluster-api

Cluster API driver for OpenStack Magnum
Apache License 2.0

nodegroup stuck in DELETE_IN_PROGRESS #411

Closed by gtirloni 4 months ago

gtirloni commented 4 months ago

A nodegroup that was deleted is stuck in DELETE_IN_PROGRESS:

+--------------------+---------------------------------------------------------+
| Field              | Value                                                   |
+--------------------+---------------------------------------------------------+
| uuid               | de5d4996-697d-46cf-af7e-656110b668b8                    |
| name               | mynodegroup                                             |
| cluster_id         | 416cfba3-eeb6-4e9b-8a56-8c727b7bc2aa                    |
| project_id         | 34b7c04ca35548cf8c9998ebe69b650e                        |
| docker_volume_size | None                                                    |
| labels             | {'kube_tag': 'v1.27.3', 'api_server_cert_sans': 'xxx    |
|                    | xxxxx',                                                 |
|                    | 'auto_scaling_enabled': 'false'}                        |
| labels_overridden  | {}                                                      |
| labels_skipped     | {}                                                      |
| labels_added       | {}                                                      |
| flavor_id          | m1.xlarge.k8s                                           |
| image_id           | c43ca227-1c2d-4518-9f17-31a0d08e1eac                    |
| node_addresses     | []                                                      |
| node_count         | 5                                                       |
| role               | worker                                                  |
| max_node_count     | 5                                                       |
| min_node_count     | 5                                                       |
| is_default         | False                                                   |
| stack_id           | None                                                    |
| status             | DELETE_IN_PROGRESS                                      |
| status_reason      | None                                                    |
+--------------------+---------------------------------------------------------+

magnum-conductor logs:

kube_config_path not provided and default location (~/.kube/config) does not exist. Using inCluster Config. This might not work.
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'magnum.service.periodic.ClusterUpdateJob.update_status' failed: magnum_cluster_api.exceptions.MachineDeploymentNotFound: MachineDeployment mynodegroup not found.
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall Traceback (most recent call last):
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     result = func(*self.args, **self.kw)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/magnum/service/periodic.py", line 73, in update_status
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     cdriver.update_cluster_status(self.ctx, self.cluster)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/driver.py", line 39, in wrapper
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     return func(*args, **kwargs)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/driver.py", line 167, in update_cluster_status
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     ] + self.update_nodegroups_status(context, cluster)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/driver.py", line 457, in update_nodegroups_status
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     md_spec = cluster_resource.get_machine_deployment_spec(node_group.name)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/objects.py", line 274, in get_machine_deployment_spec
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     self.get_machine_deployment_index(name)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall   File "/var/lib/openstack/lib/python3.10/site-packages/magnum_cluster_api/objects.py", line 269, in get_machine_deployment_index
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall     raise exceptions.MachineDeploymentNotFound(name=name)
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall magnum_cluster_api.exceptions.MachineDeploymentNotFound: MachineDeployment mynodegroup not found.
2024-07-17 12:06:51.369 1 ERROR oslo.service.loopingcall
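
Reading the traceback, the periodic status update looks up the MachineDeployment for each nodegroup, and the lookup raises MachineDeploymentNotFound for the nodegroup that is being deleted, so the loop never gets to mark it as done. A minimal sketch of one way such a case could be handled is below. It is not the driver's actual code: the exception class is a stand-in for magnum_cluster_api.exceptions.MachineDeploymentNotFound, and `find_machine_deployment`, `next_nodegroup_status`, and the list-of-dicts data model are hypothetical names made up for illustration.

```python
class MachineDeploymentNotFound(Exception):
    """Stand-in for magnum_cluster_api.exceptions.MachineDeploymentNotFound."""

    def __init__(self, name):
        super().__init__(f"MachineDeployment {name} not found.")
        self.name = name


def find_machine_deployment(machine_deployments, name):
    """Hypothetical lookup: return the entry called `name` or raise."""
    for md in machine_deployments:
        if md["name"] == name:
            return md
    raise MachineDeploymentNotFound(name=name)


def next_nodegroup_status(machine_deployments, name, status):
    """Decide the next status for a nodegroup (assumed logic, not the driver's).

    If the MachineDeployment is missing while the nodegroup is in
    DELETE_IN_PROGRESS, the deletion is treated as finished instead of
    letting the exception abort the whole status update.
    """
    try:
        find_machine_deployment(machine_deployments, name)
    except MachineDeploymentNotFound:
        if status == "DELETE_IN_PROGRESS":
            # The deployment is already gone and a delete was requested.
            return "DELETE_COMPLETE"
        # For any other status a missing deployment is still an error.
        raise
    # The deployment still exists, so the status is left unchanged here.
    return status


if __name__ == "__main__":
    print(next_nodegroup_status([], "mynodegroup", "DELETE_IN_PROGRESS"))
```

With an empty list of deployments, the call in the `__main__` block takes the except branch for a nodegroup in DELETE_IN_PROGRESS. Any other status re-raises, which keeps genuinely unexpected missing deployments visible in the conductor logs.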