ionos-cloud / cluster-api-provider-proxmox

Cluster API Provider for Proxmox VE (CAPMOX)
Apache License 2.0

ProxmoxMachine objects are not finalized #265

Closed natitomattis closed 1 day ago

natitomattis commented 1 month ago

What steps did you take and what happened: When deleting a machine, the ProxmoxMachine object is not finalized properly. I see the following error in the controller logs:

Reconciler error" err="cannot delete vm with id 102: 500 Internal Server Error" controller="proxmoxmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ProxmoxMachine" ProxmoxMachine="backend-at1-dev/backend-at1-dev-accelerated-847gn" namespace="backend-at1-dev" name="backend-at1-dev-accelerated-847gn" reconcileID="934e466a-aef9-4467-b0b2-35a3ec3a043b"

It seems like the machine was removed in a previous call and now the controller is failing to find it.

What did you expect to happen: ProxmoxMachine should be removed

mcbenjemaa commented 3 weeks ago

Thanks for reporting this, we will check this.

65278 commented 3 weeks ago

The bug report has a couple of things which don't add up:

Reconciler error" err="cannot delete vm with id 102: 500 Internal Server Error" controller="proxmoxmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ProxmoxMachine" ProxmoxMachine="backend-at1-dev/backend-at1-dev-accelerated-847gn" namespace="backend-at1-dev" name="backend-at1-dev-accelerated-847gn" reconcileID="934e466a-aef9-4467-b0b2-35a3ec3a043b"

suggests that the failure occurred on this line: https://github.com/ionos-cloud/cluster-api-provider-proxmox/blob/main/pkg/proxmox/goproxmox/api_client.go#L167

This means the API call to Proxmox found the machine (or at least didn't return an error for it): https://github.com/luthermonson/go-proxmox/blob/main/nodes.go#L67

Subsequently, the delete fails with an internal server error in Proxmox. While we should handle HTTP 500 here, the controller can't do anything more about it, because Proxmox is in an undefined state.

It would help if you issued the HTTP requests for find and delete directly against your Proxmox API and posted the results here. Here's some help on how to issue API requests via HTTP: https://pve.proxmox.com/wiki/Proxmox_VE_API#API_Tokens

mcbenjemaa commented 1 week ago

We have merged #278. I'm hoping this will fix the issue. In my tests, I haven't encountered this issue after merging.

wikkyk commented 1 day ago

I will close this issue for now. Please reopen it if #278 did not fix the bug.