siderolabs / omni

SaaS-simple deployment of Kubernetes - on your own hardware.

[bug] Omnictl delete cluster does not run the volume finalizers #214

Open bernardgut opened 2 months ago

bernardgut commented 2 months ago

Is there an existing issue for this?

Current Behavior

  1. create a cluster template that includes a CSI provisioner (piraeus in my case) and some volumes (metrics in my case)
  2. omnictl sync ...
  3. omnictl delete cluster ...
  4. repeat steps 2 and 3 ten times in a row
  5. Your CSI provisioner will most likely not pick up the volumes from the previous nine cluster attempts, as they no longer have a corresponding resource assignment in your CSI, but they are still there in the storage layer. If you node-shell into the machine and check what is on the disks, you will see that the PV data from the previous attempts is still present and your storage space is being wasted. (A rough repro loop is sketched below.)
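For reference, a minimal repro loop might look like this (a sketch only; it assumes the `omnictl cluster template sync`/`delete` subcommands, a template file at `./cluster-template.yaml`, and it skips the waiting logic between steps):

```sh
# Sketch: create and delete the same cluster template ten times in a row.
# Assumes ./cluster-template.yaml declares the CSI provisioner and the volumes.
for i in $(seq 1 10); do
  omnictl cluster template sync --file ./cluster-template.yaml
  # ... wait here until the cluster is up and the PVCs have been provisioned ...
  omnictl cluster template delete --file ./cluster-template.yaml
done
# After this, the backing volumes created by the CSI provisioner in each
# iteration are still allocated on the nodes' disks.
```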

Expected Behavior

I am not sure whether this is intended behavior or a bug, so I opened a ticket anyway. If it is intended, please add a --run-storage-finalizers flag to omnictl delete cluster so that when it deletes the cluster's PVs, it also frees the space in the storage layer. We should avoid dangling PV data remaining on the underlying storage forever.

Steps To Reproduce

See Description above

What browsers are you seeing the problem on?

No response

Anything else?

Talos 1.7.1 Omni 0.34.0

bernardgut commented 2 months ago

BTW this is not a trivial problem to solve, as you have to delete the storage claims before you delete the CSI pods.

The only way to do this "trivially" that I can see off the top of my head would be to have a script that does the following when you run omnictl delete cluster ... (sketched after the list):

  1. delete "anything" (deamonsets/deployments/statefusets/pods) that has a pvc bound to it first.
  2. delete all pvc, delete all pv
  3. then delete everything else
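A minimal sketch of that order, assuming kubectl is already pointed at the cluster that is about to be deleted (this is a hand-rolled script, not an existing omnictl feature):

```sh
#!/usr/bin/env bash
# Sketch: release CSI-backed storage before tearing down the cluster.
set -euo pipefail

# 1. Remove the workloads that hold PVCs open. This blunt variant deletes all
#    common controller kinds; a real script would first filter for the ones
#    that actually reference a PVC.
kubectl delete statefulsets,deployments,daemonsets --all --all-namespaces

# 2. Delete all PVCs (this lets the CSI provisioner run its deletion logic),
#    then any leftover PVs.
kubectl delete pvc --all --all-namespaces --wait --timeout=5m
kubectl delete pv --all --wait --timeout=5m

# 3. Only now delete the cluster itself, e.g.:
# omnictl cluster delete <cluster-name>
```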

Same for talosctl reset ... whatever you have planned, I think it needs to run the finalizers on the PVCs that are assigned to the node before deleting the node (rough sketch after the list):

  1. cordon node
  2. evict workloads
  3. delete all PVCs assigned to the node (which runs the finalizers, with a timeout)
  4. reset Talos + kubectl delete node ...

something along those lines, IMO.
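Roughly, with hypothetical node names, that per-node sequence could look like the following (a sketch only; neither talosctl nor omnictl does this today, and how to find the PVCs assigned to a particular node is provisioner-specific):

```sh
# Sketch: drain a node and run PVC finalizers before resetting Talos on it.
NODE=worker-1        # hypothetical Kubernetes node name
NODE_IP=10.0.0.11    # hypothetical Talos endpoint for that node

# 1.+2. Cordon the node and evict its workloads.
kubectl cordon "$NODE"
kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data --timeout=10m

# 3. Delete the PVCs whose volumes live on this node so the CSI finalizers run.
#    This blunt variant deletes by namespace instead of by node assignment.
kubectl delete pvc --all -n metrics --wait --timeout=5m

# 4. Reset Talos on the node and remove it from the cluster.
talosctl --nodes "$NODE_IP" reset
kubectl delete node "$NODE"
```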

Unix4ever commented 2 months ago

I wouldn't say that's a bug; it looks more like a feature request to me, and it needs some discussion before we can implement it.

Unix4ever commented 2 months ago

Another approach you can use here is to have a sidecar container for Omni that could watch Omni state, check which clusters no longer exist and clean up the dangling volumes in the CSI provisioner.
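As a very rough, hypothetical sketch of such a sidecar (assuming `omnictl get clusters` is available and authenticated inside the container; the actual cleanup call is provisioner-specific and left as a stub):

```sh
#!/usr/bin/env bash
# Hypothetical sidecar loop: periodically list the clusters Omni still knows
# about and reconcile the storage layer against that list.
set -euo pipefail

while true; do
  # Clusters Omni still knows about (table output; parse as needed).
  omnictl get clusters
  # ... compare against the volumes in your CSI provisioner (e.g. via the
  #     LINSTOR client for piraeus) and remove those whose cluster is gone ...
  sleep 300
done
```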

What you describe is too specific a thing; Talos doesn't have the means to do that.

Another, more generic approach would be some hook that runs inside your cluster and can be called by Omni: Omni would call the hook, wait until it completes, and then destroy the cluster. But the hook implementation would again be on your side.

bernardgut commented 2 months ago

I like the "onClusterDeletion" hook approach. I am not an expert but it makes sense to me.

I think the other approach you mentioned in another thread, which was to mark some drives as "owned" by Omni/Talos and wipe them entirely when running omnictl delete, was better. That would partially remove the need for hook mechanics (for the general use case). It is also easier to set up and makes a lot of sense: you have a declarative machine configuration in which you say this is the machine this OS will run on, this is its network config, this is the kernel config, these are the extensions, and these are the block devices that will be used along with their config (if any). Everything declared here is immutable. That makes a whole lot of sense to me.