Closed Preisschild closed 7 months ago
Due to the architecture of CAPI, there's no way for a Talos node to know it is going to be removed (it might work for controlplane nodes, but not for worker nodes). Dead members will be cleaned up after 30 minutes. Dead KubeSpan peers should not cause issues if the IPs don't overlap.
Ah, that's not great. Unfortunately, Hetzner Cloud often reassigns the same IP to a new node, which means that those nodes often require ~30 minutes before they are ready.
Not a major issue, but sure is annoying.
One way, not a great one, but as a workaround, is to call `reset` on the node being removed. I think CAPI provides a set of webhooks which can be used for that, but that's not an easy fix.
This might be possible using the `pre-terminate` hook.
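To make the idea a bit more concrete, here is a rough sketch of the decision logic a small controller could use. It is only an illustration under assumptions: the hook name `talos-reset` and the helper functions are hypothetical, the Machine is represented as a plain dict, and whether a `reset` at this point actually clears the stale member and peer entries sooner has not been verified here. The annotation prefix is the one CAPI documents for pre-terminate deletion hooks.

```python
# Sketch only: HOOK_NAME and all helper names below are hypothetical.
# The annotation prefix is the CAPI pre-terminate deletion hook prefix.

PRE_TERMINATE_PREFIX = "pre-terminate.delete.hook.machine.cluster.x-k8s.io"
HOOK_NAME = "talos-reset"  # hypothetical hook name


def hook_annotation(hook_name: str = HOOK_NAME) -> str:
    """Annotation key that makes CAPI pause before terminating the Machine."""
    return f"{PRE_TERMINATE_PREFIX}/{hook_name}"


def needs_reset(machine: dict, hook_name: str = HOOK_NAME) -> bool:
    """True if the Machine is being deleted and our hook is still attached."""
    meta = machine.get("metadata", {})
    annotations = meta.get("annotations") or {}
    return "deletionTimestamp" in meta and hook_annotation(hook_name) in annotations


def reset_command(node_ip: str) -> list[str]:
    """argv for resetting the node; run it with subprocess.run(...) if desired."""
    return ["talosctl", "--nodes", node_ip, "reset"]
```

A controller built on this would watch Machines, run the reset command when `needs_reset` returns true, and then remove the annotation so that CAPI can continue with the deletion.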
When nodes are removed (while doing a rollingUpgrade, for example) they are still in the `members` and `kubespanpeerspecs` resources, and thus KubeSpan still tries to connect to them. This PR doesn't seem to fix this behavior in CAPI.