Sometimes, when machines are deleted while scaling down a cluster[1], the final machine (particularly a control plane machine) gets stuck in the Deleting phase. This can be traced back to the node associated with the machine being deleted without the nodeRef being passed on. This hinders deletion of the machine, because there is no reference to the node when it finally comes to its proper cleanup.
This doesn't happen often, so it is most probably a race condition. I observed it on OpenStack, so we need to dig into it further and try to reproduce it on different clouds.