Open sbueringer opened 2 weeks ago
/triage accepted
/cc @fabriziopandini @chrischdi @enxebre
But instead of relying on the LastTransitionTime, I think we should add status fields that track when we started to drain and when we started to wait for volume detach, and then calculate whether the timeouts are exceeded based on those.
+1
I think we should add a status.Termination struct with NodeDrainStartTime & NodeVolumeDetachStartTime fields. This would be somewhat "consistent" with the Initialization struct proposed by https://github.com/kubernetes-sigs/cluster-api/pull/10897.
/assign
One small note: I wonder if we should call it status.deletion instead of termination. In CAPI we usually talk about "deletion".
+ NodeVolumeDetachStartTime should be WaitForNodeVolumeDetachStartTime
No strong opinions about the name. Termination makes a good pair with Initialization, and it's also reminiscent of things like terminationGracePeriod etc.
Yup, we just use deletion in a lot of places and we don't have things like terminationGracePeriod for Machines (but we have a deletionTimestamp)
(we're also always talking about Machine deletion, never really used/heard Machine termination)
Today, we check whether NodeDrainTimeout & NodeVolumeDetachTimeout are exceeded by comparing time.Now() with the LastTransitionTime of the corresponding conditions.
This is fragile for multiple reasons:
It would already be a good improvement if the following were implemented:
(Note: this is not relevant for NodeDeletionTimeout because for this one we use the deletionTimestamp field)