keikoproj / lifecycle-manager

Graceful AWS scaling event on Kubernetes using lifecycle hooks
Apache License 2.0
94 stars 28 forks

Reimplement retry-interval when draining nodes #188

Open omgrr opened 8 months ago

omgrr commented 8 months ago

This addresses https://github.com/keikoproj/lifecycle-manager/issues/185 by re-adding the retryInterval when draining.

This also re-adds the retry-go package to provide both the retry and the delay functionality. If there was another reason for removing it, let me know!

I tested this on an actual cluster with the drain-interval set to 30 seconds and the drain-timeout set to 120 seconds. The logs show the expected gap between retries: 2 minutes 30 seconds, which is consistent with one 120-second drain timeout followed by the 30-second retry interval.

time="2024-01-12T19:32:06Z" level=info msg="retrying drain, node <node name>"
time="2024-01-12T19:34:36Z" level=info msg="retrying drain, node <node name>"
# 2 minutes 30 seconds between retries