keikoproj / upgrade-manager

Reliable, extensible rolling-upgrades of Autoscaling groups in Kubernetes
Apache License 2.0

Early cordon #405

Closed shreyas-badiger closed 8 months ago

shreyas-badiger commented 9 months ago

Currently, upgrade-manager supports two strategies:

- Eager mode: wait for the replacement nodes first, and only then drain and terminate the previous instances.
- Lazy mode: rotate (drain and terminate) the desired number of nodes without waiting for the replacement nodes.

In both strategies, we cordon only the nodes in the current batch. The batch size is determined by maxUnavailable in the RollingUpgrade CR (default: maxUnavailable=1).

While the upgrade is in progress, the remaining older nodes that are not yet part of a node-rotation batch stay schedulable, so new deployments / pods might be scheduled on them.

These newly scheduled pods could then be restarted yet again when their underlying older nodes come up for rotation. Draining those nodes also takes longer because of the additional new pods.

With the approach in this PR, we cordon all the nodes in the respective instance group (IG) as soon as a RollingUpgrade CR is being processed. New pods will then always be scheduled on newer nodes while an upgrade is in progress.
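To make the difference concrete, here is a minimal, self-contained sketch that contrasts cordoning only the current batch with cordoning every old node up front. It is not the code from this PR: `Node`, `cordonBatch`, `cordonAllOld`, `schedulableOld`, and `sampleNodes` are hypothetical names, and a node is modeled as a plain struct instead of going through the Kubernetes API.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for a Kubernetes node.
type Node struct {
	Name          string
	Old           bool // still running the previous launch configuration
	Unschedulable bool // true once the node has been cordoned
}

// cordonBatch models batch-only cordoning: it cordons at most
// batchSize old nodes that are still schedulable.
func cordonBatch(nodes []Node, batchSize int) {
	cordoned := 0
	for i := range nodes {
		if cordoned == batchSize {
			return
		}
		if nodes[i].Old && !nodes[i].Unschedulable {
			nodes[i].Unschedulable = true
			cordoned++
		}
	}
}

// cordonAllOld models early cordon: every old node is cordoned
// before the first batch is rotated.
func cordonAllOld(nodes []Node) {
	for i := range nodes {
		if nodes[i].Old {
			nodes[i].Unschedulable = true
		}
	}
}

// schedulableOld returns the names of old nodes that could still
// receive newly created pods.
func schedulableOld(nodes []Node) []string {
	var names []string
	for _, n := range nodes {
		if n.Old && !n.Unschedulable {
			names = append(names, n.Name)
		}
	}
	return names
}

// sampleNodes returns a fresh group of three old nodes and one new node.
func sampleNodes() []Node {
	return []Node{
		{Name: "old-1", Old: true},
		{Name: "old-2", Old: true},
		{Name: "old-3", Old: true},
		{Name: "new-1"},
	}
}

func main() {
	batch := sampleNodes()
	cordonBatch(batch, 1) // batch size 1, matching the maxUnavailable default
	fmt.Println("batch cordon, old nodes still schedulable:", schedulableOld(batch))

	early := sampleNodes()
	cordonAllOld(early)
	fmt.Println("early cordon, old nodes still schedulable:", schedulableOld(early))
}
```

With a batch size of 1, two old nodes remain schedulable and can still attract new pods. With early cordon, none do, so new pods can only land on replacement nodes.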

codecov[bot] commented 9 months ago

Codecov Report

Attention: 33 lines in your changes are missing coverage. Please review.

Comparison is base (1201813) 39.09% compared to head (5c6287d) 43.75%.

| Files | Patch % | Lines |
|---|---|---|
| controllers/upgrade.go | 37.77% | 26 Missing and 2 partials :warning: |
| controllers/rollingupgrade_controller.go | 0.00% | 5 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #405      +/-   ##
==========================================
+ Coverage   39.09%   43.75%   +4.65%
==========================================
  Files           7        7
  Lines         931     1104     +173
==========================================
+ Hits          364      483     +119
- Misses        540      575      +35
- Partials       27       46      +19
```

| [Flag](https://app.codecov.io/gh/keikoproj/upgrade-manager/pull/405/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=keikoproj) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/keikoproj/upgrade-manager/pull/405/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=keikoproj) | `43.75% <34.00%> (+4.65%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=keikoproj#carryforward-flags-in-the-pull-request-comment) to find out more.
