Open beyang opened 5 years ago
https://cloud.google.com/kubernetes-engine/docs/how-to/maintenance-window
We get to choose a maintenance window, and @ggilmore currently has it set in the SF timezone so that the on-call person is likely to be awake during it.
We should not take these upgrades lightly: they contain important security fixes. So, if we disable them, what is our plan to ensure we stay on top of upgrades and roll them out in a similar timeframe (<24hr)?
The maintenance window is daily, which might be too frequent for us. What do you think about doing it weekly or even monthly? Critical updates will still be applied automatically:
> GKE reserves the right to roll out unplanned, emergency upgrades outside of maintenance windows. Additionally, mandatory upgrades to upgrade from deprecated or outdated software might automatically occur outside of maintenance windows.
Whatever we do, the on-call person must be made aware of the maintenance window (whether the upgrade is manual or automatic), so they are not caught off guard.
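
For reference, here is a rough sketch of what a weekly recurring window could look like. It only builds the `gcloud` command line (it does not run it), using the `--maintenance-window-start`, `--maintenance-window-end`, and `--maintenance-window-recurrence` flags of `gcloud container clusters update`. The cluster name, the days, the 10:00 start, and the fixed UTC-7 offset standing in for SF time are all made up for illustration. GKE also documents minimum availability requirements for maintenance windows, so a single short weekly slot may be rejected; check the linked docs before picking values.

```python
from datetime import datetime, timedelta, timezone


def weekly_window_command(cluster, start, hours, days):
    """Build (but do not run) a gcloud argv for a weekly maintenance window.

    cluster: cluster name (hypothetical in the example below)
    start:   timezone-aware datetime of the first window's start
    hours:   window length in hours
    days:    RFC 5545 day codes, e.g. ["TU", "WE", "TH"]
    """
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    # gcloud accepts RFC 3339 timestamps; normalize to UTC for clarity.
    start_utc = start.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return [
        "gcloud", "container", "clusters", "update", cluster,
        "--maintenance-window-start", start_utc.strftime(fmt),
        "--maintenance-window-end", end_utc.strftime(fmt),
        "--maintenance-window-recurrence",
        "FREQ=WEEKLY;BYDAY=" + ",".join(days),
    ]


# Assumption: a fixed UTC-7 offset approximating SF daylight time.
SF_OFFSET = timezone(timedelta(hours=-7))

cmd = weekly_window_command(
    "example-cluster",                          # hypothetical cluster name
    datetime(2019, 7, 9, 10, 0, tzinfo=SF_OFFSET),
    hours=6,
    days=["TU", "WE", "TH"],
)
print(" ".join(cmd))
```

Because the start is converted to UTC, the window will drift by an hour relative to SF wall-clock time when daylight saving changes, which matters if the goal is to keep it inside on-call waking hours.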
Disable node auto-upgrades for sourcegraph.com and implement a manual upgrade schedule. Auto-upgrades result in downtime (most recently: https://sourcegraph.slack.com/archives/C0J618TTM/p1562100477015600, which resulted in 7 minutes of downtime). This is disruptive to the person on-call and results in a poor experience for users.
@ggilmore can we disable auto-upgrades and move to a manual update schedule?
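
If we go the manual route, the two steps would look roughly like the sketch below: turn off auto-upgrade on a node pool with `gcloud container node-pools update ... --no-enable-autoupgrade`, then upgrade that pool by hand with `gcloud container clusters upgrade ... --node-pool` on whatever schedule we agree on. The helpers only build the commands so they can be reviewed before anyone runs them. The cluster, node pool, and zone names are placeholders, not our real ones.

```python
def disable_auto_upgrade_command(cluster, node_pool, zone):
    """Build a gcloud argv that turns off node auto-upgrade for one pool."""
    return [
        "gcloud", "container", "node-pools", "update", node_pool,
        "--cluster", cluster,
        "--zone", zone,
        "--no-enable-autoupgrade",
    ]


def manual_upgrade_command(cluster, node_pool, zone):
    """Build a gcloud argv that upgrades one node pool on demand.

    Without an explicit version flag, gcloud upgrades the nodes to the
    version the cluster master is running.
    """
    return [
        "gcloud", "container", "clusters", "upgrade", cluster,
        "--node-pool", node_pool,
        "--zone", zone,
    ]


# Placeholder names, for illustration only.
disable_cmd = disable_auto_upgrade_command(
    "example-cluster", "example-pool", "us-central1-f"
)
upgrade_cmd = manual_upgrade_command(
    "example-cluster", "example-pool", "us-central1-f"
)
print(" ".join(disable_cmd))
print(" ".join(upgrade_cmd))
```

Each node pool has its own auto-upgrade setting, so the first command would need to be repeated per pool.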