Create alert for blocked machineconfig updates

As discussed in #823, pending machineconfig updates can be dangerous. A minor change (such as removing a node taint, or a user changing the pod disruption budget for their application) can unblock a pending update and cause an unexpected reboot of one or more cluster nodes.

We need an alert that fires when a machineconfigpool has reported "Updating" for longer than a reasonable amount of time.

The production cluster is in this state right now:

$ kubectl get mcp worker -o custom-columns='NAME:.metadata.name,UPDATING:.status.conditions[?(@.type=="Updating")].status,SINCE:.status.conditions[?(@.type=="Updating")].lastTransitionTime'
NAME     UPDATING   SINCE
worker   True       2024-11-13T15:25:34Z
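
The check such an alert would perform can be sketched as follows. This is only an illustration, not a proposed implementation: it assumes the pool object has been parsed from `kubectl get mcp worker -o json`, and the function names and the one-hour threshold are hypothetical placeholders, since the right threshold has not been decided here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; the issue does not say what "reasonable" should be.
DEFAULT_THRESHOLD = timedelta(hours=1)


def updating_since(mcp: dict):
    """Return the time the pool's Updating condition became True, or None."""
    for cond in mcp.get("status", {}).get("conditions", []):
        if cond.get("type") == "Updating" and cond.get("status") == "True":
            # lastTransitionTime is an RFC 3339 UTC timestamp, e.g. 2024-11-13T15:25:34Z
            ts = datetime.strptime(cond["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ")
            return ts.replace(tzinfo=timezone.utc)
    return None


def is_stuck(mcp: dict, now: datetime, threshold: timedelta = DEFAULT_THRESHOLD) -> bool:
    """True if the pool has been Updating for longer than the threshold."""
    since = updating_since(mcp)
    return since is not None and now - since > threshold


# Pool state copied from the kubectl output above.
worker = {
    "metadata": {"name": "worker"},
    "status": {
        "conditions": [
            {
                "type": "Updating",
                "status": "True",
                "lastTransitionTime": "2024-11-13T15:25:34Z",
            }
        ]
    },
}
```

With the pool state above, `is_stuck(worker, datetime(2024, 11, 13, 17, 0, tzinfo=timezone.utc))` returns `True`, because the pool has been updating for more than the assumed hour.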
