openshift / svt

Apache License 2.0

check_operators checks operator progressing or unavailable or degraded #712

Closed qiliRedHat closed 2 years ago

qiliRedHat commented 2 years ago

This fixes the false "Operator degraded" error notification that is sent when an operator is actually progressing, for example:

[2022-05-06 09:50:09 UTC] @mffiedler, :boom: Operator degraded: kube-apiserver True True False NodeInstallerProgressing:

qiliRedHat commented 2 years ago

Tested with 'oc scale --replicas=0 -n openshift-ingress deploy/router-default'

Slack output, info for a progressing operator:

[2022-05-07 18:22:10 CST] @qili, Operator progressing: ingress 4.11.0-0.nightly-2022-05-06-180112 False True True 22s The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.)

Error for unavailable operators:

[2022-05-07 18:22:12 CST] @qili, :boom: Operator unavailable or degraded:
authentication 4.11.0-0.nightly-2022-05-06-180112 False False False 65s OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.qili-411-sdn.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
console 4.11.0-0.nightly-2022-05-06-180112 False False False 63s RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.qili-411-sdn.qe.devcluster.openshift.com/): Get "https://console-openshift-console.apps.qili-411-sdn.qe.devcluster.openshift.com/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
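
For readers following the thread, the distinction shown in the two outputs above can be sketched roughly as follows. This is not the code from this PR; it is a minimal illustration that assumes each row follows the `oc get clusteroperators` column order (NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE) and that a progressing operator is reported as info even when it is also unavailable or degraded, as the ingress example suggests. The names `OperatorStatus`, `parse_row`, and `classify` are hypothetical.

```python
from typing import NamedTuple


class OperatorStatus(NamedTuple):
    """Hypothetical container for the three status columns of one operator."""
    name: str
    available: bool
    progressing: bool
    degraded: bool


def parse_row(row: str) -> OperatorStatus:
    """Parse one data row, assuming the column order
    NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE."""
    fields = row.split()
    name, _version, available, progressing, degraded = fields[:5]
    return OperatorStatus(
        name,
        available == "True",
        progressing == "True",
        degraded == "True",
    )


def classify(status: OperatorStatus) -> str:
    """Assumed rule: progressing wins (info), otherwise an unavailable
    or degraded operator is an error, otherwise it is healthy."""
    if status.progressing:
        return "progressing"
    if not status.available or status.degraded:
        return "unavailable or degraded"
    return "healthy"


if __name__ == "__main__":
    rows = [
        "ingress 4.11.0-0.nightly-2022-05-06-180112 False True True 22s ...",
        "authentication 4.11.0-0.nightly-2022-05-06-180112 False False False 65s ...",
    ]
    for row in rows:
        status = parse_row(row)
        print(f"{status.name}: {classify(status)}")
```

Under this assumed rule, the kube-apiserver case from the description (Available=True, Progressing=True, Degraded=False) would be reported as progressing rather than degraded.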

qiliRedHat commented 2 years ago

@mffiedler PTAL

mffiedler commented 2 years ago

Nice - good additional health check.