BCDevOps / OpenShift4-RollOut

This is the primary board for all activities related to the roll out of OpenShift 4
Apache License 2.0

OCP4.6 - Test new version alerts #446

Closed StevenBarre closed 3 years ago

StevenBarre commented 3 years ago

Describe the issue

OCP4.6 now generates alerts when new versions are available in the channel your cluster is on. We need to test how these alerts work and who gets notified. We don't want after-hours "critical" alerts paging on-call when a new version is released to the stable channel.

Which Sprint Goal is this issue related to?

Additional context

This needs to be done AFTER the upgrade, once a new release becomes available in the channel.

Definition of done Checklist (where applicable)

StevenBarre commented 3 years ago

Subject: [FIRING:1] UpdateAvailable 1.2.3.4:9099 (stable-4.6 metrics cluster-version-operator openshift-cluster-version cluster-version-operator-748d5ddbbf-9tg94 openshift-monitoring/k8s cluster-version-operator info https://api.openshift.com/api/upgrades_...

[1] Firing 
Labels
alertname = UpdateAvailable
channel = stable-4.6
endpoint = metrics
instance = 1.2.3.4:9099
job = cluster-version-operator
namespace = openshift-cluster-version
pod = cluster-version-operator-748d5ddbbf-9tg94
prometheus = openshift-monitoring/k8s
service = cluster-version-operator
severity = info
upstream = https://api.openshift.com/api/upgrades_info/v1/graph
Annotations
message = Your upstream update recommendation service recommends you update your cluster. For more information refer to 'oc adm upgrade' or https://console.apps.klab.devops.gov.bc.ca/settings/cluster/.

Sent by AlertManager

StevenBarre commented 3 years ago

Abandoning the 4.6 upgrade for now to focus on SDN issues. Moving to backlog.

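This was solved with a sub-route under the warning/info route, but the change was never committed to git. The sub-route matches the UpdateAvailable alert and raises its repeat_interval to one week: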

    - receiver: Warning
      match_re:
        severity: warning|info
      routes:
        - match:
            alertname: UpdateAvailable
          repeat_interval: 1w
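
To illustrate why the sub-route keeps the Warning receiver while only changing the repeat interval, here is a minimal Python sketch of route resolution. It is a simplified model, not Alertmanager's actual code: it assumes the first matching child route wins (no `continue: true`), that unset settings are inherited from the parent, and that `match_re` patterns are fully anchored. The `Default` root receiver, the `4h` parent interval, and the `KubePodCrashLooping` sample alert are assumptions for the example and do not come from the cluster's real config.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Route:
    """Simplified stand-in for an Alertmanager route node (illustrative only)."""
    receiver: Optional[str] = None
    repeat_interval: Optional[str] = None
    match: dict = field(default_factory=dict)     # exact label matches
    match_re: dict = field(default_factory=dict)  # anchored regex label matches
    routes: list = field(default_factory=list)    # child routes, checked in order

    def matches(self, labels: dict) -> bool:
        exact = all(labels.get(k) == v for k, v in self.match.items())
        regex = all(re.fullmatch(v, labels.get(k, "")) for k, v in self.match_re.items())
        return exact and regex


def resolve(route, labels, receiver=None, repeat_interval=None):
    """Return (receiver, repeat_interval) of the deepest matching route.

    Settings that a child route leaves unset are inherited from its parent.
    """
    receiver = route.receiver or receiver
    repeat_interval = route.repeat_interval or repeat_interval
    for child in route.routes:
        if child.matches(labels):
            return resolve(child, labels, receiver, repeat_interval)
    return receiver, repeat_interval


root = Route(
    receiver="Default",     # hypothetical root receiver
    repeat_interval="4h",   # assumed parent interval
    routes=[
        Route(
            receiver="Warning",
            match_re={"severity": "warning|info"},
            routes=[
                # The sub-route from this comment: same receiver, weekly repeat.
                Route(match={"alertname": "UpdateAvailable"}, repeat_interval="1w"),
            ],
        )
    ],
)

update = {"alertname": "UpdateAvailable", "severity": "info"}
other = {"alertname": "KubePodCrashLooping", "severity": "warning"}  # sample alert

print(resolve(root, update))  # ('Warning', '1w')
print(resolve(root, other))   # ('Warning', '4h')
```

Under these assumptions, UpdateAvailable still goes to the Warning receiver but is re-sent at most weekly, while other warning/info alerts keep the parent's interval.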

StevenBarre commented 3 years ago

Added to https://github.com/bcgov-c/platform-ops/pull/351/commits/bdeaf427305bada52fd0907d6371be51d424a697

StevenBarre commented 3 years ago

The updated config will be rolled out during quarterly patching.