We need to make test_rbd_capacity_workload_alerts less strict. For example, when the test runs on a 150 GB cluster and we fill the storage up to 97%, it fails to catch some of the alert messages, although a deeper analysis of the logs shows that the alerts were in fact reported:
{
"labels":{
"alertname":"CephClusterNearFull",
"container":"mgr",
"endpoint":"http-metrics",
"instance":"10.131.0.27:9283",
"job":"rook-ceph-mgr",
"managedBy":"ocs-storagecluster",
"namespace":"openshift-storage",
"pod":"rook-ceph-mgr-a-7cddf8cc85-zttn6",
"service":"rook-ceph-mgr",
"severity":"warning"
},
"annotations":{
"description":"Storage cluster utilization has crossed 75% and will become read-only at 85%. Free up some space or expand the storage cluster.",
"message":"Storage cluster is nearing full. Data deletion or cluster expansion is required.",
"runbook_url":"https://github.com/openshift/runbooks/blob/master/alerts/openshift-container-storage-operator/CephClusterNearFull.md",
"severity_level":"warning",
"storage_type":"ceph"
},
"state":"firing",
"activeAt":"2024-06-14T00:06:54.442675005Z",
"value":"7.614045242468516e-01"
},
{
"labels":{
"alertname":"CephClusterNearFull",
"container":"mgr",
"endpoint":"http-metrics",
"instance":"10.131.0.27:9283",
"job":"rook-ceph-mgr",
"managedBy":"ocs-storagecluster",
"namespace":"openshift-storage",
"pod":"rook-ceph-mgr-a-7cddf8cc85-zttn6",
"service":"rook-ceph-mgr",
"severity":"warning"
},
"annotations":{
"description":"Storage cluster utilization has crossed 75% and will become read-only at 85%. Free up some space or expand the storage cluster.",
"message":"Storage cluster is nearing full. Data deletion or cluster expansion is required.",
"runbook_url":"https://github.com/openshift/runbooks/blob/master/alerts/openshift-container-storage-operator/CephClusterNearFull.md",
"severity_level":"warning",
"storage_type":"ceph"
},
"state":"pending",
"activeAt":"2024-06-14T00:06:54.442675005Z",
"value":"7.509297976891199e-01"
},
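One possible way to relax the check, sketched here under assumptions: accept an alert as reported if any record with the expected `alertname` shows up in an accepted state, rather than requiring every expected message and state to match exactly. The helper name `alert_was_reported`, its parameters, and the trimmed `records` list are hypothetical and are not taken from the existing test code.

```python
from typing import Iterable, Mapping, Sequence


def alert_was_reported(
    alerts: Iterable[Mapping],
    alertname: str,
    accepted_states: Sequence[str] = ("pending", "firing"),
) -> bool:
    """Return True if any record has the given alertname in an accepted state.

    Hypothetical helper: it only inspects the "labels.alertname" and "state"
    fields that appear in the records quoted above.
    """
    return any(
        alert.get("labels", {}).get("alertname") == alertname
        and alert.get("state") in accepted_states
        for alert in alerts
    )


# Trimmed copies of the two records quoted above (most labels omitted).
records = [
    {
        "labels": {"alertname": "CephClusterNearFull", "severity": "warning"},
        "state": "firing",
        "value": "7.614045242468516e-01",
    },
    {
        "labels": {"alertname": "CephClusterNearFull", "severity": "warning"},
        "state": "pending",
        "value": "7.509297976891199e-01",
    },
]

print(alert_was_reported(records, "CephClusterNearFull"))  # True
print(alert_was_reported(records, "SomeOtherAlert"))       # False (made-up name)
```

Passing `accepted_states=("firing",)` would keep the stricter behaviour for cases where a pending alert should not count.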
https://docs.google.com/document/d/17DL_nFLCA9QtyvtyQpBAxdr2wAb8prOw26LMv8zkNqA/edit?usp=sharing
RP: https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/678/22130/1054731/1054739/log