red-hat-storage / ocs-ci

https://ocs-ci.readthedocs.io/en/latest/
MIT License

make test_rbd_capacity_workload_alerts less strict #10059

Open DanielOsypenko opened 3 months ago

DanielOsypenko commented 3 months ago

We need to make test_rbd_capacity_workload_alerts less strict. For example, if the test runs on a 150 GB cluster and we fill the storage up to 97%, we fail to catch some alert messages, although deeper analysis of the logs shows that the alerts were indeed reported:

{
  "labels": {
    "alertname": "CephClusterNearFull",
    "container": "mgr",
    "endpoint": "http-metrics",
    "instance": "10.131.0.27:9283",
    "job": "rook-ceph-mgr",
    "managedBy": "ocs-storagecluster",
    "namespace": "openshift-storage",
    "pod": "rook-ceph-mgr-a-7cddf8cc85-zttn6",
    "service": "rook-ceph-mgr",
    "severity": "warning"
  },
  "annotations": {
    "description": "Storage cluster utilization has crossed 75% and will become read-only at 85%. Free up some space or expand the storage cluster.",
    "message": "Storage cluster is nearing full. Data deletion or cluster expansion is required.",
    "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/openshift-container-storage-operator/CephClusterNearFull.md",
    "severity_level": "warning",
    "storage_type": "ceph"
  },
  "state": "firing",
  "activeAt": "2024-06-14T00:06:54.442675005Z",
  "value": "7.614045242468516e-01"
},
{
  "labels": {
    "alertname": "CephClusterNearFull",
    "container": "mgr",
    "endpoint": "http-metrics",
    "instance": "10.131.0.27:9283",
    "job": "rook-ceph-mgr",
    "managedBy": "ocs-storagecluster",
    "namespace": "openshift-storage",
    "pod": "rook-ceph-mgr-a-7cddf8cc85-zttn6",
    "service": "rook-ceph-mgr",
    "severity": "warning"
  },
  "annotations": {
    "description": "Storage cluster utilization has crossed 75% and will become read-only at 85%. Free up some space or expand the storage cluster.",
    "message": "Storage cluster is nearing full. Data deletion or cluster expansion is required.",
    "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/openshift-container-storage-operator/CephClusterNearFull.md",
    "severity_level": "warning",
    "storage_type": "ceph"
  },
  "state": "pending",
  "activeAt": "2024-06-14T00:06:54.442675005Z",
  "value": "7.509297976891199e-01"
},
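One possible reading of "less strict" is to accept an alert record in either the `pending` or the `firing` state, matched by `alertname` and `severity`, instead of requiring one exact state. The sketch below is only an illustration of that idea: `find_alerts` and `sample_alerts` are hypothetical names, not existing ocs-ci helpers, and the sample records are trimmed copies of the two records above.

```python
from typing import Any, Dict, Iterable, List, Optional

Alert = Dict[str, Any]


def find_alerts(
    alerts: Iterable[Alert],
    alertname: str,
    states: Iterable[str] = ("pending", "firing"),
    severity: Optional[str] = None,
) -> List[Alert]:
    """Return alert records matching the name, any allowed state and,
    if given, the severity label. Hypothetical helper, not ocs-ci API."""
    allowed = set(states)
    found = []
    for alert in alerts:
        labels = alert.get("labels", {})
        if labels.get("alertname") != alertname:
            continue
        if alert.get("state") not in allowed:
            continue
        if severity is not None and labels.get("severity") != severity:
            continue
        found.append(alert)
    return found


# Trimmed copies of the two records quoted in this issue.
sample_alerts = [
    {
        "labels": {"alertname": "CephClusterNearFull", "severity": "warning"},
        "state": "firing",
        "value": "7.614045242468516e-01",
    },
    {
        "labels": {"alertname": "CephClusterNearFull", "severity": "warning"},
        "state": "pending",
        "value": "7.509297976891199e-01",
    },
]

# Relaxed check: both the pending and the firing record count as a match.
relaxed = find_alerts(sample_alerts, "CephClusterNearFull", severity="warning")

# Strict check for comparison: only the firing record matches.
strict = find_alerts(
    sample_alerts, "CephClusterNearFull", states=("firing",), severity="warning"
)
```

With the relaxed default, a test would pass as long as at least one matching record exists in any allowed state. Whether `pending` should count as a pass is a decision for the test owners.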

https://docs.google.com/document/d/17DL_nFLCA9QtyvtyQpBAxdr2wAb8prOw26LMv8zkNqA/edit?usp=sharing


RP: https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/678/22130/1054731/1054739/log

github-actions[bot] commented 4 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.