canonical / hardware-observer-operator

A charm to set up Prometheus exporters for IPMI, Redfish and RAID devices from different vendors.
Apache License 2.0

Add basic alert rules based on smart metrics #238

Closed. sudeephb closed 4 months ago

dashmage commented 4 months ago

I've made some changes and added new alerts for the attributes and more. Would love any suggestions or corrections to wrap up this PR :smile:

@sudeephb @aieri @Pjack

dashmage commented 4 months ago

I've pushed a commit adding more fine-grained alert rules based on the exit status codes and critical warning attribute for NVMe devices.
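
For readers unfamiliar with the two signals mentioned above, here is a minimal sketch of how they can be decoded. It is not the code or the alert rules from this PR. The bit meanings follow the smartctl man page (exit status) and the NVMe SMART / Health log (critical warning); the helper name `decode_bits` and both table names are hypothetical, chosen only for illustration.

```python
# Illustrative only: not taken from this PR or from the exporter's source.

# smartctl exit status is a bitmask (see "EXIT STATUS" in the smartctl man page).
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "SMART or ATA command failed, or checksum error",
    3: "SMART status reports DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "self-test log contains errors",
}

# NVMe "critical warning" is also a bitmask (SMART / Health Information log).
NVME_CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature outside threshold",
    2: "reliability degraded",
    3: "media in read-only mode",
    4: "volatile memory backup failed",
}


def decode_bits(value, table):
    """Return the descriptions of every bit in `table` that is set in `value`."""
    return [desc for bit, desc in sorted(table.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # Hypothetical sample values, not real device output.
    print(decode_bits(0b0100_1000, SMARTCTL_EXIT_BITS))
    print(decode_bits(0b0000_0101, NVME_CRITICAL_WARNING_BITS))
```

Because each bit carries its own meaning, one alert per bit (rather than a single "non-zero" alert) is what makes rules on these values fine-grained.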

dashmage commented 4 months ago

Apart from the tunable alert thresholds, I hope everything else is addressed now @aieri.

dashmage commented 4 months ago

Changes were migrated over to https://github.com/canonical/hardware-observer-operator/pull/244 due to merging troubles with this PR. Since the other PR is merged, will go ahead and close this.