canonical / hardware-observer-operator

A charm to set up Prometheus exporters for IPMI, Redfish and RAID devices from different vendors.
Apache License 2.0

Remove Redfish HealthNotAvailable Alerts #184

Closed sudeephb closed 4 months ago

sudeephb commented 4 months ago

These alerts fire often and have no resolution. In some cases, sensors are expected not to have health info. In other cases, where having the health info would make sense, there is nothing the operator can do if Redfish itself reports N/A instead of the Health info. So, the alert cannot be resolved.

Closes: #165

sudeephb commented 4 months ago

LGTM, but just to double-check: are all of these alerts either not useful or creating false alarms?

Sort of both. Issue #165 reports false positives. And since we get no additional info from Redfish (which is what caused HealthNotAvailable to fire in the first place), there's no way to 'fix' these alerts. When health info is not available, the alert fires whether the actual health is good or bad; we just don't know which, because Redfish didn't give us the health info. In this sense, these alerts are not very useful.
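
To illustrate the point, here is a rough sketch in Python. It is a toy model only, not the charm's alert rule or the exporter's code: the names `reported_health` and `health_not_available_fires` and the `"N/A"` constant are hypothetical and were made up for this example. It shows that an alert keyed on missing health info fires identically for healthy and unhealthy hardware.

```python
# Hypothetical placeholder for "Redfish did not provide a Health value".
NOT_AVAILABLE = "N/A"


def reported_health(actual_health: str, redfish_provides_health: bool) -> str:
    """Return what a collector would see.

    The actual health is hidden whenever Redfish does not report it.
    """
    return actual_health if redfish_provides_health else NOT_AVAILABLE


def health_not_available_fires(reported: str) -> bool:
    """Hypothetical stand-in for a HealthNotAvailable-style alert condition."""
    return reported == NOT_AVAILABLE


if __name__ == "__main__":
    for actual in ("OK", "Critical"):
        seen = reported_health(actual, redfish_provides_health=False)
        print(f"actual={actual} reported={seen} "
              f"alert_fires={health_not_available_fires(seen)}")
```

In this model the alert fires in both cases, so it carries no information about the actual state, and nothing on the operator side can clear it.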