canonical / hardware-observer-operator

A charm to set up Prometheus exporters for IPMI, Redfish and RAID devices from different vendors.
Apache License 2.0

smartctl_devices metric (and others) are often incorrect #283

Open aieri opened 2 months ago

aieri commented 2 months ago

On a standard server with a hardware HBA, the device count is generally incorrect, which in turn makes the missing device count wrong as well. The issue is visible in the default documentation screenshot, which was pulled from a server in our lab that has 4 block devices (thankfully the screenshot itself isn't necessarily wrong: it could be displaying the correct output of a different server).

This is due to https://github.com/prometheus-community/smartctl_exporter/issues/236; we could work around this by performing some level of hardware autodetection in the charm layer and configuring the exporter appropriately. Switching to a different exporter is also potentially an option.
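A rough sketch of what charm-side autodetection could look like, assuming `smartctl --scan-open --json` is available on the host and returns a `devices` list with `name` and `type` fields. The helper names (`parse_scan`, `exporter_args`, `detect_devices`) are hypothetical, and the `name;type` form of the exporter's `--smartctl.device` flag is an assumption that should be checked against the exporter version the charm ships:

```python
import json
import subprocess


def parse_scan(scan_json: str) -> list[tuple[str, str]]:
    """Return (device name, device type) pairs from smartctl scan JSON."""
    data = json.loads(scan_json)
    return [(d["name"], d.get("type", "")) for d in data.get("devices", [])]


def exporter_args(devices: list[tuple[str, str]]) -> list[str]:
    """Build one exporter flag per detected device."""
    args = []
    for name, dev_type in devices:
        # Assumption: the exporter accepts "<name>;<type>" so that devices
        # behind a RAID controller or HBA (e.g. "megaraid,0") are addressed
        # explicitly. Verify against the deployed exporter version.
        value = f"{name};{dev_type}" if dev_type else name
        args.append(f"--smartctl.device={value}")
    return args


def detect_devices() -> list[tuple[str, str]]:
    """Scan the host with smartctl; requires smartmontools and root access."""
    result = subprocess.run(
        ["smartctl", "--scan-open", "--json"],
        capture_output=True,
        text=True,
        check=False,  # smartctl may exit non-zero even when it prints a scan
    )
    return parse_scan(result.stdout)
```

The charm could call `exporter_args(detect_devices())` when rendering the exporter service configuration, so the exporter is told exactly which devices to monitor instead of relying on its own scan.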

aieri commented 10 hours ago

Additionally, I don't think the way we calculate missing devices makes sense: that difference should always be 0. We should consider dropping the panel.