prometheus / node_exporter

Exporter for machine metrics
https://prometheus.io/
Apache License 2.0

Feature request: expose ioerr_cnt metrics #2231

Open toshipp opened 2 years ago

toshipp commented 2 years ago

On Linux, the error counter for a block device can be read from /sys/block/*/device/ioerr_cnt, but the current diskstats collector does not expose it.

My use case for this counter is detecting whether an HDD is broken.

SuperQ commented 2 years ago

Thanks. To add this, we first need parsing support for /sys/block/*/device/... in https://github.com/prometheus/procfs.

toshipp commented 2 years ago

Just to confirm: judging by its name, that library handles procfs, but this counter is provided by sysfs. Is it appropriate to implement it there?

SuperQ commented 2 years ago

Correct, the procfs library handles parsing of both procfs and sysfs files.

toshipp commented 2 years ago

Understood.

scottlaird commented 2 years ago

FWIW, I was considering adding this for my own use, and discovered that the bulk of the ioerr_cnt errors were actually caused by smartd requesting things that the device didn't support. So it wasn't as good a signal as I'd hoped for.

I'm planning on collecting SAS link metrics in #2386; this may or may not be helpful for your case.