galexrt / dellhw_exporter

Prometheus exporter for Dell Hardware components using Dell OMSA.
https://dellhw-exporter.galexrt.moe
Apache License 2.0

Running v1.13.12 against iSM #121

Closed p3lim closed 1 month ago

p3lim commented 4 months ago

Hey,

Nice little exporter, tried running it against iSM 5.3.0 / iSM OSC 7.3.0 on iDRAC 9 v7.00.00.00, and here are my findings:

All other metrics seem to work fine! :+1:

eugene-marchanka commented 2 months ago

Running into the same issue with iDRAC v7:

🌵 k exec -it dellhw-exporter-4nbzr -- bash
[root@dellhw-exporter-4nbzr /]# /usr/libexec/instsvcdrv-helper status
/usr/libexec/instsvcdrv-helper: line 519: lsmod: command not found
/usr/libexec/instsvcdrv-helper: line 519: lsmod: command not found
/usr/libexec/instsvcdrv-helper: line 519: lsmod: command not found
/usr/libexec/instsvcdrv-helper: line 519: lsmod: command not found
/usr/libexec/instsvcdrv-helper: line 519: lsmod: command not found
[root@dellhw-exporter-4nbzr /]# lsmod | grep -iE 'dell|dsu'
bash: lsmod: command not found

How can I fix it?

galexrt commented 2 months ago

The "lsmod: command not found" message can be ignored. It comes from the Dell OMSA service startup script, which tries to check whether the necessary kernel modules are loaded or need to be modprobed.

I'll take a look at the dell_hw_chassis_temps metric reporting 0 soon.

galexrt commented 1 month ago

@eugene-marchanka The dell_hw_chassis_temps metric reports the "status" as a number, e.g., 0 = "OK". The status numbers are explained in this section of the docs: https://github.com/galexrt/dellhw_exporter/blob/main/docs/metrics.md#what-do-the-metrics-mean (The docs site is currently broken; I'll look into that soon.)

The chassis_temps_reading metric is the one that reports the actual temperature reading.
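
For anyone who lands here with the same confusion, here is a minimal Python sketch of the distinction, not part of the exporter itself. It parses a few lines of Prometheus text output and pairs each probe's status value with its temperature reading. The sample lines, the label name "component", and the full metric name dell_hw_chassis_temps_reading (with the dell_hw_ prefix) are assumptions made for illustration; only 0 = "OK" is taken from this thread, so any other status value is simply treated as "not OK".

```python
import re

# Metric names: the status metric is named in this thread; the prefixed
# reading name is an assumption for illustration.
STATUS_METRIC = "dell_hw_chassis_temps"
READING_METRIC = "dell_hw_chassis_temps_reading"

# Hypothetical sample of scraped output; labels and values are made up.
SAMPLE = """\
# hypothetical sample, not real exporter output
dell_hw_chassis_temps{component="System Board Inlet Temp"} 0
dell_hw_chassis_temps_reading{component="System Board Inlet Temp"} 23
dell_hw_chassis_temps{component="CPU1 Temp"} 2
dell_hw_chassis_temps_reading{component="CPU1 Temp"} 91
"""

# metric name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)$')


def summarize(text):
    """Pair each probe's status with its reading, keyed by the label string."""
    status, reading = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simple pattern cannot parse
        name = match.group(1)
        labels = match.group(2) or ""
        value = float(match.group(3))
        if name == STATUS_METRIC:
            status[labels] = value
        elif name == READING_METRIC:
            reading[labels] = value
    # A status of 0 means "OK"; see the linked docs for the other values.
    return {
        labels: {"ok": status[labels] == 0, "reading": reading.get(labels)}
        for labels in status
    }


if __name__ == "__main__":
    for labels, info in summarize(SAMPLE).items():
        print(labels, info)
```

The point is that a 0 on dell_hw_chassis_temps means the probe is healthy, not that it reads zero degrees; the temperature itself comes from the separate reading metric.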

eugene-marchanka commented 1 month ago

Thanks @galexrt! I successfully installed dellhw_exporter in our qa/dev environments and am ready to roll it out to prod systems 👍🏻

galexrt commented 1 month ago

Great to hear!

I think the questions/points raised in this issue have been answered, so I'm going to close it.

I might revisit the issue in the future to add the dell_hw_chassis_temps question to a FAQ section.