thomas-krenn / check_ipmi_sensor_v3

Monitoring plugin to check IPMI sensors
https://www.thomas-krenn.com/en/wiki/IPMI_Sensor_Monitoring_Plugin
GNU General Public License v3.0

Unknown iDrac problem #53

Closed: ShowMeYourSkil closed this issue 2 years ago

ShowMeYourSkil commented 2 years ago

Hi everyone. In Icinga I am monitoring an iDRAC server with check_ipmi_sensor.

Since yesterday this check has been in the UNKNOWN state: all CPU cores are reported as unknown, and I get the following error:

ipmi_fru_multirecord_power_supply_information: invalid parameters
ipmi_sdr_cache_iterate: error returned in callback

→ Execution of /usr/sbin/ipmi-fru failed with return code 1.
→ /usr/sbin/ipmi-fru was executed with the following parameters:
/usr/sbin/ipmi-fru -h IP-Address -u SNMP-USER -p SNMP-PASSWORD -l user --driver-type=LAN_2_0 -

I suspect that an expired Dell API key is behind this, but unfortunately I have not found the config file where it would be shown. Can you help me with this problem?

Best regards

veitw commented 2 years ago

Hi,

I am not familiar with iDRAC-type IPMI BMCs, but I am confident that no API key is involved in any IPMI communication, Dell included.

I have seen similar problems with BMCs from different manufacturers. With the exception of one dead BMC, it was always fixable: either the system needed a reboot with a full POST so that the BMC could (re)detect all hardware statuses, or the BMC had become confused and needed a reboot itself.

I'd try the latter first.

Best regards, // Veit
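
For readers who want to try the BMC reboot remotely, here is a minimal sketch using `bmc-device` from FreeIPMI (the same suite that provides `ipmi-fru`). The host address, user name, and password are hypothetical placeholders, and the `admin` privilege level is an assumption, since a BMC reset usually requires an administrator-level account. The function only prints the command so that it can be reviewed before anyone runs it against a production BMC.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with the real BMC address and credentials.
BMC_HOST="${BMC_HOST:-192.0.2.10}"
IPMI_USER="${IPMI_USER:-admin}"
IPMI_PASS="${IPMI_PASS:-secret}"

# Print the FreeIPMI command for a BMC reset.
# $1 = "cold" or "warm"; anything else is rejected with a usage message.
bmc_reset_cmd() {
    case "$1" in
        cold|warm) ;;
        *) echo "usage: bmc_reset_cmd cold|warm" >&2; return 1 ;;
    esac
    printf '%s\n' "bmc-device -h $BMC_HOST -u $IPMI_USER -p $IPMI_PASS -l admin --driver-type=LAN_2_0 --$1-reset"
}

# Show the cold-reset command; copy it to a shell once it looks right.
bmc_reset_cmd cold
```

The BMC is typically unreachable for a few minutes after a reset, so the monitoring check may fail briefly before it recovers.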

ShowMeYourSkil commented 2 years ago

Hi, by now all my Dell servers are affected by this problem. The Icinga 2 status is UNKNOWN; the error is either the one mentioned above or "not found".

gschoenberger commented 2 years ago

Since the problem is a non-zero return code from ipmi-fru, can you try running the plugin without "--fru"?
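
A minimal sketch of that comparison, assuming the plugin is installed under a hypothetical path and using placeholder host and credentials (none of these values come from the report). It prints the plugin command line once with `--fru` and once without, so the two variants can be run by hand and their results compared.

```shell
#!/bin/sh
# Hypothetical values: adjust the plugin path, host, and credentials to your setup.
PLUGIN="${PLUGIN:-/usr/lib/nagios/plugins/check_ipmi_sensor}"
BMC_HOST="${BMC_HOST:-192.0.2.10}"
IPMI_USER="${IPMI_USER:-monitor}"
IPMI_PASS="${IPMI_PASS:-secret}"

# Print the plugin command line, with or without the FRU query.
# $1 = "fru" to append --fru; any other value leaves it out.
plugin_cmd() {
    cmd="$PLUGIN -H $BMC_HOST -U $IPMI_USER -P $IPMI_PASS -L user"
    if [ "$1" = "fru" ]; then
        cmd="$cmd --fru"
    fi
    printf '%s\n' "$cmd"
}

plugin_cmd fru     # variant that triggers the ipmi-fru call
plugin_cmd nofru   # variant to try instead
```

If the second variant returns a normal status, the sensor readings themselves are fine and only the FRU query is failing; in Icinga the same change would be made by dropping `--fru` from the check command definition.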