munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org

plugins/node.d/ipmi_sensor_.in needs cleanup #301

Closed szepeviktor closed 6 years ago

szepeviktor commented 10 years ago

I think munin cannot handle "lower than X" warnings. It can.

Maybe I will be the one doing it.

Current (non-conforming) config example

/etc/munin/ipmi Note: not in plugin-conf.d

rpm = CPU FAN, SYSTEM FAN
volts = System 12V, System 5V, System 3.3V, CPU0 Vcore, System 1.25V, System 1.8V, System 1.2V
degrees_c = CPU0 Dmn 0 Temp

szepeviktor commented 10 years ago

https://github.com/munin-monitoring/munin/blob/devel/plugins/node.d/ipmi_sensor_.in#L255-L256

# TODO add 'fans'
if 'rpm'==unit:
    warn = "%s:%s" % (warn_u,warn_l)
    crit = "%s:%s" % (crit_u,crit_l)
else:
    warn = "%s:%s" % (warn_l,warn_u)
    crit = "%s:%s" % (crit_l,crit_u)

Results

cpu_fan.label CPU FAN
cpu_fan.warning :4859.086
cpu_fan.critical 937.383:4960.317
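
For readers unfamiliar with the notation in these results: munin reads a `warning` or `critical` value of the form `min:max` as the allowed range, and either side may be left empty to mean "no limit". That is how a "lower than X" alert can be expressed (`X:`). A minimal sketch of that check, assuming only the colon form is used; `in_munin_range` is a hypothetical helper for illustration, not part of munin or of this plugin:

```python
def in_munin_range(value, spec):
    """Return True if value lies inside a munin-style 'min:max' range.

    Hypothetical helper: an empty side means "no limit on that side".
    The single-number form without a colon is deliberately not handled.
    """
    if ":" not in spec:
        raise ValueError("expected 'min:max', got %r" % (spec,))
    low, _, high = spec.partition(":")
    if low and value < float(low):
        return False  # below the lower bound: an alert would fire
    if high and value > float(high):
        return False  # above the upper bound: an alert would fire
    return True


if __name__ == "__main__":
    # Range strings taken from the Results above.
    print(in_munin_range(4000.0, "937.383:4960.317"))
    print(in_munin_range(5000.0, ":4859.086"))
```

With this reading, the order of the two numbers matters: the smaller bound has to come first, otherwise no reading can ever fall inside the range.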

steveschnepp commented 10 years ago

... well, only 1 item is done ...

szepeviktor commented 10 years ago

Here is the TODO: https://github.com/munin-monitoring/munin/issues/301#issue-48276062 This could be my first munin plugin-rewrite.

steveschnepp commented 10 years ago

feel free :-)

szepeviktor commented 10 years ago

Please suggest me the best simple python plugin you know.

steveschnepp commented 10 years ago

simple and python... well :grinning:

Just start with one that you use, and doesn't do what you want.

leeclemens commented 9 years ago

It seems both ipmi_ and ipmi_sensor use ipmitool (plugins written in bash and python, respectively). Should they be merged (regardless of language)?

steveschnepp commented 9 years ago

@leeclemens it sounds like a good idea, yes.

The fewer plugins there are, and the more multigraph they are, the merrier.

sumpfralle commented 6 years ago

@leeclemens: are you still interested in merging them?

ghost commented 6 years ago

This change broke my ipmi_sensor_u_rpm by reversing the critical min:max values.

I have a couple of different SuperMicro boards and they report low RPM in the 300s and high in the 20,000 range. I suspect that Viktor Szép has different hardware with a bug somewhere that reports the range incorrectly.

Here's the output of ipmitool sensor get 'FAN 1':

Locating sensor record...
Sensor ID              : FAN 1 (0x41)
 Entity ID             : 29.1
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 4800 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : 300.000
 Lower Critical        : 450.000
 Lower Non-Critical    : 600.000
 Upper Non-Critical    : 18975.000
 Upper Critical        : 19050.000
 Upper Non-Recoverable : 19125.000
 Positive Hysteresis   : 75.000
 Negative Hysteresis   : 75.000
 Assertion Events      : 
 Assertions Enabled    : lcr- lnr- unc+ ucr+ unr+ 
 Deassertions Enabled  : lcr- lnr- unc+ ucr+ unr+ 

Here's the output of munin-run ipmi_sensor_u_rpm config after the patch:

fan_1.critical 19050.000:450.000
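
For comparison, here is a rough sketch of how the thresholds in an `ipmitool sensor get` listing like the one above could be turned into munin ranges with the lower bound first. This is not the plugin's actual code: `parse_thresholds`, `munin_range`, and `fan_limits` are hypothetical names, and the parsing assumes each threshold appears on a line of the form `Name : value`.

```python
# Excerpt of the 'ipmitool sensor get' output quoted in this thread.
SAMPLE = """\
Locating sensor record...
Sensor ID              : FAN 1 (0x41)
 Sensor Reading        : 4800 (+/- 0) RPM
 Lower Critical        : 450.000
 Lower Non-Critical    : 600.000
 Upper Non-Critical    : 18975.000
 Upper Critical        : 19050.000
"""


def parse_thresholds(text):
    """Collect every 'Name : number' line into a dict of floats."""
    thresholds = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line
        try:
            thresholds[key.strip()] = float(value)
        except ValueError:
            continue  # non-numeric field such as the sensor ID or reading
    return thresholds


def munin_range(lower, upper):
    """Format 'lower:upper'; a missing bound becomes an empty side."""
    def fmt(bound):
        return "" if bound is None else "%.3f" % bound
    return "%s:%s" % (fmt(lower), fmt(upper))


def fan_limits(text):
    """Return (warning, critical) range strings, lower bound first."""
    t = parse_thresholds(text)
    warning = munin_range(t.get("Lower Non-Critical"), t.get("Upper Non-Critical"))
    critical = munin_range(t.get("Lower Critical"), t.get("Upper Critical"))
    return warning, critical


if __name__ == "__main__":
    warning, critical = fan_limits(SAMPLE)
    print("fan_1.warning", warning)
    print("fan_1.critical", critical)
```

For the sample above this gives a critical range of `450.000:19050.000`, which is the order seen before the patch on this hardware.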

sumpfralle commented 5 years ago

Indeed the change introduced by the above commit (68eaf22a0e368dbd509c5abad451090e03e6ee9b) seems to be suitable only for a specific (broken) device. For the other devices it erroneously reverses the lower and upper critical value.

See also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914156.

@szepeviktor: maybe a firmware upgrade already fixed the issue for you? Or maybe ipmitool could work around the bad output of your device? I would like to avoid introducing device-specific workarounds into munin's plugins, but rather fix the data sources.