thomas-krenn / check_ipmi_sensor_v3

Monitoring plugin to check IPMI sensors
https://www.thomas-krenn.com/en/wiki/IPMI_Sensor_Monitoring_Plugin
GNU General Public License v3.0

Cannot pass extra options to ipmi-sel #27

Closed aieri closed 5 years ago

aieri commented 5 years ago

I have a server with a bunch of entries in the SEL that ipmi-sel cannot display properly unless I use the --assume-system-event-records option. Compare:

$ ipmi-sel
ID | Date        | Time     | Name            | Type         | Event
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
15 | Oct-18-2018 | 09:04:54 | Sensor #8       | Power Supply | Power Supply Failure detected
16 | Oct-18-2018 | 09:04:54 | Sensor #10      | Power Supply | Power Supply Failure detected
$ ipmi-sel --assume-system-event-records
ID | Date        | Time     | Name            | Type         | Event
1  | Sep-21-2013 | 12:43:56 | Sensor #8       | Power Supply | Power Supply Failure detected
2  | Sep-21-2013 | 12:43:56 | Sensor #10      | Power Supply | Power Supply Failure detected
3  | Nov-21-2013 | 15:22:53 | Sensor #7       | Power Supply | Power Supply Failure detected
4  | Nov-21-2013 | 15:22:53 | Sensor #9       | Power Supply | Power Supply Failure detected
5  | Nov-23-2013 | 13:14:08 | Sensor #7       | Power Supply | Power Supply Failure detected
6  | Nov-23-2013 | 13:14:08 | Sensor #9       | Power Supply | Power Supply Failure detected
7  | Aug-11-2014 | 12:23:51 | Sensor #7       | Power Supply | Power Supply Failure detected
8  | Aug-11-2014 | 12:23:51 | Sensor #9       | Power Supply | Power Supply Failure detected
9  | Jan-19-2015 | 09:35:59 | Sensor #8       | Power Supply | Power Supply Failure detected
10 | Jan-19-2015 | 09:35:59 | Sensor #10      | Power Supply | Power Supply Failure detected
11 | Jan-20-2015 | 08:59:55 | Sensor #7       | Power Supply | Power Supply Failure detected
12 | Jan-20-2015 | 08:59:55 | Sensor #9       | Power Supply | Power Supply Failure detected
13 | Feb-01-2017 | 20:46:22 | Sensor #8       | Power Supply | Power Supply Failure detected
14 | Feb-01-2017 | 20:46:22 | Sensor #10      | Power Supply | Power Supply Failure detected
15 | Oct-18-2018 | 09:04:54 | Sensor #8       | Power Supply | Power Supply Failure detected
16 | Oct-18-2018 | 09:04:54 | Sensor #10      | Power Supply | Power Supply Failure detected

Those "Unknown SEL Record Type" lines in turn cause Perl warnings in the check script:

$ check_ipmi_sensor --selonly
SEL Status: Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 981.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Use of uninitialized value in string ne at ./check_ipmi_sensor line 978.
Critical [16 system event log (SEL) entries present] | 'Current Power'=244
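
To illustrate the failure mode, here is a minimal Python sketch of a tolerant parser for the pipe-separated ipmi-sel output shown above. It is not the plugin's Perl code, and the function name `parse_sel_output` is made up for this example. It assumes that a valid entry has the same number of columns as the header line, and it sets aside any line that does not (such as "Unknown SEL Record Type: 0h") instead of reading fields that are not there.

```python
def parse_sel_output(text):
    """Split ipmi-sel output into parsed entries and unparsable lines.

    Hypothetical helper for illustration only; the column layout is
    taken from the header line of the output itself.
    """
    entries, skipped = [], []
    header = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = [f.strip() for f in line.split("|")]
        if header is None:
            # The first line starting with "ID" defines the columns.
            if fields[0] == "ID":
                header = fields
            else:
                skipped.append(line)
            continue
        if len(fields) != len(header):
            # e.g. "Unknown SEL Record Type: 0h" has no "|" separators
            skipped.append(line)
            continue
        entries.append(dict(zip(header, fields)))
    return entries, skipped


sample = """\
ID | Date        | Time     | Name            | Type         | Event
Unknown SEL Record Type: 0h
Unknown SEL Record Type: 0h
15 | Oct-18-2018 | 09:04:54 | Sensor #8       | Power Supply | Power Supply Failure detected
16 | Oct-18-2018 | 09:04:54 | Sensor #10      | Power Supply | Power Supply Failure detected
"""

entries, skipped = parse_sel_output(sample)
print(len(entries), "entries,", len(skipped), "unparsable lines")
```

A parser like this would report the two readable entries and keep the unreadable lines for a separate warning, rather than emitting one "uninitialized value" message per bad line.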

Option -O does work, but it only passes options to ipmi-sensors. I suggest adding independent 'extra-flags' options for both ipmi-sensors and ipmi-sel.
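
As a rough sketch of that suggestion, the Python snippet below keeps one extra-flags string per tool and appends each only to its own command line. The function `build_commands` and its parameters `sensor_flags` and `sel_flags` are hypothetical names for this example, not part of the plugin. The only real names are the two FreeIPMI commands and the --assume-system-event-records flag used above.

```python
import shlex


def build_commands(sensor_flags="", sel_flags=""):
    """Return separate argument lists for ipmi-sensors and ipmi-sel.

    Hypothetical helper: each extra-flags string is split with shell
    quoting rules and added only to the command it belongs to.
    """
    sensors_cmd = ["ipmi-sensors"] + shlex.split(sensor_flags)
    sel_cmd = ["ipmi-sel"] + shlex.split(sel_flags)
    return sensors_cmd, sel_cmd


sensors_cmd, sel_cmd = build_commands(
    sel_flags="--assume-system-event-records"
)
print(sensors_cmd)
print(sel_cmd)
```

Keeping the two strings apart means a flag that only ipmi-sel understands never reaches ipmi-sensors, which is the problem with routing everything through a single option.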

gschoenberger commented 5 years ago

Can you test commit 8273dbce4c0c85174a7a9c2cae3ee3423584fab8? Please use: --seloptions '--assume-system-event-records'

aieri commented 5 years ago

Wow that was fast! Thanks!

The new commit does work:

$ check_ipmi_sensor --seloptions --assume-system-event-records
IPMI Status: Critical [16 system event log (SEL) entries present] | 'Current Power'=244 'Temp 1'=34.00;;~:42.00 'Temp 2 (CPU 1)'=52.00;;~:81.00 'Temp 3 (CPU 2)'=43.00;;~:81.00 'Temp 4 (MemD1)'=61.00;;~:87.00 'Temp 5 (MemD2)'=56.00;;~:87.00 'Temp 7 (IIOH)'=70.00;;~:105.00 'Temp 8 (PCIR)'=51.00;;~:85.00 'Temp 9 (PCIR)'=49.00;;~:85.00 'Temp 10 (PCIR)'=40.00;;~:70.00 'Temp 11 (PCIR)'=42.00;;~:65.00 'Temp 12 (PCIR)'=50.00;;~:75.00 'Temp 13 (PCIR)'=53.00;;~:87.00 'Temp 14 (PCIR)'=43.00;;~:65.00 'Temp 15 (IOH2)'=0.00;;~:105.00

I have, however, added a couple of inline comments on the commit itself, as there are some typos in the documentation.

gschoenberger commented 5 years ago

Thanks for the feedback! Fixing the typos with 268eb36f0c20b060fd04f4c9d1ff1fcd17d3269f.

Cheers, Georg