dangmocrang / check_idrac

A script to monitor Dell iDRAC via SNMP

ValueError: invalid literal for int() with base 10: 'Bad' #37

Open AH34311 opened 6 years ago

AH34311 commented 6 years ago

Hi

When running the latest version of this script (2.2rc4) with a simple "./check_idrac -H x.x.x.x -v2c -c public", the output looks good until the end, where the following error is displayed at the command line:

PS
--PS 1: OK, Volt I/O: 264 V/(N/A) V, Current: 0.1 A, Watt I/O: 1260.0 W/1100 W
--PS 2: OK, Volt I/O: 264 V/(N/A) V, Current: 0.0 A, Watt I/O: 1260.0 W/1100 W
DISK
--PDisk 1 (0:1:0) 1117.25 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [HGST, HDD, S/N: 0EGV1HGF]
--PDisk 2 (0:1:1) 1117.25 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [HGST, HDD, S/N: 0EGV79GF]
--PDisk 3 (0:1:2) 1117.25 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [HGST, HDD, S/N: 0EGV6BXF]
--PDisk 4 (0:1:3) 1117.25 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [HGST, HDD, S/N: 0EGV4X7F]
FAN
--System Board Fan1: 1440 RPM - ENABLED/OK
BATTERY
--System Board CMOS Battery: ENABLED/OK [PRESENCEDETECTED]
--PERC ROMB Battery: ENABLED/OK [PRESENCEDETECTED]
PU
--PU 1: ENABLED/OK, RedundancyStatus: FULL, SystemBoard Pwr Consumption: 84 W
MEM
--Memory 1 (DIMM Socket A1) 8.0 GB/2400 MHz: ENABLED/OK [26, Micron Technology, S/N: 12E5CCF2]
--Memory 2 (DIMM Socket A2) 8.0 GB/2400 MHz: ENABLED/OK [26, Micron Technology, S/N: 12E5CEB0]
--Memory 3 (DIMM Socket A3) 8.0 GB/2400 MHz: ENABLED/OK [26, Micron Technology, S/N: 12E5CE55]
--Memory 4 (DIMM Socket A4) 8.0 GB/2400 MHz: ENABLED/OK [26, Micron Technology, S/N: 12E5CCBA]
VDISK
--VDisk 1 (DATA): OK/ONLINE, RAID-10 (2234.5 GB), BadBlock: 0 [Virtual Disk 0 on RAID Controller in Slot 3]
Traceback (most recent call last):
  File "./check_idrac", line 847, in <module>
    result, tmp_code = PARSER().main()
  File "./check_idrac", line 643, in main
    hw_dict = self.classifier(snmp_data, hw_dict)  # classify data
  File "./check_idrac", line 412, in classifier
    item_order = int(_.split()[0].split('.')[-1])
ValueError: invalid literal for int() with base 10: 'Bad'

I believe that, as a result, Nagios reports a status of WARNING when this check is run through it. When each item is checked individually rather than all at once, the output is OK.
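A likely explanation for the WARNING status, sketched below: an uncaught Python exception makes the interpreter exit with code 1, and the standard Nagios plugin convention maps exit code 1 to WARNING. The helper name `exit_state_of` and the one-line crashing snippet are hypothetical stand-ins for the real script, used only to illustrate the mapping.

```python
import subprocess
import sys

# Standard Nagios plugin return codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def exit_state_of(snippet):
    """Run a Python snippet in a child interpreter and return
    (exit code, Nagios state that this exit code maps to)."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode, NAGIOS_STATES.get(proc.returncode, "UNKNOWN")


# Hypothetical stand-in for the crash: the same ValueError as in the traceback.
print(exit_state_of("int('Bad')"))
# A snippet that finishes normally exits with code 0.
print(exit_state_of("pass"))
```

If this is what happens, the WARNING comes from the crash itself and not from any hardware state the script evaluated.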

dangmocrang commented 6 years ago

Ah... maybe something is wrong in the global call...

asdorsey commented 6 years ago

Also seeing this issue:

# python idrac_2.2rc4 -H 10.181.4.13 -v 2c -c public
PS
--
DISK
--PDisk 1 (0:1:0) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z54QGG] isFailing: 0
--PDisk 2 (0:1:1) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z550ZP] isFailing: 0
--PDisk 3 (0:1:2) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z54PW8] isFailing: 0
--PDisk 4 (0:1:3) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z54HCY] isFailing: 0
--PDisk 5 (0:1:4) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z550PM] isFailing: 0
--PDisk 6 (0:1:5) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z54P2T] isFailing: 0
--PDisk 7 (0:1:6) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: 67NEKEHLFVLC] isFailing: 0
--PDisk 8 (0:1:7) 3725.5 GB: ONLINE, PowerStat: SPUNUP, HotSpare: no [ATA, HDD, S/N: Z1Z5510K] isFailing: 0
FAN
--
BATTERY
--System Board CMOS Battery: ENABLED/OK [PRESENCEDETECTED]
--PERC1 ROMB Battery: ENABLED/OK [PRESENCEDETECTED]
--PERC2 ROMB Battery: ENABLED/OK [0]
PU
--
MEM
--
VDISK
--VDisk 1 (Virtual Disk 0): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 0 on Integrated RAID Controller 1]
--VDisk 2 (Virtual Disk 1): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 1 on Integrated RAID Controller 1]
--VDisk 3 (Virtual Disk 2): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 2 on Integrated RAID Controller 1]
--VDisk 4 (Virtual Disk 3): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 3 on Integrated RAID Controller 1]
--VDisk 5 (Virtual Disk 4): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 4 on Integrated RAID Controller 1]
--VDisk 6 (Virtual Disk 5): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 5 on Integrated RAID Controller 1]
--VDisk 7 (Virtual Disk 6): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 6 on Integrated RAID Controller 1]
--VDisk 8 (Virtual Disk 7): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 7 on Integrated RAID Controller 1]
--VDisk 9 (Virtual Disk 8): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 8 on Integrated RAID Controller 1]
--VDisk 10 (Virtual Disk 9): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 9 on Integrated RAID Controller 1]
--VDisk 11 (Virtual Disk 10): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 10 on Integrated RAID Controller 1]
--VDisk 12 (Virtual Disk 11): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 11 on Integrated RAID Controller 1]
--VDisk 13 (Virtual Disk 12): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 12 on Integrated RAID Controller 1]
--VDisk 14 (Virtual Disk 13): OK/ONLINE, RAID-5 (1843.2 GB), BadBlock: 0 [Virtual Disk 13 on Integrated RAID Controller 1]
--VDisk 15 (Virtual Disk 14): OK/ONLINE, RAID-5 (273.71 GB), BadBlock: 0 [Virtual Disk 14 on Integrated RAID Controller 1]
Traceback (most recent call last):
  File "idrac_2.2rc4", line 848, in <module>
    result, tmp_code = PARSER().main()
  File "idrac_2.2rc4", line 644, in main
    hw_dict = self.classifier(snmp_data, hw_dict)  # classify data
  File "idrac_2.2rc4", line 413, in classifier
    item_order = int(_.split()[0].split('.')[-1])
ValueError: invalid literal for int() with base 10: 'MIB-Dell-10892::systemStateGlobalSystemStatus'

Edit: There was a debug statement directly before the part of the script where the error occurred. I uncommented it, and got some good information:

# python idrac_2.2rc4 -H 10.181.4.13 -v 2c -c public
...
matched: MIB-Dell-10892::systemStateGlobalSystemStatus No Such Object available on this agent at this OID
Traceback (most recent call last):
  File "idrac_2.2rc4", line 848, in <module>
    result, tmp_code = PARSER().main()
  File "idrac_2.2rc4", line 644, in main
    hw_dict = self.classifier(snmp_data, hw_dict)  # classify data
  File "idrac_2.2rc4", line 413, in classifier
    item_order = int(_.split()[0].split('.')[-1])
ValueError: invalid literal for int() with base 10: 'MIB-Dell-10892::systemStateGlobalSystemStatus'

It looks like this may be an issue with the MIB. I will investigate more.
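To make the failure concrete, here is a minimal sketch that applies the expression from line 413 of the traceback to the "matched:" line printed by the debug statement. The function name `extract_order` is made up for illustration, and the "well-formed" sample line (an OID ending in a numeric index followed by a value) is an assumed shape, not output captured from a real device.

```python
def extract_order(line):
    """Same expression as in the traceback: take the first
    whitespace-separated field, then the part after its last '.',
    and convert that to an integer."""
    return int(line.split()[0].split('.')[-1])


# The line reported by the debug statement in this thread.
failing = ("MIB-Dell-10892::systemStateGlobalSystemStatus "
           "No Such Object available on this agent at this OID")

# Hypothetical well-formed line: OID with a trailing numeric index, then a value.
hypothetical_ok = "MIB-Dell-10892::systemStateGlobalSystemStatus.1 3"

print(extract_order(hypothetical_ok))

try:
    extract_order(failing)
except ValueError as exc:
    # The first field contains no '.', so the whole OID name reaches int().
    print("ValueError:", exc)
```

Because the agent answers "No Such Object", the first field has no numeric index suffix, so the entire object name is passed to `int()`.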

asdorsey commented 6 years ago

I managed to work around the issue. The following patch resolves the problem:

--- idrac_2.2rc4        2018-04-23 09:06:36.000000000 +0000
+++ idrac_2.2rc4.new    2018-04-26 16:35:02.758174688 +0000
@@ -410,8 +410,11 @@
         for _ in data:
             if item.search(_):
                 #--debug print 'matched:', _
-                item_order = int(_.split()[0].split('.')[-1])
-                item_info = ' '.join(_.split()[1:])
+                try:
+                    item_order = int(_.split()[0].split('.')[-1])
+                    item_info = ' '.join(_.split()[1:])
+                except ValueError:
+                    continue
                 if self.hardware[2] == 'PS':
                     if 'voltageProbeReading' in _:
                         item_order -= 25  # ps volt starting with number 26

I don't know if this breaks anything else in the script, but it stopped giving that error at the end.
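For reference, the same idea as a standalone sketch outside the script: lines whose index cannot be parsed are skipped, but they are also collected so that nothing is dropped silently. The names `parse_item` and `classify` and the sample lines are hypothetical and are not taken from check_idrac; this is one possible shape under those assumptions, not the script's actual classifier.

```python
def parse_item(line):
    """Return (item_order, item_info) for a well-formed line,
    or None if the line has no numeric index to parse."""
    fields = line.split()
    if not fields:
        return None
    try:
        item_order = int(fields[0].split('.')[-1])
    except ValueError:
        return None
    item_info = ' '.join(fields[1:])
    return item_order, item_info


def classify(lines):
    """Split lines into parsed items and skipped (unparseable) lines."""
    parsed, skipped = [], []
    for line in lines:
        item = parse_item(line)
        if item is None:
            skipped.append(line)
        else:
            parsed.append(item)
    return parsed, skipped


# Hypothetical input: one well-formed line and the line from this thread.
sample = [
    "MIB-Dell-10892::systemStateGlobalSystemStatus.1 3",
    "MIB-Dell-10892::systemStateGlobalSystemStatus "
    "No Such Object available on this agent at this OID",
]
parsed, skipped = classify(sample)
print(parsed)
print(len(skipped), "line(s) skipped")
```

Keeping the skipped lines around makes it possible to log them or report UNKNOWN for a missing OID, which may be safer than ignoring them entirely.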