BaldMansMojo / check_vmware_esx

check_vmware_esx Fork of check_vmware_api.pl
GNU General Public License v2.0

--ignore_unknown not working as intended #178

Closed aendieh closed 2 years ago

aendieh commented 4 years ago

Hey there guys, we just rolled out your script, and for most servers it is a good change. Some servers still report unknowns for whatever reason, so I was hoping --ignore_unknown would ignore these states, but it does not :\

Peculiarly it always reports 58 sensors as unknown, never more, never less.

Command:

'/usr/lib/nagios/plugins_bits/check_vmware_esx' '-H' '' '-S' 'runtime' '-p' '' '-t' '10' '-u' '' '-s' 'health' '--ignore_unknown'

UNKNOWN - 58 health issue(s) found in 67 checks:
1) UNKNOWN[storage] Status of Disk or Disk Bay 6 C1 P2I Bay 6 --- 0.4.6.64: Cannot report on the current health state of the element
2) UNKNOWN[storage] Status of Disk or Disk Bay 5 C1 P2I Bay 5 --- 0.4.5.63: Cannot report on the current health state of the element
3) UNKNOWN[storage] Status of Disk or Disk Bay 4 C1 P1I Bay 4 --- 0.4.4.62: Cannot report on the current health state of the element
4) UNKNOWN[storage] Status of Disk or Disk Bay 3 C1 P1I Bay 3 --- 0.4.3.61: Cannot report on the current health state of the element
5) UNKNOWN[storage] Status of Disk or Disk Bay 2 C1 P1I Bay 2 --- 0.4.2.60: Cannot report on the current health state of the element
6) UNKNOWN[storage] Status of Disk or Disk Bay 1 C1 P1I Bay 1 --- 0.4.1.59: Cannot report on the current health state of the element
7) UNKNOWN[systemBoard] Status of System Board 10 Power Meter --- 0.7.10.55: Cannot report on the current health state of the element
8) UNKNOWN[power] Status of Power Supply 3 Power Supplies --- 0.10.3.56: Cannot report on the current health state of the element
9) UNKNOWN[power] Status of Power Supply 2 Power Supply 2 --- 0.10.2.54: Cannot report on the current health state of the element
10) UNKNOWN[power] Status of Power Supply 1 Power Supply 1 --- 0.10.1.53: Cannot report on the current health state of the element
11) UNKNOWN[memory] Status of System Board 11 Memory --- 0.7.11.58: Cannot report on the current health state of the element
12) UNKNOWN[fan] Status of System Board 9 Fans --- 0.7.9.57: Cannot report on the current health state of the element
13) UNKNOWN[fan] Status of System Board 8 Fan Block 8 --- 0.7.8.52: Cannot report on the current health state of the element
14) UNKNOWN[fan] Status of System Board 7 Fan Block 7 --- 0.7.7.51: Cannot report on the current health state of the element
15) UNKNOWN[fan] Status of System Board 6 Fan Block 6 --- 0.7.6.50: Cannot report on the current health state of the element
16) UNKNOWN[fan] Status of System Board 5 Fan Block 5 --- 0.7.5.49: Cannot report on the current health state of the element
17) UNKNOWN[fan] Status of System Board 4 Fan Block 4 --- 0.7.4.48: Cannot report on the current health state of the element
18) UNKNOWN[fan] Status of System Board 3 Fan Block 3 --- 0.7.3.47: Cannot report on the current health state of the element
19) UNKNOWN[fan] Status of System Board 2 Fan Block 2 --- 0.7.2.46: Cannot report on the current health state of the element
20) UNKNOWN[fan] Status of System Board 1 Fan Block 1 --- 0.7.1.45: Cannot report on the current health state of the element
21) UNKNOWN[temperature] Status of Battery 1 42-SuperCAP Max --- 0.40.1.44: Cannot report on the current health state of the element
22) UNKNOWN[temperature] Status of Other 12 41-Sys Exhaust --- 0.66.12.43: Cannot report on the current health state of the element
23) UNKNOWN[temperature] Status of Other 11 40-Sys Exhaust --- 0.66.11.42: Cannot report on the current health state of the element
24) UNKNOWN[temperature] Status of Other 10 39-Sys Exhaust --- 0.66.10.41: Cannot report on the current health state of the element
25) UNKNOWN[temperature] Status of Other 9 38-System Board --- 0.66.9.40: Cannot report on the current health state of the element
26) UNKNOWN[temperature] Status of Other 8 37-System Board --- 0.66.8.39: Cannot report on the current health state of the element
27) UNKNOWN[temperature] Status of Other 7 36-PCI 2 Zone --- 0.66.7.38: Cannot report on the current health state of the element
28) UNKNOWN[temperature] Status of Add-in Card 3 35-LOM Card --- 0.11.3.37: Cannot report on the current health state of the element
29) UNKNOWN[temperature] Status of Other 6 34-PCI 1 Zone --- 0.66.6.36: Cannot report on the current health state of the element
30) UNKNOWN[temperature] Status of Other 5 33-PCI 1 Zone --- 0.66.5.35: Cannot report on the current health state of the element
31) UNKNOWN[temperature] Status of Other 4 32-HD Cntlr Zone --- 0.66.4.34: Cannot report on the current health state of the element
32) UNKNOWN[temperature] Status of Other 3 31-HD Controller --- 0.66.3.33: Cannot report on the current health state of the element
33) UNKNOWN[temperature] Status of Power Domain 10 30-VR P2Mem Zone --- 0.19.10.32: Cannot report on the current health state of the element
34) UNKNOWN[temperature] Status of Power Domain 9 29-VR P2Mem Zone --- 0.19.9.31: Cannot report on the current health state of the element
35) UNKNOWN[temperature] Status of Power Domain 8 28-VR P1Mem Zone --- 0.19.8.30: Cannot report on the current health state of the element
36) UNKNOWN[temperature] Status of Power Domain 7 27-VR P1Mem Zone --- 0.19.7.29: Cannot report on the current health state of the element
37) UNKNOWN[temperature] Status of Power Domain 6 26-VR P2 Mem --- 0.19.6.28: Cannot report on the current health state of the element
38) UNKNOWN[temperature] Status of Power Domain 5 25-VR P2 Mem --- 0.19.5.27: Cannot report on the current health state of the element
39) UNKNOWN[temperature] Status of Power Domain 4 24-VR P1 Mem --- 0.19.4.26: Cannot report on the current health state of the element
40) UNKNOWN[temperature] Status of Power Domain 3 23-VR P1 Mem --- 0.19.3.25: Cannot report on the current health state of the element
41) UNKNOWN[temperature] Status of Power Domain 2 22-VR P2 --- 0.19.2.24: Cannot report on the current health state of the element
42) UNKNOWN[temperature] Status of Power Domain 1 21-VR P1 --- 0.19.1.23: Cannot report on the current health state of the element
43) UNKNOWN[temperature] Status of Power Supply 7 18-P/S 2 Zone --- 0.10.7.20: Cannot report on the current health state of the element
44) UNKNOWN[temperature] Status of Power Supply 6 17-P/S 2 Inlet --- 0.10.6.19: Cannot report on the current health state of the element
45) UNKNOWN[temperature] Status of Power Supply 5 16-P/S 1 Zone --- 0.10.5.18: Cannot report on the current health state of the element
46) UNKNOWN[temperature] Status of Power Supply 4 15-P/S 1 Inlet --- 0.10.4.17: Cannot report on the current health state of the element
47) UNKNOWN[temperature] Status of Other 2 14-Chipset1 Zone --- 0.66.2.16: Cannot report on the current health state of the element
48) UNKNOWN[temperature] Status of Other 1 13-Chipset 1 --- 0.66.1.15: Cannot report on the current health state of the element
49) UNKNOWN[temperature] Status of Disk or Disk Bay 1 12-HD Max --- 0.4.1.14: Cannot report on the current health state of the element
50) UNKNOWN[temperature] Status of Memory Device 8 11-P2 Mem Zone --- 0.32.8.13: Cannot report on the current health state of the element
51) UNKNOWN[temperature] Status of Memory Device 7 10-P2 Mem Zone --- 0.32.7.12: Cannot report on the current health state of the element
52) UNKNOWN[temperature] Status of Memory Device 6 09-P1 Mem Zone --- 0.32.6.11: Cannot report on the current health state of the element
53) UNKNOWN[temperature] Status of Memory Device 5 08-P1 Mem Zone --- 0.32.5.10: Cannot report on the current health state of the element
54) UNKNOWN[temperature] Status of Memory Device 4 07-P2 DIMM 7-12 --- 0.32.4.9: Cannot report on the current health state of the element
55) UNKNOWN[temperature] Status of Memory Device 2 05-P1 DIMM 7-12 --- 0.32.2.7: Cannot report on the current health state of the element
56) UNKNOWN[temperature] Status of Other 2 03-CPU 2 --- 0.65.2.5: Cannot report on the current health state of the element
57) UNKNOWN[temperature] Status of Other 1 02-CPU 1 --- 0.65.1.4: Cannot report on the current health state of the element
58) UNKNOWN[temperature] Status of Other 1 01-Inlet Ambient --- 0.64.1.3: Cannot report on the current health state of the element
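
To see which subsystems the 58 entries belong to, the plugin output can be tallied per category. The following is a minimal sketch that assumes every item has the shape `<n>) <STATE>[<category>] ...` as in the output above; `tally_by_category` and `sample` are hypothetical names used only for illustration and are not part of check_vmware_esx.

```python
import re
from collections import Counter

# Assumed item shape (taken from the output above): "<n>) <STATE>[<category>] ..."
ITEM_RE = re.compile(r"\d+\)\s+([A-Z]+)\[([^\]]+)\]")


def tally_by_category(output: str) -> Counter:
    """Count reported items per (state, category) pair."""
    return Counter(ITEM_RE.findall(output))


# Shortened sample in the same format as the plugin output.
sample = (
    "UNKNOWN - 3 health issue(s) found in 67 checks: "
    "1) UNKNOWN[storage] Status of Disk or Disk Bay 6 C1 P2I Bay 6 --- 0.4.6.64: "
    "Cannot report on the current health state of the element "
    "2) UNKNOWN[fan] Status of System Board 9 Fans --- 0.7.9.57: "
    "Cannot report on the current health state of the element "
    "3) UNKNOWN[fan] Status of System Board 8 Fan Block 8 --- 0.7.8.52: "
    "Cannot report on the current health state of the element"
)

for (state, category), count in tally_by_category(sample).most_common():
    print(f"{state}[{category}]: {count}")
```

Run over the full output above, the 58 entries split into 38 temperature, 9 fan, 6 storage, 3 power, 1 memory and 1 systemBoard items.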

aendieh commented 4 years ago

System Information
-----------------------------+--------------------------
Server Name                  | ---
Product Name                 | ProLiant DL360p Gen8
Server Serial Number         | ---

Status
-----------------------------+--------------------------
System Health                | OK
iLO Health                   | OK

Subsystems and Devices Status
-----------------------------+--------------------------
Agentless Management Service | OK
BIOS/Hardware Health         | OK
Fan Redundancy               | Redundant
Fans                         | OK
Memory                       | OK
Network                      | OK
Power Status                 | Redundant
Power Supplies               | OK
Processors                   | OK
Storage                      | OK
Temperatures                 | OK

BaldMansMojo commented 2 years ago

Read the docs more carefully. This option means the check ignores exit code 3 (Unknown). It does not suppress messages containing the string "unknown". Add a blacklist for that; that should work.
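
The difference between the two mechanisms can be sketched as follows. This is an illustrative model only, not the plugin's actual code: the function names, the substring matching, and the summary wording are assumptions made for the example. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention.

```python
from typing import List, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

Item = Tuple[int, str]  # (state, sensor description)


def summarize(items: List[Item]) -> Tuple[int, str]:
    """Build an exit code and a summary line from per-sensor states."""
    issues = [item for item in items if item[0] != OK]
    if not issues:
        return OK, f"OK - no health issues in {len(items)} checks"
    states = {state for state, _ in issues}
    # Assumed precedence for this sketch: CRITICAL, then WARNING, then UNKNOWN.
    code = CRITICAL if CRITICAL in states else WARNING if WARNING in states else UNKNOWN
    return code, f"{LABELS[code]} - {len(issues)} health issue(s) found in {len(items)} checks"


def ignore_unknown_exit(exit_code: int) -> int:
    """Model of 'ignore exit code 3': only the code changes, messages stay."""
    return OK if exit_code == UNKNOWN else exit_code


def apply_blacklist(items: List[Item], blacklist: List[str]) -> List[Item]:
    """Model of a blacklist: matching sensors are dropped before evaluation."""
    return [(s, d) for s, d in items if not any(b in d for b in blacklist)]


items = [(UNKNOWN, "Fan Block 1"), (UNKNOWN, "Power Supply 1"), (OK, "Processors")]

code, message = summarize(items)
print(ignore_unknown_exit(code), message)  # exit code changes, message still lists issues

filtered = apply_blacklist(items, ["Fan Block", "Power Supply"])
print(*summarize(filtered))  # blacklisted sensors no longer appear at all
```

In this model, ignoring the exit code leaves the "58 health issue(s)" style message in place, while a blacklist removes the listed sensors from the evaluation altogether.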