Napsty / check_esxi_hardware

Monitoring Plugin to check the hardware of VMware ESXi servers.
https://www.claudiokuenzler.com/monitoring-plugins/check_esxi_hardware.php

Is there any way to check HealthState or Status of NIC(vmnic)? #70

Closed SHEELE41 closed 1 year ago

SHEELE41 commented 1 year ago

Before actually creating a new issue

I confirm, I have read the FAQ (https://www.claudiokuenzler.com/blog/308/check-esxi-hardware-faq-frequently-asked-questions): Y
I confirm, I have restarted the CIM server (/etc/init.d/sfcbd-watchdog restart) on the ESXi server and the problem remains: Y
I confirm, I have cleared the server's local IPMI cache (localcli hardware ipmi sel clear) and restarted the services (/sbin/services.sh restart) on the ESXi server and the problem remains: Y

Describe the bug

I added some code to check_esxi_hardware.py to handle the VMware_EthernetPort and VMware_PCIDevice classes in order to get the NIC status. However, the CIM server did not give me any information about the NIC status.

VMware_EthernetPort: HealthState, OperationalStatus, Status... are always NULL.
VMware_PCIDevice: HealthState is not NULL, but always 0 (Unknown). Other properties are always NULL.

Is HealthState still just a property that is defined in the VMware_EthernetPort/VMware_PCIDevice schemas but not actually implemented?

Show the full plugin output, including the command with -v parameter

$ enum_instances VMware_EthernetPort root/cimv2

VMware_EthernetPort.CreationClassName="VMware_EthernetPort",DeviceID="vmnic0",SystemCreationClassName="OMC_UnitaryComputerSystem",SystemName="MY_SYSTEM_NAME"...
                    SystemName = MY_SYSTEM_NAME
       SystemCreationClassName = OMC_UnitaryComputerSystem
                      DeviceID = vmnic0
             CreationClassName = VMware_EthernetPort
                    WOLOptions = { 6,  }
                  WOLSupported = { 6,  }
                   Transceiver = 1
               PhysicalAddress = 0
     AdvertisedAutoNegotiation = true
           AdvertisedLinkModes = { AUTO_AUTO, 1000_FULL, 100_HALF, 100_FULL, 10_HALF, 10_FULL,  }
            SupportedLinkModes = { AUTO_AUTO, 1000_FULL, 100_HALF, 100_FULL, 10_HALF, 10_FULL,  }
                SupportedPorts = { 1,  }
                     AutoSense = 1
                      PortType = 53
       IdentifyingDescriptions = { PCI Bus, PCI Slot, PCI Function, PCI Vendor, PCI DeviceID, PCI Segment,  }
          OtherIdentifyingInfo = { 0x03, 0x00, 0x00, 0x14e4, 0x1657, 0x0000,  }
 ActiveMaximumTransmissionUnit = 1500
                    FullDuplex = true
              UsageRestriction = 4
          TransitioningToState = 12
                         Speed = 1000000000
                RequestedState = 5
              PermanentAddress = MY_ADDRESS_VALUE
             OperationalStatus = { 2,  }
              NetworkAddresses = { MY_ADDRESS_VALUE,  }
                          Name = vmnic0
                LinkTechnology = 2
                  EnabledState = 2
                EnabledDefault = 2
           EnabledCapabilities = { 5,  }
                   ElementName = vmnic0
                       Caption = vmnic0
                  Capabilities = { 5,  }
SupportedMaximumTransmissionUnit = (NULL)
           OtherLinkTechnology = (NULL)
                    PortNumber = (NULL)
          OtherNetworkPortType = (NULL)
                MaxQuiesceTime = (NULL)
        AdditionalAvailability = (NULL)
             TotalPowerOnHours = (NULL)
                  PowerOnHours = (NULL)
                  ErrorCleared = (NULL)
              ErrorDescription = (NULL)
                 LastErrorCode = (NULL)
                    StatusInfo = (NULL)
                  Availability = (NULL)
   PowerManagementCapabilities = (NULL)
      PowerManagementSupported = (NULL)
                   Description = (NULL)
                    InstanceID = (NULL)
                   InstallDate = (NULL)
            StatusDescriptions = (NULL)
                        Status = (NULL)
                   HealthState = (NULL)
           CommunicationStatus = (NULL)
                DetailedStatus = (NULL)
               OperatingStatus = (NULL)
                 PrimaryStatus = (NULL)
             OtherEnabledState = (NULL)
         TimeOfLastStateChange = (NULL)
      AvailableRequestedStates = (NULL)
                      MaxSpeed = (NULL)
                RequestedSpeed = (NULL)
                 OtherPortType = (NULL)
                   MaxDataSize = (NULL)
        CapabilityDescriptions = (NULL)
      OtherEnabledCapabilities = (NULL)
$ enum_instances VMWare_PCIDevice root/cimv2

VMware_PCIDevice.SystemCreationClassName="OMC_UnitaryComputerSystem",SystemName="MY_SYSTEM_NAME...
                      DeviceID = PCI 0:3:0:0
             CreationClassName = VMware_PCIDevice
                    SystemName = MY_SYSTEM_NAME
       SystemCreationClassName = OMC_UnitaryComputerSystem
                  PhysicalSlot = (NULL)
                ParentDeviceID = PCI 0:0:2:2
                 SegmentNumber = 0
                  SubClassCode = 0
                    RevisionID = 1
                      VendorID = 5348
                   PCIDeviceID = 5719
                FunctionNumber = 0
                  DeviceNumber = 0
                     BusNumber = 3
             SubsystemVendorID = 4156
                   SubsystemID = 5789
                     ClassCode = 2
                  Capabilities = { 1, 3, 5, 17, 16,  }
                RequestedState = 0
                  EnabledState = 0
                 PrimaryStatus = 0
                   HealthState = 0
                   ElementName = Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet #0
               SelfTestEnabled = (NULL)
       ExpansionROMBaseAddress = (NULL)
                  InterruptPin = (NULL)
                  LatencyTimer = (NULL)
                 CacheLineSize = (NULL)
            DeviceSelectTiming = (NULL)
        CapabilityDescriptions = (NULL)
               CommandRegister = (NULL)
                MaxQuiesceTime = (NULL)
        AdditionalAvailability = (NULL)
       IdentifyingDescriptions = (NULL)
             TotalPowerOnHours = (NULL)
                  PowerOnHours = (NULL)
          OtherIdentifyingInfo = (NULL)
                  ErrorCleared = (NULL)
              ErrorDescription = (NULL)
                 LastErrorCode = (NULL)
                    StatusInfo = (NULL)
                  Availability = (NULL)
   PowerManagementCapabilities = (NULL)
      PowerManagementSupported = (NULL)
                   Description = (NULL)
                       Caption = (NULL)
                    InstanceID = (NULL)
                   InstallDate = (NULL)
                          Name = (NULL)
             OperationalStatus = (NULL)
            StatusDescriptions = (NULL)
                        Status = (NULL)
           CommunicationStatus = (NULL)
                DetailedStatus = (NULL)
               OperatingStatus = (NULL)
             OtherEnabledState = (NULL)
                EnabledDefault = 2
         TimeOfLastStateChange = (NULL)
      AvailableRequestedStates = (NULL)
          TransitioningToState = 12
               TimeOfLastReset = (NULL)
             ProtocolSupported = (NULL)
           MaxNumberControlled = (NULL)
           ProtocolDescription = (NULL)
                   BaseAddress = (NULL)
                  MinGrantTime = (NULL)
                    MaxLatency = (NULL)
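
Long dumps like the two above are easier to compare once they are reduced to the status-related properties. The following is a minimal sketch, not part of the plugin: `parse_instance_dump`, `populated_status`, and the choice of `STATUS_PROPERTIES` are hypothetical helpers that only assume the `Name = value` layout and the `(NULL)` marker shown in the output.

```python
import re

# Matches property lines such as "   HealthState = (NULL)".
# The command line ("$ enum_instances ...") and the object path line do not match.
PROPERTY_LINE = re.compile(r"^\s*(\w+)\s*=\s*(.*?)\s*$")

# Assumption: these are the properties of interest for a NIC health check.
STATUS_PROPERTIES = ("HealthState", "OperationalStatus", "Status", "PrimaryStatus")


def parse_instance_dump(text):
    """Return {property: value or None} for one instance dump."""
    props = {}
    for line in text.splitlines():
        match = PROPERTY_LINE.match(line)
        if not match:
            continue
        name, value = match.groups()
        props[name] = None if value == "(NULL)" else value
    return props


def populated_status(props):
    """Keep only the status properties the CIM server actually filled in."""
    return {k: props[k] for k in STATUS_PROPERTIES if props.get(k) is not None}


sample = """\
$ enum_instances VMware_EthernetPort root/cimv2
              OperationalStatus = { 2,  }
                         Status = (NULL)
                    HealthState = (NULL)
"""
print(populated_status(parse_instance_dump(sample)))
```

Values are kept as raw strings; converting array values such as `{ 2,  }` into lists is left out on purpose.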

Expected behavior

HealthState should be 5 (OK).
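
For reference, the HealthState codes discussed here (0 and 5) come from the value map of the DMTF CIM schema. A rough sketch of how such a code could be translated into a monitoring result follows; the function name `health_to_nagios` and the assignment of each code to a Nagios exit code are assumptions made for illustration and are not taken from the plugin.

```python
# Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# HealthState value map from the DMTF CIM schema.
# The exit code chosen for each entry is an assumption of this sketch.
HEALTH_STATE = {
    0: ("Unknown", UNKNOWN),
    5: ("OK", OK),
    10: ("Degraded/Warning", WARNING),
    15: ("Minor failure", WARNING),
    20: ("Major failure", CRITICAL),
    25: ("Critical failure", CRITICAL),
    30: ("Non-recoverable error", CRITICAL),
}


def health_to_nagios(value):
    """Translate a HealthState value (or None for NULL) to (label, exit code)."""
    if value is None:
        # The provider did not populate the property at all.
        return ("not reported", UNKNOWN)
    code = int(value)
    return HEALTH_STATE.get(code, ("vendor-specific value %d" % code, UNKNOWN))


print(health_to_nagios(5))     # the value expected in this issue
print(health_to_nagios(0))     # what VMware_PCIDevice returns
print(health_to_nagios(None))  # what VMware_EthernetPort returns
```

With this mapping, both the NULL from VMware_EthernetPort and the 0 from VMware_PCIDevice end up as UNKNOWN, so neither can confirm that a NIC is healthy.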

Napsty commented 1 year ago

I just tested this on a Cisco UCS blade server, and the CIM class VMware_EthernetPort indeed does not show any status values there either:

20230718 06:53:31 Check classe VMware_EthernetPort
20230718 06:53:31   Element Name = vmnic0
20230718 06:53:31   Element Name = vmnic1
20230718 06:53:31   Element Name = vmnic2
20230718 06:53:31   Element Name = vmnic3
20230718 06:53:31   Element Name = vmnic4
20230718 06:53:31   Element Name = vmnic5
20230718 06:53:31   Element Name = vmnic6
20230718 06:53:31   Element Name = vmnic7

However, this is not something the check_esxi_hardware plugin can fix. If it can be fixed at all, the fix would have to come as an additional CIM/information bundle from the hardware vendor or from VMware themselves. The plugin can only read what is presented by the CIM server.
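
To illustrate that limitation, here is a small sketch of how a check could classify NIC instances using only what the CIM server returns. It is not the plugin's logic: `nic_state` and `summarize` are hypothetical names, instances are modelled as plain dictionaries, and treating every OperationalStatus code other than 2 (OK in the DMTF value map) as a problem is a deliberate simplification.

```python
def nic_state(instance):
    """Classify one NIC instance (a mapping of property name to value)."""
    health = instance.get("HealthState")
    if health not in (None, 0):
        # HealthState is populated with something other than Unknown.
        return "OK" if health == 5 else "PROBLEM"
    # Assumption: fall back to OperationalStatus when HealthState is unusable.
    op_status = instance.get("OperationalStatus") or []
    if op_status:
        return "OK" if all(code == 2 for code in op_status) else "PROBLEM"
    # Nothing usable was reported; the check cannot invent a state.
    return "UNKNOWN"


def summarize(instances):
    """Map each instance's ElementName to its classified state."""
    return {inst.get("ElementName", "?"): nic_state(inst) for inst in instances}


instances = [
    {"ElementName": "vmnic0", "HealthState": None, "OperationalStatus": [2]},
    {"ElementName": "vmnic1", "HealthState": None, "OperationalStatus": None},
    {"ElementName": "BCM5719 #0", "HealthState": 0},
]
print(summarize(instances))
```

An instance with neither property populated stays UNKNOWN, which is the most a check can report until a provider supplies real values.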

Marking as question and closing. If you feel that a change in the plugin itself could help, reopen the issue and let me know.