Yogibaer75 / Check_MK-Things

From check plugins to website extensions

Slow execution against MX7000 #21

Closed rschitz closed 1 year ago

rschitz commented 2 years ago

Hi, and thank you very much for this amazing plugin. However, I have an issue with MX7000 enclosures.

The execution is very fast against a brand new R750 but really slow against any MX7000 enclosure:

image

If I'm lucky I get PSU data, but most of the time I get nothing.

Yogibaer75 commented 2 years ago

It matters where it is slow. If you run the agent manually, you should see which part takes so much time. I suspect the HDD data or the memory modules: this data must be fetched separately for every module/HDD, which is very time-consuming if you have many drives or memory modules.
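One way to see which part is slow is to time each fetch step separately. The following is a minimal sketch, not the plugin's actual code: the section names and the fetcher callables are hypothetical stand-ins for whatever requests the agent issues per section.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_sections(fetchers: Dict[str, Callable[[], object]]) -> Dict[str, float]:
    """Run each fetcher once and return the elapsed seconds per section name."""
    timings = {}
    for name, fetch in fetchers.items():
        start = time.perf_counter()
        fetch()  # hypothetical: e.g. one request, or one loop over all drives
        timings[name] = time.perf_counter() - start
    return timings


def slowest_first(timings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort the sections so the most expensive one comes first."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)
```

Passing one callable per section (for example power, thermal, drives, memory) and printing the result of slowest_first() shows at a glance whether a single section dominates or every request is slow.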

rschitz commented 2 years ago

I am running it manually to show you the timing; it's slow for every module. I'll make a short video to show you. For your information, it's a blade enclosure, so there is only a management module, and therefore mainly fan, temperature and power data.

Yogibaer75 commented 2 years ago

If it is slow in every part, then I have no idea how to speed things up.

rschitz commented 2 years ago

CMC5.txt CMC6.txt

Could you please have a look at the results and tell me why the output is so empty?

rschitz commented 2 years ago

For CMC6 I got this:

image

For CMC5 (latest firmware) I only got this:

image

Yogibaer75 commented 2 years ago

I see the problem in the agent output, and it also gives me an idea of why it is so slow. If you run the special agent against a blade center like your MX7000, it fetches all the data from all the blades. If I read it correctly, CMC5 contains either 7 blades, or 6 blades plus the blade system itself.
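To get a feel for why a modular chassis multiplies the work, the sketch below counts the members of a Redfish chassis collection and estimates how many requests a naive walk would issue. It assumes the standard Redfish collection layout (a Members list of @odata.id links). The sample payload, the member names and the per-member resource list are made up for illustration; they are not taken from the CMC5 output or from the plugin's code.

```python
from typing import Dict, List

# Assumed sub-resources a naive agent might fetch per chassis member (illustrative only).
PER_MEMBER_RESOURCES = ("Power", "Thermal")


def member_uris(collection: Dict) -> List[str]:
    """Return the @odata.id link of every member in a Redfish collection payload."""
    return [m["@odata.id"] for m in collection.get("Members", []) if "@odata.id" in m]


def estimated_requests(collection: Dict) -> int:
    """One request per member plus one per assumed sub-resource of that member."""
    return len(member_uris(collection)) * (1 + len(PER_MEMBER_RESOURCES))


# Hypothetical payload with seven members, e.g. one enclosure plus six modules.
sample = {
    "Members": [{"@odata.id": f"/redfish/v1/Chassis/Member{i}"} for i in range(7)]
}
print(len(member_uris(sample)), estimated_requests(sample))  # prints: 7 21
```

On a single rack server the collection usually has very few members, so the same loop issues only a handful of requests.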

Another difference I see, compared to some of my iDRAC9 devices, is the version of the output definition. You can compare this to your other, normally working server systems.

Here is one example.

Your output:

'@odata.context': '/redfish/v1/$metadata#Chassis.v1_6_0.Chassis', '@odata.type': '#Chassis.v1_6_0.Chassis'
'@odata.context': '/redfish/v1/$metadata#Power.v1_0_0.Power', '@odata.type': '#Power.v1_0_0.Power'
'@odata.context': '/redfish/v1/$metadata#Thermal.v1_0_0.Thermal', '@odata.type': '#Thermal.v1_0_0.Thermal'

Compared to the output from one of my normal iDRAC9 servers:

'@odata.context': '/redfish/v1/$metadata#Chassis.Chassis', '@odata.type': '#Chassis.v1_13_0.Chassis'
'@odata.context': '/redfish/v1/$metadata#Power.Power', '@odata.type': '#Power.v1_6_1.Power'
'@odata.context': '/redfish/v1/$metadata#Thermal.Thermal', '@odata.type': '#Thermal.v1_6_2.Thermal'
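The version difference can be checked mechanically by parsing the @odata.type string. This is a small sketch under the assumption that the value follows the "#Namespace.vMAJOR_MINOR_ERRATA.Type" pattern seen in the two outputs above; the function name is made up for illustration and is not part of the plugin.

```python
import re
from typing import Optional, Tuple

_TYPE_RE = re.compile(
    r"^#(?P<ns>[A-Za-z0-9_]+)\.v(?P<major>\d+)_(?P<minor>\d+)_(?P<errata>\d+)\.[A-Za-z0-9_]+$"
)


def schema_version(odata_type: str) -> Optional[Tuple[str, Tuple[int, int, int]]]:
    """Split an @odata.type value into (namespace, (major, minor, errata)).

    Returns None when the value carries no version in the expected form.
    """
    match = _TYPE_RE.match(odata_type)
    if match is None:
        return None
    version = (int(match["major"]), int(match["minor"]), int(match["errata"]))
    return match["ns"], version


mx7000 = schema_version("#Power.v1_0_0.Power")
idrac9 = schema_version("#Power.v1_6_1.Power")
print(mx7000, idrac9)  # prints: ('Power', (1, 0, 0)) ('Power', (1, 6, 1))
```

Because the versions come back as integer tuples, they compare in the expected order, so a check can branch on something like version < (1, 6, 0) instead of comparing strings.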

rschitz commented 2 years ago

Hi, thanks for having a look at it. No, so far the enclosure is empty:

image

Here is also what I could get from an snmpwalk with the Dell MIB:

DELL-MM-MIB-SMIv2::dmmProductName.0 = STRING: "PowerEdge MX7000"
DELL-MM-MIB-SMIv2::dmmProductShortName.0 = STRING: "PowerEdge MX7000"
DELL-MM-MIB-SMIv2::dmmProductDescription.0 = STRING: "7U Modular Chassis Platform"
DELL-MM-MIB-SMIv2::dmmProductManufacturer.0 = STRING: "Dell Inc"
DELL-MM-MIB-SMIv2::dmmProductVersion.0 = STRING: "1.40.20"
DELL-MM-MIB-SMIv2::dmmChassisServiceTag.0 = STRING: "XXXXXXX"
DELL-MM-MIB-SMIv2::dmmProductURL.0 = STRING: "https://10.102.15.113:443"
DELL-MM-MIB-SMIv2::dmmProductChassisAssetTag.0 = STRING: "Not Available"
DELL-MM-MIB-SMIv2::dmmProductChassisName.0 = STRING: "MX-XXXXXXX"
DELL-MM-MIB-SMIv2::dmmProductType.0 = INTEGER: mxMM(3)
DELL-MM-MIB-SMIv2::dmmProductChassisDataCenter.0 = STRING: "Not Available"
DELL-MM-MIB-SMIv2::dmmProductChassisAisle.0 = STRING: "Not Available"
DELL-MM-MIB-SMIv2::dmmProductChassisRack.0 = STRING: "Not Available"
DELL-MM-MIB-SMIv2::dmmProductChassisRackSlot.0 = Wrong Type (should be OCTET STRING): INTEGER: 1
DELL-MM-MIB-SMIv2::dmmProductChassisModel.0 = STRING: "PowerEdge MX7000"
DELL-MM-MIB-SMIv2::dmmProductChassisExpressServiceCode.0 = STRING: "XXXXXXXXXXX"
DELL-MM-MIB-SMIv2::dmmProductChassisSystemID.0 = INTEGER: 2031
DELL-MM-MIB-SMIv2::dmmFirmwareVersion.0 = STRING: "1.40.20"
DELL-MM-MIB-SMIv2::dmmFirmwareVersion2.0 = STRING: "1.40.20"
DELL-MM-MIB-SMIv2::dmmGlobalSystemStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmIOMCurrStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmRedCurrStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmPowerCurrStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmFanCurrStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmBladeCurrStatus.0 = INTEGER: unknown(2)
DELL-MM-MIB-SMIv2::dmmTempCurrStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmMMCurrStatus.0 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmChassisFrontPanelAmbientTemperature.0 = INTEGER: 27
DELL-MM-MIB-SMIv2::dmmPowerChassisIndex.1 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPowerIdlePower.1 = STRING: "8370"
DELL-MM-MIB-SMIv2::dmmPowerKWhCumulative.1 = STRING: "61"
DELL-MM-MIB-SMIv2::dmmPowerKWhCumulativeTime.1 = STRING: "2022-11-04T00:59:17+0100"
DELL-MM-MIB-SMIv2::dmmPowerWattsPeakUsage.1 = STRING: "767"
DELL-MM-MIB-SMIv2::dmmPowerWattsPeakTime.1 = STRING: "2022-10-30 21:18:18 +0000 UTC"
DELL-MM-MIB-SMIv2::dmmPowerWattsMinUsage.1 = STRING: "374"
DELL-MM-MIB-SMIv2::dmmPowerWattsMinTime.1 = STRING: "2022-11-03 17:56:13 +0000 UTC"
DELL-MM-MIB-SMIv2::dmmPowerWattsReading.1 = STRING: "630"
DELL-MM-MIB-SMIv2::dmmPSUChassisIndex.1.1 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUChassisIndex.1.2 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUChassisIndex.1.3 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUChassisIndex.1.4 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUChassisIndex.1.5 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUChassisIndex.1.6 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUIndex.1.1 = INTEGER: 1
DELL-MM-MIB-SMIv2::dmmPSUIndex.1.2 = INTEGER: 2
DELL-MM-MIB-SMIv2::dmmPSUIndex.1.3 = INTEGER: 3
DELL-MM-MIB-SMIv2::dmmPSUIndex.1.4 = INTEGER: 4
DELL-MM-MIB-SMIv2::dmmPSUIndex.1.5 = INTEGER: 5
DELL-MM-MIB-SMIv2::dmmPSUIndex.1.6 = INTEGER: 6
DELL-MM-MIB-SMIv2::dmmPSULocation.1.1 = STRING: "PSU.Slot.1"
DELL-MM-MIB-SMIv2::dmmPSULocation.1.2 = STRING: "PSU.Slot.2"
DELL-MM-MIB-SMIv2::dmmPSULocation.1.3 = STRING: "PSU.Slot.3"
DELL-MM-MIB-SMIv2::dmmPSULocation.1.4 = STRING: "PSU.Slot.4"
DELL-MM-MIB-SMIv2::dmmPSULocation.1.5 = STRING: "PSU.Slot.5"
DELL-MM-MIB-SMIv2::dmmPSULocation.1.6 = STRING: "PSU.Slot.6"
DELL-MM-MIB-SMIv2::dmmPSUState.1.1 = STRING: "Present"
DELL-MM-MIB-SMIv2::dmmPSUState.1.2 = STRING: "Present"
DELL-MM-MIB-SMIv2::dmmPSUState.1.3 = STRING: "Present"
DELL-MM-MIB-SMIv2::dmmPSUState.1.4 = STRING: "Present"
DELL-MM-MIB-SMIv2::dmmPSUState.1.5 = STRING: "Present"
DELL-MM-MIB-SMIv2::dmmPSUState.1.6 = STRING: "Present"
DELL-MM-MIB-SMIv2::dmmPSUType.1.1 = STRING: "AC"
DELL-MM-MIB-SMIv2::dmmPSUType.1.2 = STRING: "AC"
DELL-MM-MIB-SMIv2::dmmPSUType.1.3 = STRING: "AC"
DELL-MM-MIB-SMIv2::dmmPSUType.1.4 = STRING: "AC"
DELL-MM-MIB-SMIv2::dmmPSUType.1.5 = STRING: "AC"
DELL-MM-MIB-SMIv2::dmmPSUType.1.6 = STRING: "AC"
DELL-MM-MIB-SMIv2::dmmPSUCapacity.1.1 = STRING: "3000"
DELL-MM-MIB-SMIv2::dmmPSUCapacity.1.2 = STRING: "3000"
DELL-MM-MIB-SMIv2::dmmPSUCapacity.1.3 = STRING: "3000"
DELL-MM-MIB-SMIv2::dmmPSUCapacity.1.4 = STRING: "3000"
DELL-MM-MIB-SMIv2::dmmPSUCapacity.1.5 = STRING: "3000"
DELL-MM-MIB-SMIv2::dmmPSUCapacity.1.6 = STRING: "3000"
DELL-MM-MIB-SMIv2::dmmPSUVoltage.1.1 = STRING: "233"
DELL-MM-MIB-SMIv2::dmmPSUVoltage.1.2 = STRING: "232"
DELL-MM-MIB-SMIv2::dmmPSUVoltage.1.3 = STRING: "232"
DELL-MM-MIB-SMIv2::dmmPSUVoltage.1.4 = STRING: "234"
DELL-MM-MIB-SMIv2::dmmPSUVoltage.1.5 = STRING: "233"
DELL-MM-MIB-SMIv2::dmmPSUVoltage.1.6 = STRING: "233"
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.1 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.2 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.3 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.4 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.5 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.6 = INTEGER: ok(3)

I don't see anything related to the blades, but having the various statuses and the power consumption would be perfect. Let me know if you need me to collect more data.

Thanks a lot for your help
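For anyone who wants to pull the status and power values out of such a walk, here is a minimal parsing sketch. It assumes the one-entry-per-line text form that snmpwalk prints with the MIB loaded; the function name is made up, and lines that do not fit the simple "name.index = TYPE: value" shape (such as the "Wrong Type" entry) are skipped rather than repaired.

```python
import re
from typing import Dict, Tuple

_LINE_RE = re.compile(
    r"^(?P<mib>[\w-]+)::(?P<name>\w+)\.(?P<index>[\d.]+)"
    r" = (?P<type>[A-Z ]+): (?P<value>.*)$"
)


def parse_walk(text: str) -> Dict[Tuple[str, str], str]:
    """Map (object name, index) to the value, with surrounding quotes removed."""
    parsed = {}
    for line in text.splitlines():
        match = _LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a plain "TYPE: value" entry
        parsed[(match["name"], match["index"])] = match["value"].strip().strip('"')
    return parsed


# Three lines copied from the walk above.
sample = """\
DELL-MM-MIB-SMIv2::dmmPowerWattsReading.1 = STRING: "630"
DELL-MM-MIB-SMIv2::dmmPSUCurrStatus.1.1 = INTEGER: ok(3)
DELL-MM-MIB-SMIv2::dmmProductChassisRackSlot.0 = Wrong Type (should be OCTET STRING): INTEGER: 1
"""
values = parse_walk(sample)
print(values[("dmmPowerWattsReading", "1")])  # prints: 630
```

Keying on (name, index) keeps the six PSU rows apart while still making chassis-wide scalars like dmmPowerWattsReading easy to look up.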

Yogibaer75 commented 2 years ago

Please try the new mkp if you can: https://github.com/Yogibaer75/Check_MK-Things/blob/master/check%20plugins%202.0/dell_idrac_redfish/dell_idrac_redfish-1.7.mkp It works with your data and with my normal host data. The seven devices I saw were the FC switches and all the other modules inside the blade center.

rschitz commented 2 years ago

It's instant now, nice job!

image

Now the other issue is that we are missing some critical information, such as the component status and the overall power consumption, since the API doesn't expose it per PSU. Do you think you could also add that to your plugin, please?
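As a rough illustration of the overall power reading, the standard Redfish Power resource carries a PowerControl list whose entries may include PowerConsumedWatts. The sketch below sums those values. Whether the MX7000 management module actually fills this field is not confirmed in this thread, and the function name is made up for illustration.

```python
from typing import Dict, Optional


def total_power_watts(power: Dict) -> Optional[float]:
    """Sum PowerConsumedWatts over all PowerControl entries.

    Returns None when no entry reports a numeric reading.
    """
    readings = [
        entry["PowerConsumedWatts"]
        for entry in power.get("PowerControl", [])
        if isinstance(entry.get("PowerConsumedWatts"), (int, float))
    ]
    return float(sum(readings)) if readings else None
```

Returning None instead of 0 keeps "no reading available" distinguishable from a real zero-watt reading, which matters when the result feeds a check.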

Yogibaer75 commented 2 years ago

The component status is already there inside the output. To use it, I need to write a check for the system section of the output.
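A check like that essentially has to translate the Redfish Status block into a monitoring state. The following is a hedged sketch of that translation, not the plugin's implementation: it assumes the standard Redfish Status object with Health and State fields and the usual Checkmk state numbers, and it treats a missing or unrecognised Health value as UNKNOWN.

```python
from typing import Dict, Tuple

# Checkmk service states: 0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN.
HEALTH_TO_STATE = {"OK": 0, "Warning": 1, "Critical": 2}


def component_state(resource: Dict) -> Tuple[int, str]:
    """Return (monitoring state, summary text) for one Redfish resource."""
    status = resource.get("Status") or {}
    health = status.get("Health")
    state = status.get("State")
    # Assumption: anything without a known Health value is reported as UNKNOWN.
    return HEALTH_TO_STATE.get(health, 3), f"Health: {health}, State: {state}"
```

A real check would probably want a gentler rule for components that are switched off on purpose, since those may legitimately report no health at all.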

Yogibaer75 commented 1 year ago

Today I fixed another problem with poweredOff components and added a check for the system state. https://github.com/Yogibaer75/Check_MK-Things/blob/master/check%20plugins%202.0/dell_idrac_redfish/dell_idrac_redfish-1.8.mkp This should now show the state of every single component (system) inside your device. I tested it with your data and it gave no errors. Please check it with some other devices if you have any.

rschitz commented 1 year ago

Great! Now I can even see the blades, thanks a lot.

image

rschitz commented 1 year ago

Here is a new extract in case it could help you add/fix things. Again, thank you very much! CMC5.txt