dangmocrang / check_idrac

A script for monitoring Dell iDRAC via SNMP

Pulling a single PDisk gives different result than pulling the whole group. #58

Closed HeyRaphi closed 5 years ago

HeyRaphi commented 5 years ago

Hi,

Awesome tool, but we have some problems getting the status of our physical disks. Compared with the group view, three values are swapped when a single disk is queried, which causes a healthy drive to raise a warning.

Here is an example:

idrac_pdisk

I assume that the order of the values returned by snmpget differs from the order returned by snmpwalk (which is used for groups and for all), which then causes the wrong output and a false warning alert.

We have tested with different systems which all produce the same result.

When I set the value_on_alert array in line 655 to [3,8,9] instead of [3,7,9], I can get rid of the false warning but still get the swapped output. In order to fix the output I have to change the order in lines 667 - 669, which then messes up the group output of PDisk.
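One way to make the parsing independent of the order in which values come back is to key each value by its OID column instead of by its position in the output. The following is only a minimal sketch of that idea, not the code from check_idrac: the function name `index_by_column`, the base OID and the column numbers are hypothetical, and it assumes output lines shaped like `OID = TYPE: value`.

```python
def index_by_column(lines, base_oid):
    """Map (column, row) -> value for lines shaped like 'OID = TYPE: value'.

    Values are looked up by OID suffix, so the order of the lines
    does not influence the result.
    """
    table = {}
    prefix = base_oid.lstrip(".") + "."
    for line in lines:
        oid, _, rest = line.partition(" = ")
        oid = oid.strip().lstrip(".")
        if not oid.startswith(prefix):
            continue  # not part of the table we are interested in
        column, row = oid[len(prefix):].split(".")[:2]
        value = rest.split(": ", 1)[-1].strip().strip('"')
        table[(int(column), int(row))] = value
    return table


# Hypothetical base OID and columns, for illustration only.
BASE = "1.3.6.1.4.1.99999.1"
walk_order = [
    BASE + '.2.1 = STRING: "Disk 0"',
    BASE + ".4.1 = INTEGER: 3",
    BASE + ".24.1 = INTEGER: 3",
]
# The same three values, returned in a different order.
get_order = [walk_order[2], walk_order[0], walk_order[1]]

by_walk = index_by_column(walk_order, BASE)
by_get = index_by_column(get_order, BASE)
```

With a lookup like this, the alert indexes would refer to OID columns rather than list positions, so the single-disk and group views could share the same configuration.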

If you need any more information, let me know,

best regards

Raphael Rehberg

eeshlomi commented 5 years ago

Thanks Raphael for the detailed info!

I'm afraid I'm the only one who is currently maintaining this project. I have applied a fix for this problem, in my fork: https://github.com/eeshlomi/check_idrac/commit/f424a68fdb96207fbfa0542259fd967459f50407#diff-ac0a383b76a3b4008abfa60276c9a412

If you are more comfortable with this original code, just apply the bug-fix from lines 654-656 at the aforementioned link.

Let me know if you encounter any other issues, E

HeyRaphi commented 5 years ago

Hello,

Thank you for the fix, which works as expected when I alter my working file and add the code lines.

However, when I pull the latest version of the file, I get the following error when I try to pull GLOBAL:

image

Shall I open a new issue for this?

Thanks and have a nice week.

best regards Raphael Rehberg

HeyRaphi commented 5 years ago

Sorry, but I got another error. When I pull DISK#1 everything is fine. Every disk > 1 throws an error:

image

best regards Raphael Rehberg

eeshlomi commented 5 years ago

Hi Raphael,

The last thing has just been fixed - check out my latest version.

I'm now investigating the previous problem when pulling GLOBAL. I would appreciate it if you could remove the "#-- debug " from the line just above the last failing line ("item_order...") so that it prints all matched items, in both my latest version and the original version, and post the output here.

Remember that the latest version includes many bug-fixes and improvements so that it's worth having one best and final version that works everywhere.

Thanks very much, E

eeshlomi commented 5 years ago

One more thing - Please also uncomment the print command at line #649 (My version) by removing the preceding "#--debug "

HeyRaphi commented 5 years ago

Hi,

Thank you for the feedback. Here is the output from the latest version on GitHub (f424a68 from 14th August) when pulling GLOBAL. If this is not the right version, please direct me to the correct version.

image

But I assume I don't have the correct version, because the check for a Disk > 1 still fails:

image

Best regards Raphael Rehberg

eeshlomi commented 5 years ago

The last problem persists because you indeed use the version before the fix - please use https://github.com/eeshlomi/check_idrac/commit/b130f72fe5442ddb363bf76711f7352a271d6eab#diff-ac0a383b76a3b4008abfa60276c9a412.

Regarding the "Unlinked" issue - I'm trying to locate the source of the error - I'm pretty sure it pertains to your specific hardware and MIB (I have checked it with some servers successfully) and not to the recent updates (Does the original version work?).

For troubleshooting, please also uncomment the print command at line #649 (The latest version from the link above) by removing the preceding "#--debug "

Thank you, eeshlomi

HeyRaphi commented 5 years ago

My bad for using the wrong version. I am still trying to understand GitHub and everything around it.

Here is the output from the newest version with lines 416 and 649 uncommented:

image

Our Hardware which throws the error:

PowerEdge T130, PowerEdge T140, PowerEdge T320, PowerEdge T330, PowerEdge T630.

The original version from here works when pulling global: https://github.com/dangmocrang/check_idrac/blob/master/idrac_2.2rc4

Thank you. Best regards Raphael Rehberg

eeshlomi commented 5 years ago

I checked that again and found that the old version wasn't complaining, but didn't show any result either. I have added to the latest version a small piece of error handling for cases in which the snmp agent doesn't return any data. This is the current version: https://github.com/eeshlomi/check_idrac/tree/d93738dc0bcb60f5a1c550153ba0b74160df30dd
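For readers who want to see the general shape of such a guard, here is a minimal sketch. It is not the code from the linked version: the function name `lines_or_unknown` is hypothetical, and it assumes the raw SNMP output is available as text and that the agent reports missing objects with net-snmp style "No Such Object" / "No Such Instance" messages. The exit code 3 follows the usual Nagios plugin convention for UNKNOWN.

```python
UNKNOWN = 3  # conventional Nagios plugin exit code for UNKNOWN

# Assumed markers for "nothing here" in net-snmp style output.
EMPTY_MARKERS = ("No Such Object", "No Such Instance")


def lines_or_unknown(raw_output):
    """Return (status, payload) for raw SNMP output.

    status is None and payload is the list of usable lines when data
    was returned; otherwise status is UNKNOWN and payload is a message.
    """
    lines = [
        line for line in raw_output.splitlines()
        if line.strip() and not any(m in line for m in EMPTY_MARKERS)
    ]
    if not lines:
        return UNKNOWN, "UNKNOWN - SNMP agent returned no data"
    return None, lines


status, payload = lines_or_unknown("")
```

A caller would print the message and exit with the returned status instead of indexing into an empty list.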

HeyRaphi commented 5 years ago

Works, thank you.