lausser / check_nwc_health

nwc = network component. This plugin checks many aspects of routers, switches, WLAN controllers, firewalls, and more.
http://labs.consol.de/nagios/check_nwc_health
GNU General Public License v2.0

--mode=hardware-health, bad psu not detected #19

Open marcelfischer opened 10 years ago

marcelfischer commented 10 years ago

Hello, our network team told me they have a Cisco switch with a PSU failure, but the hardware-health check reports that everything is OK.

Maybe you can check this, thanks!

They gave me the following output from the switch:

Switch: WS-C3750X-48P
IOS: 12.2(53)SE2

sh env all
FAN 1 is OK
FAN 2 is OK
FAN PS-1 is NOT INITIALIZED
FAN PS-2 is NOT PRESENT
TEMPERATURE is OK
Temperature Value: 25 Degree Celsius
Temperature State: GREEN
Yellow Threshold : 46 Degree Celsius
Red Threshold    : 60 Degree Celsius

SW  PID               Serial#  Status          Sys Pwr  PoE Pwr  Watts
1A                             No Input Power  Bad      N/A      235/0
1B  Not Present
2A  C3KX-PWR-1100WAC  ###      OK              Good     Good     1100/0
2B  Not Present
3A  C3KX-PWR-1100WAC  ###      OK              Good     Good     1100/0
3B  Not Present

SW  Status       RPS Name  RPS Serial#  RPS Port#
1   Not Present  <>
2   Not Present  <>
3   Not Present  <>
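To make the expected result concrete, here is a minimal Python sketch that classifies the power-supply rows of the CLI output above. It is only an illustration of what a correct check should conclude (slot 1A is faulty). It is not how check_nwc_health works, since the plugin reads SNMP rather than CLI text. The function name `classify_psu_rows` and the three state labels are made up for this example.

```python
import re

# Slot labels look like "1A" or "2B": switch number plus supply bay.
# The lazy group captures everything up to the next slot label or end of line,
# so the pattern works on both one-row-per-line and flattened input.
SLOT = re.compile(r"\b(\d+[AB])\b\s+(.*?)(?=\s+\d+[AB]\b|$)")


def classify_psu_rows(text):
    """Map each power-supply slot to 'ok', 'absent', or 'fault' (hypothetical labels)."""
    result = {}
    for line in text.splitlines():
        for slot, rest in SLOT.findall(line):
            if rest.startswith("Not Present"):
                result[slot] = "absent"
            elif re.search(r"\bOK\b", rest):
                result[slot] = "ok"
            else:
                # Anything else, e.g. "No Input Power", is treated as a fault.
                result[slot] = "fault"
    return result


rows = """\
1A No Input Power Bad N/A 235/0 1B Not Present
2A C3KX-PWR-1100WAC ### OK Good Good 1100/0 2B Not Present
3A C3KX-PWR-1100WAC ### OK Good Good 1100/0 3B Not Present
"""

states = classify_psu_rows(rows)
faulty = sorted(slot for slot, state in states.items() if state == "fault")
print(faulty)
```

With the rows from this report, only slot 1A ends up in the `faulty` list; the empty bays are treated as absent rather than failed.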

mhoogveld commented 10 years ago

Hi Marcel,

I think the snmpwalk output is needed to know exactly what the script receives as input, since that is what it uses to determine the check result. If the switch is still available in that state, you could run check_nwc_health --mode walk and attach the output to the issue.

Groet, Maarten


marcelfischer commented 10 years ago

Hello, I tried the command check_nwc_health --mode=walk, but I get this error message:

Can't locate object method "snmpdump" via package "GLPlugin::Commandline::Getopt" at /usr/local/icinga/libexec/check_nwc_health line 1896.

mhoogveld commented 10 years ago

Hi Marcel,

I don't know what the problem might be in your case. What you could do is try a different version of the plugin. The --mode=walk code probably hasn't changed since some early version. You could get the latest version (3.0.2.2) or 2.6.5 and try with that (or maybe even do a git clone).

Groet, Maarten


marcelfischer commented 10 years ago

I tried 3.0 and 2.6.5; both give the same error.

lausser commented 10 years ago

Hi, mode walk has been fixed in http://labs.consol.de/download/shinken-nagios-plugins/check_nwc_health-3.0.3.tar.gz

Please run check_nwc_health --mode walk --hostname host --community community, which will output two snmpwalk commands. Run these two and mail me the resulting file with the OIDs (gerhard.lausser@consol.de).

Gerhard

marcelfischer commented 10 years ago

Well, our network team told me that they have already fixed the problem, because it was quite an important switch. So is the snmpwalk output still valuable even though the switch no longer has a problem?

lausser commented 10 years ago

I can try to manipulate the snmpwalk so that it reflects the "1A No Input Power Bad N/A 235/0" situation and then see how check_nwc_health interprets the information. So, yes, it's at least worth a try.
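The manipulation described above could look roughly like the following Python sketch, which overrides the value of a single OID in a saved walk. Several things here are assumptions, not taken from this thread: the walk is assumed to be in the numeric net-snmp form `OID = TYPE: value`; the sample lines are invented; and the OID used is `ciscoEnvMonSupplyState` from CISCO-ENVMON-MIB, where 1 means normal and 3 means critical. The thread does not say which MIB the plugin actually reads for this switch, and the helper name `override_walk_value` is hypothetical.

```python
def override_walk_value(walk_lines, oid, new_value):
    """Return a copy of the walk with the value of one OID replaced.

    Assumes each line looks like "<oid> = <TYPE>: <value>" (numeric net-snmp output).
    Lines that do not match the given OID are passed through unchanged.
    """
    out = []
    for line in walk_lines:
        name, sep, value = line.partition(" = ")
        if sep and name.strip() == oid:
            type_tag, colon, _old = value.partition(": ")
            if colon:
                line = f"{name} = {type_tag}: {new_value}"
            else:
                line = f"{name} = {new_value}"
        out.append(line)
    return out


# Invented sample data: two healthy supplies (state 1 = normal).
healthy_walk = [
    ".1.3.6.1.4.1.9.9.13.1.5.1.3.1 = INTEGER: 1",
    ".1.3.6.1.4.1.9.9.13.1.5.1.3.2 = INTEGER: 1",
]

# Simulate a failed first supply by setting its state to 3 (critical).
broken_walk = override_walk_value(
    healthy_walk, ".1.3.6.1.4.1.9.9.13.1.5.1.3.1", 3
)
print("\n".join(broken_walk))
```

The edited walk can then be used to see whether the plugin reports the simulated failure, which is the experiment proposed in the comment above.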