lausser / check_nwc_health

nwc = network component. This plugin checks lots of aspects of routers, switches, wlan controllers, firewalls,.....
http://labs.consol.de/nagios/check_nwc_health
GNU General Public License v2.0

check-config returning weird times and wrong results #118

Closed schindlerd closed 4 years ago

schindlerd commented 7 years ago

Hi all,

when I run the plugin in "check-config" mode to compare running/startup-config I get some weird times and wrong results.

For example:

./check_nwc_health --hostname myswitch --port 161 --protocol 2c --community community --mode check-config -vvv
I am a Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500e-ENTSERVICESK9-M), Version 12.2(54)SG1, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Thu 27-Jan-11 12:07
[CONFIG]
ccmHistoryRunningLastChanged: 1436776860.52 (Mon Jul 13 10:41:00 2015)
ccmHistoryRunningLastSaved: 1440059221.78 (Thu Aug 20 10:27:01 2015)
ccmHistoryStartupLastChanged: 1440013538.74 (Wed Aug 19 21:45:38 2015)
OK - saved config is up to date
checking config

When I look on the switch itself it tells me:

# sh run
! Last configuration change at 13:40:05 MET Mon Nov 21 2016 
! NVRAM config last updated at 00:51:10 MET Thu Dec 29 2016

# sh start
! Last configuration change at 13:40:05 MET Mon Nov 21 2016
! NVRAM config last updated at 00:51:10 MET Thu Dec 29 2016

When I run it against a Nexus device, I even get values in the future:

I am a Cisco NX-OS(tm) n5000, Software (n5000-uk9), Version 7.1(1)N1(1), RELEASE SOFTWARE Copyright (c) 2002-2012 by Cisco Systems, Inc. Device Manager Version 6.0(2)N1(1),Compiled 4/18/2015 10:00:00
[CONFIG]
ccmHistoryRunningLastChanged: 1496031481.85 (Mon May 29 06:18:01 2017)
ccmHistoryRunningLastSaved: 1496031552.82 (Mon May 29 06:19:12 2017)
ccmHistoryStartupLastChanged: 1494519654.68 (Thu May 11 18:20:54 2017)
CRITICAL - running config is ahead of startup config since -216892 minutes. changes will be lost in case of a reboot
checking config
running config is ahead of startup config since -216892 minutes. changes will be lost in case of a reboot

Any ideas?

mhoogveld commented 7 years ago

The most obvious reason would be that the time on the devices is not correct. If the time is set correctly, could you send the output of the check with "--mode walk"?

Groet, Maarten


jperkins71 commented 7 years ago

We are seeing similar issues at our site that seem to have started within the last week or so.

#sh run

! Last configuration change at 11:05:35 CST Mon Dec 14 2015 by netadmin
! NVRAM config last updated at 17:25:05 CST Wed Feb 8 2017 by netadmin
#sh start

! Last configuration change at 11:05:35 CST Mon Dec 14 2015 by netadmin
! NVRAM config last updated at 17:25:05 CST Wed Feb 8 2017 by netadmin
#show clock
17:39:32.684 CST Wed Feb 8 2017  (current local time)

waterhouse(su): snmpget -v 1 -c "community" <swname> 1.3.6.1.4.1.9.9.43.1.1.1.0
SNMPv2-SMI::enterprises.9.9.43.1.1.1.0 = Timeticks: (680188368) 78 days, 17:24:43.68
waterhouse(su): snmpget -v 1 -c "community" <swname> 1.3.6.1.4.1.9.9.43.1.1.2.0
SNMPv2-SMI::enterprises.9.9.43.1.1.2.0 = Timeticks: (33647734) 3 days, 21:27:57.34
# ./check_nwc_health --hostname <swname> --port 161 --protocol 2c --community community --mode check-config -vvv
I am a Cisco IOS Software, C3550 Software (C3550-IPSERVICESK9-M), Version 12.2(44)SE5, RELEASE SOFTWARE (fc2)
Copyright (c) 1986-2009 by Cisco Systems, Inc.
Compiled Thu 22-Jan-09 08:29 by gereddy
[CONFIG]
ccmHistoryRunningLastChanged: 1450112774.68 (Mon Dec 14 11:06:14 2015)
ccmHistoryRunningLastSaved: 1443647368.34 (Wed Sep 30 16:09:28 2015)
ccmHistoryStartupLastChanged: 1443646673.79 (Wed Sep 30 15:57:53 2015)
CRITICAL - running config is ahead of startup config since 608091 minutes. changes will be lost in case of a reboot
checking config
running config is ahead of startup config since 608091 minutes. changes will be lost in case of a reboot

Could this be related to this Cisco bug? https://quickview.cloudapps.cisco.com/quickview/bug/CSCuq34687

meni2029 commented 4 years ago

Hello,

To me this calculation makes no sense:

  my $runningUnchangedDuration = time - $self->{ccmHistoryRunningLastChanged};
  my $startupUnchangedDuration = time - $self->{ccmHistoryStartupLastChanged};

Note that time in Perl is "the number of non-leap seconds since epoch", whereas ccmHistoryRunningLastChanged is the value of sysUpTime (in hundredths of a second) when the running configuration was last changed.

The correct calculation would be:

  # $sysUpTime and the ccmHistory* values are both in hundredths of a second
  my $runningUnchangedDuration = ( $sysUpTime - $self->{ccmHistoryRunningLastChanged} ) / 100;
  my $startupUnchangedDuration = ( $sysUpTime - $self->{ccmHistoryStartupLastChanged} ) / 100;

One remark: when the system restarts, sysUpTime starts again from 0. Will ccmHistoryRunningLastChanged and ccmHistoryStartupLastChanged be reset too?

What do you think ?

lausser commented 4 years ago

A few lines above you find:

  foreach ((qw(ccmHistoryRunningLastChanged ccmHistoryRunningLastSaved
      ccmHistoryStartupLastChanged))) {
    if (! defined $self->{$_}) {
      $self->add_unknown(sprintf "%s is not defined", $_);
    }
    $self->{$_} = time - $self->uptime() + $self->timeticks($self->{$_});
  }
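
As a rough illustration of what the loop above does, here is a minimal Python sketch (not the plugin's Perl code; the function name and the sample values are hypothetical). It turns a TimeTicks value, counted in hundredths of a second since the agent's counter was 0, into an absolute epoch timestamp. It assumes the uptime passed in comes from the same clock as the ticks.

```python
def ticks_to_epoch(now, uptime_seconds, ticks):
    """Convert TimeTicks (1/100 s since counter start) to an epoch timestamp.

    now            -- current epoch time in seconds
    uptime_seconds -- device uptime in seconds (assumed to share the ticks' clock)
    ticks          -- TimeTicks value, e.g. ccmHistoryRunningLastChanged
    """
    boot_time = now - uptime_seconds   # epoch time at which the counter was 0
    return boot_time + ticks / 100.0   # ticks are hundredths of a second


# Hypothetical sample: device up for 1000 s, config changed 250 s after boot.
changed_at = ticks_to_epoch(now=1_600_000_000, uptime_seconds=1000, ticks=25_000)
print(changed_at)
```

If the uptime comes from a different counter than the one the ticks refer to, the computed timestamp shifts by the difference between the two counters.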

meni2029 commented 4 years ago

Hello @lausser, thanks for your reply. Then the calculation is fine; I missed this part, sorry.

The problem we have with the Cisco check-config mode is related to the uptime value, which is taken from snmpEngineTime instead of sysUpTime. Because ccmHistoryRunningLastChanged is based on sysUpTime, computing $runningUnchangedDuration with an uptime taken from snmpEngineTime gives an erroneous result.

Here is the debug output of one of our checks:

#  '/usr/lib64/nagios/plugins/check_nwc_health' '--community' 'abc' '--hostname' 'x.x.x.x' '--mode' 'check-config' '--multiline' '--timeout' '120' -vvvvvvvvvvv
Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Device::check_messages

Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.2.1.1.3.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: GET: MIB-2-MIB::sysUpTime (1.3.6.1.2.1.1.3) : 1567592483
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::SNMPFRAMEWORKMIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.6.3.10.2.1.3.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::SNMPFRAMEWORKMIB
Thu Apr  2 09:42:18 2020: GET: SNMP-FRAMEWORK-MIB::snmpEngineTime (1.3.6.1.6.3.10.2.1.3.0) : 58629848
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::HOSTRESOURCESMIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.2.1.25.1.1
Thu Apr  2 09:42:18 2020: GET: HOST-RESOURCES-MIB::hrSystemUptime (1.3.6.1.2.1.25.1.1) : <undef>
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::HOSTRESOURCESMIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.2.1.25.1.1.0
Thu Apr  2 09:42:18 2020: GET: HOST-RESOURCES-MIB::hrSystemUptime (1.3.6.1.2.1.25.1.1) : <undef>
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.2.1.1.1.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: GET: MIB-2-MIB::sysDescr (1.3.6.1.2.1.1.1) : Cisco IOS Software, C2960S Software (C2960S-UNIVERSALK9-M), Version 15.0(2)SE11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Sat 19-Aug-17 08:57 by prod_rel_team
Thu Apr  2 09:42:18 2020: snmpEngineTime says: up since: Thu May 24 19:38:10 2018 / 678d 14h 4m 8s
Thu Apr  2 09:42:18 2020: sysUptime says:      up since: Thu Oct  3 23:16:53 2019 / 181d 10h 25m 24s
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.2.1.1.2.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: GET: MIB-2-MIB::sysObjectID (1.3.6.1.2.1.1.2) : 1.3.6.1.4.1.9.1.1208
Thu Apr  2 09:42:18 2020: uptime: 58629848
Thu Apr  2 09:42:18 2020: up since: Thu May 24 19:38:10 2018
Thu Apr  2 09:42:18 2020: whoami: Cisco IOS Software, C2960S Software (C2960S-UNIVERSALK9-M), Version 15.0(2)SE11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Sat 19-Aug-17 08:57 by prod_rel_team
Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Device::check_messages

Thu Apr  2 09:42:18 2020: I am a Cisco IOS Software, C2960S Software (C2960S-UNIVERSALK9-M), Version 15.0(2)SE11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Sat 19-Aug-17 08:57 by prod_rel_team

Thu Apr  2 09:42:18 2020: using Classes::Cisco
Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::override_opt

Thu Apr  2 09:42:18 2020: AUTOLOAD Monitoring::GLPlugin::Commandline::override_opt

Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::check_messages

Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::AIRESPACESWITCHINGMIB
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.2.1.1.2.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::MIB2MIB
Thu Apr  2 09:42:18 2020: GET: MIB-2-MIB::sysObjectID (1.3.6.1.2.1.1.2) : 1.3.6.1.4.1.9.1.1208
Thu Apr  2 09:42:18 2020: using Classes::Cisco::IOS
Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::analyze_and_check_config_subsystem

Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::CISCOCONFIGMANMIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.4.1.9.9.43.1.1.1.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::CISCOCONFIGMANMIB
Thu Apr  2 09:42:18 2020: GET: CISCO-CONFIG-MAN-MIB::ccmHistoryRunningLastChanged (1.3.6.1.4.1.9.9.43.1.1.1.0) : 1560341019
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::CISCOCONFIGMANMIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.4.1.9.9.43.1.1.2.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::CISCOCONFIGMANMIB
Thu Apr  2 09:42:18 2020: GET: CISCO-CONFIG-MAN-MIB::ccmHistoryRunningLastSaved (1.3.6.1.4.1.9.9.43.1.1.2.0) : 1567565112
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::CISCOCONFIGMANMIB
Thu Apr  2 09:42:18 2020: cache: 1.3.6.1.4.1.9.9.43.1.1.3.0
Thu Apr  2 09:42:18 2020: i know package Monitoring::GLPlugin::SNMP::MibsAndOids::CISCOCONFIGMANMIB
Thu Apr  2 09:42:18 2020: GET: CISCO-CONFIG-MAN-MIB::ccmHistoryStartupLastChanged (1.3.6.1.4.1.9.9.43.1.1.3.0) : 1557892202
Thu Apr  2 09:42:18 2020: $self->{components}->{config_subsystem} = Classes::Cisco::IOS::Component::ConfigSubsystem->new()
Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::check_config_subsystem

Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::Component::ConfigSubsystem::check_messages

Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::Component::ConfigSubsystem::set_thresholds

Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::Component::ConfigSubsystem::check_thresholds

[CONFIG]
ccmHistoryRunningLastChanged: 1542786900.19 (Wed Nov 21 08:55:00 2018)
ccmHistoryRunningLastSaved: 1542859141.12 (Thu Nov 22 04:59:01 2018)
ccmHistoryStartupLastChanged: 1542762412.02 (Wed Nov 21 02:06:52 2018)
Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::check_messages

Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::check_messages

Thu Apr  2 09:42:18 2020: AUTOLOAD Classes::Cisco::IOS::nagios_exit

CRITICAL - running config is ahead of startup config since 717107 minutes. changes will be lost in case of a reboot
checking config
running config is ahead of startup config since 717107 minutes. changes will be lost in case of a reboot

The last change to the running config was actually made on "Wed Apr 1 13:33:43 2020", not 717107 minutes ago.
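
The mismatch can be reproduced from the raw values in the debug output above. This is a small Python sketch of the arithmetic only, not the plugin's code; the constant and function names are made up for illustration.

```python
# Raw values copied from the debug output above.
SYS_UPTIME_TICKS = 1567592483      # MIB-2-MIB::sysUpTime, hundredths of a second
SNMP_ENGINE_TIME = 58629848        # SNMP-FRAMEWORK-MIB::snmpEngineTime, seconds
RUNNING_LAST_CHANGED = 1560341019  # ccmHistoryRunningLastChanged, hundredths of a second


def seconds_since_change(uptime_seconds, changed_ticks):
    """Seconds elapsed since the change, given an uptime in seconds."""
    return uptime_seconds - changed_ticks / 100.0


with_sys_uptime = seconds_since_change(SYS_UPTIME_TICKS / 100.0, RUNNING_LAST_CHANGED)
with_engine_time = seconds_since_change(SNMP_ENGINE_TIME, RUNNING_LAST_CHANGED)

print(f"sysUpTime basis:      {with_sys_uptime / 3600:.1f} hours ago")
print(f"snmpEngineTime basis: {with_engine_time / 60:.0f} minutes ago")
```

With sysUpTime as the basis, the change lies about 20 hours before the check ran at 09:42:18 on Apr 2, which agrees with the real change time of Apr 1 13:33:43. With snmpEngineTime as the basis, the result is about 717107 minutes, the figure shown in the CRITICAL message.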

Why does snmpEngineTime take precedence over sysUpTime when computing uptime? Maybe there is a good reason for some systems, but obviously not for our Cisco check-config case.

    if (defined $sysUptime && defined $sysDescr) {
      if ($hrSystemUptime) {
        # On Linux-based devices snmpEngineTime is restarted far too often,
        # so prefer this value.
        $self->{uptime} = $hrSystemUptime;
        # Unless snmpEngineTime is actually larger, in which case it wins
        # after all. The value may occasionally jump here and a performance
        # graph may show spikes, but I don't care about that.
        # This is not about graphs shaped like a rising straight line, but
        # about detecting spontaneous reboots and avoiding
        # false alarms.
        if ($snmpEngineTime && $snmpEngineTime > $hrSystemUptime) {
          $self->{uptime} = $snmpEngineTime;
        }
      } elsif ($snmpEngineTime) {
        $self->{uptime} = $snmpEngineTime;
      } else {
        $self->{uptime} = $sysUptime;
      }

Maybe code should be added to use sysUpTime for uptime when the mode is check-config.
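
A minimal sketch of that idea, written in Python for illustration rather than as a patch to the plugin. The function name, the mode parameter, and the assumption that all inputs are already converted to seconds are hypothetical. The fallback branches mirror the precedence shown in the snippet above.

```python
def choose_uptime(sys_uptime, snmp_engine_time, hr_system_uptime, mode):
    """Pick an uptime in seconds; each input is seconds or None."""
    if mode == "check-config" and sys_uptime is not None:
        # The ccmHistory* values are expressed relative to sysUpTime,
        # so use the same clock when converting them.
        return sys_uptime
    if hr_system_uptime:
        # Prefer hrSystemUptime unless snmpEngineTime is larger.
        if snmp_engine_time and snmp_engine_time > hr_system_uptime:
            return snmp_engine_time
        return hr_system_uptime
    if snmp_engine_time:
        return snmp_engine_time
    return sys_uptime


# Values from the debug output above, converted to seconds.
print(choose_uptime(15675924.83, 58629848, None, "check-config"))
print(choose_uptime(15675924.83, 58629848, None, "some-other-mode"))
```

The first call returns the sysUpTime value and the second falls back to snmpEngineTime, so only check-config changes behavior in this sketch.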