lausser / check_hpasm

A plugin (monitoring-plugin, not nagios-plugin, see also http://is.gd/PP1330) which checks the hardware health of HP Proliant Servers. (May also be used for other devices which implement the CPQHLTH mib)
http://labs.consol.de/nagios/check_hpasm/
GNU General Public License v2.0

Spare drive is not recognized #34

Open teanva opened 4 months ago

teanva commented 4 months ago

SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.0 = INTEGER: 2
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.1 = INTEGER: 2
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.2 = INTEGER: 2
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.3 = INTEGER: 2
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.4 = INTEGER: 2
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.5 = INTEGER: 2
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.6 = INTEGER: 10
SNMPv2-SMI::enterprises.232.3.2.5.1.1.6.0.7 = INTEGER: 2

./check_hpasm: CRITICAL - physical drive 0:6 is value_10, System: 'proliant dl360 gen10'

./check_hpasm -v:
physical drive 0:0 is ok
physical drive 0:1 is ok
physical drive 0:2 is ok
physical drive 0:3 is ok
physical drive 0:4 is ok
physical drive 0:5 is ok
physical drive 0:6 is value_10
physical drive 0:7 is ok

Fix (current code first, then the proposed change):

cpqDaPhyDrvStatusValue => {
          1 => "other",
          2 => "ok",
          3 => "failed",
          4 => "predictiveFailure",
          5 => "erasing",
          6 => "eraseDone",
          7 => "eraseQueued",
          8 => "ssdWearOut",
          9 => "notAuthenticated",

      },

sub check {
  my $self = shift;
  $self->blacklist('dapd', $self->{name});
  $self->add_info(
      sprintf "physical drive %s is %s",
          $self->{name}, $self->{cpqDaPhyDrvStatus});
  if ($self->{cpqDaPhyDrvStatus} ne 'ok') {
    $self->add_message(CRITICAL,
        sprintf "physical drive %s is %s",
            $self->{name}, $self->{cpqDaPhyDrvStatus});
  }
}

->

cpqDaPhyDrvStatusValue => {
          1 => "other",
          2 => "ok",
          3 => "failed",
          4 => "predictiveFailure",
          5 => "erasing",
          6 => "eraseDone",
          7 => "eraseQueued",
          8 => "ssdWearOut",
          9 => "notAuthenticated",
          10 => "spare",
      },

sub check {
  my $self = shift;
  $self->blacklist('dapd', $self->{name});
  $self->add_info(
      sprintf "physical drive %s is %s",
          $self->{name}, $self->{cpqDaPhyDrvStatus});
  if (($self->{cpqDaPhyDrvStatus} ne 'ok') && ($self->{cpqDaPhyDrvStatus} ne 'spare')) {
    $self->add_message(CRITICAL,
        sprintf "physical drive %s is %s",
            $self->{name}, $self->{cpqDaPhyDrvStatus});
  }
}

./check_hpasm: OK - System: 'proliant dl360 gen10'

./check_hpasm -v:
physical drive 0:0 is ok
physical drive 0:1 is ok
physical drive 0:2 is ok
physical drive 0:3 is ok
physical drive 0:4 is ok
physical drive 0:5 is ok
physical drive 0:6 is spare
physical drive 0:7 is ok
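
For readers who want to try the idea outside the plugin, the proposed change can be sketched as a standalone script. This is only an illustration, not the plugin's actual code: the helper names `status_label` and `drive_state` and the `%non_critical` set are hypothetical, the `value_N` fallback simply mirrors the output shown in this report, and the mapping `10 => "spare"` is the reporter's proposal rather than something confirmed by a published MIB.

```perl
use strict;
use warnings;

# Status map from the proposed fix. 10 => "spare" is the reporter's
# proposal and is an assumption, not a value confirmed by a published MIB.
my %status_name = (
    1  => 'other',
    2  => 'ok',
    3  => 'failed',
    4  => 'predictiveFailure',
    5  => 'erasing',
    6  => 'eraseDone',
    7  => 'eraseQueued',
    8  => 'ssdWearOut',
    9  => 'notAuthenticated',
    10 => 'spare',
);

# Hypothetical helper set: statuses that should not raise CRITICAL.
my %non_critical = map { $_ => 1 } qw(ok spare);

# Translate a raw SNMP integer into a label. Unmapped values get a
# "value_N" placeholder, mirroring the output seen in this report.
sub status_label {
    my ($raw) = @_;
    return $status_name{$raw} // "value_$raw";
}

# Return 'OK' or 'CRITICAL' for one drive status label.
sub drive_state {
    my ($label) = @_;
    return $non_critical{$label} ? 'OK' : 'CRITICAL';
}

# Raw values taken from the snmpwalk in this report, drives 0:0 to 0:7.
my @raw = (2, 2, 2, 2, 2, 2, 10, 2);
for my $i (0 .. $#raw) {
    my $label = status_label($raw[$i]);
    printf "physical drive 0:%d is %s (%s)\n", $i, $label, drive_state($label);
}
```

Keeping the non-critical statuses in one set, instead of chaining `ne` comparisons, means a further harmless status could be added in a single place. Any value missing from the map still ends up CRITICAL, which matches the behaviour reported for `value_10`.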

lausser commented 4 months ago

Hi, where did you get the information that 10 means spare? I searched for a recent version of CPQIDA-MIB, but every copy I found ends at 9. If you could point me to a place where I can download the latest MIBs, I can finally start rewriting this plugin. It is very old, and my other plugins have a structure that makes them much easier to maintain.