glensc / nagios-plugin-check_raid

Nagios/Icinga/Sensu plugin to check current server's RAID status ⛺

Feature-request hpssacli drive predictive failure #174

Open FJerusalem opened 7 years ago

FJerusalem commented 7 years ago

It would be great to implement a predictive failure status for the hp-controller-disks as well:

hpssacli ctrl all show config detail

...
   Physical Drives
      physicaldrive 4I:1:1 (port 4I:box 1:bay 1, SAS, 1 TB, OK)
      physicaldrive 4I:1:2 (port 4I:box 1:bay 2, SAS, 1 TB, OK)
      physicaldrive 4I:1:3 (port 4I:box 1:bay 3, SAS, 1 TB, OK)
      physicaldrive 4I:1:4 (port 4I:box 1:bay 4, SAS, 1 TB, OK)
      physicaldrive 4I:1:5 (port 4I:box 1:bay 5, SAS, 1 TB, Predictive Failure)
      physicaldrive 4I:1:6 (port 4I:box 1:bay 6, SAS, 1 TB, OK)
      physicaldrive 4I:1:7 (port 4I:box 1:bay 7, SAS, 1 TB, OK)
      physicaldrive 4I:1:8 (port 4I:box 1:bay 8, SAS, 1 TB, OK)
      physicaldrive 4I:1:9 (port 4I:box 1:bay 9, SAS, 1 TB, OK)
      physicaldrive 4I:1:10 (port 4I:box 1:bay 10, SAS, 1 TB, OK)
      physicaldrive 4I:1:11 (port 4I:box 1:bay 11, SAS, 1 TB, Predictive Failure)
      physicaldrive 4I:1:12 (port 4I:box 1:bay 12, SAS, 1 TB, OK)
...

/opt/check_raid.pl -p hpssacli -d

Visit https://github.com/glensc/nagios-plugin-check_raid#reporting-bugs how to report bugs

DEBUG EXEC: /usr/sbin/hpssacli controller all show status at /opt/check_raid.pl line 474.
DEBUG EXEC: /usr/sbin/hpssacli controller slot=1 logicaldrive all show at /opt/check_raid.pl line 474.
OK: hpssacli:[Smart Array P800: Array A(OK)[LUN1:OK,LUN2:OK]]

FJerusalem commented 7 years ago

Could be implemented via

MYHDDRESULT=$(hpssacli ctrl all show config detail | awk '/Physical Drives/,/Array/' | grep "Predictive Failure" | awk '{print $2;}' | tr '\n' ',')
if [ -z "$MYHDDRESULT" ]
then
    echo "OK - No HDD with predictive failure present"
else
    echo "WARNING - Predictive failure on $MYHDDRESULT"
fi

glensc commented 7 years ago

blockquote markdown is three backticks (```), not dots (...). also contributing guidelines require that you post output of all commands, not random ones of your choice. and do not truncate anything.

FJerusalem commented 7 years ago

Pardon me...

# hpssacli ctrl all show config detail

Smart Array P800 in Slot 1
   Bus Interface: PCI
   Slot: 1
   Serial Number: XXXXXXXXXXXXXX
   Cache Serial Number: XXXXXXXXXXXXXX
   RAID 6 (ADG) Status: Enabled
   Controller Status: OK
   Hardware Revision: E
   Firmware Version: 7.24
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 3 secs
   Surface Scan Mode: Idle
   Parallel Surface Scan Supported: No
   Queue Depth: Automatic
   Monitor and Performance Delay: 60  min
   Elevator Sort: Enabled
   Degraded Performance Optimization: Disabled
   Inconsistency Repair Policy: Disabled
   Wait for Cache Room: Disabled
   Surface Analysis Inconsistency Notification: Disabled
   Post Prompt Timeout: 15 secs
   Cache Board Present: True
   Cache Status: OK
   Cache Ratio: 25% Read / 75% Write
   Drive Write Cache: Disabled
   Total Cache Size: 512 MB
   Total Cache Memory Available: 456 MB
   No-Battery Write Cache: Enabled
   Cache Backup Power Source: Batteries
   Battery/Capacitor Count: 2
   Battery/Capacitor Status: Failed (Replace Batteries)
   Failed Battery Location: Battery 1
   Failed Battery Location: Battery 2
   SATA NCQ Supported: True
   Number of Ports: 4 (2 Internal / 2 External )
   Driver Name: cciss
   Driver Version: 3.6.26
   PCI Address (Domain:Bus:Device.Function): 0000:04:00.0
   Host Serial Number: XXXXXXXXXX
   Sanitize Erase Supported: False
   Primary Boot Volume: logicaldrive 1 (600508B100104D395358313146390017)
   Secondary Boot Volume: None

   Port Name: 1I
         Port ID: 0
         Port Connection Number: 0
         SAS Address: 0000000000000000
         Port Location: Internal

   Port Name: 2I
         Port ID: 1
         Port Connection Number: 1
         SAS Address: 0000000000000000
         Port Location: Internal

   Port Name: 1E
         Port ID: 2
         Port Connection Number: 2
         SAS Address: 0000000000000000
         Port Location: External

   Port Name: 2E
         Port ID: 3
         Port Connection Number: 3
         SAS Address: 0000000000000000
         Port Location: External

   HP14HDD          at Port 4I, Box 1, OK
      Power Supply Status: Not Redundant
      Vendor ID: HP
      Serial Number:
      Firmware Version: 1.12
      Drive Bays: 14
      Port: 4I
      Box: 1
      Location: Internal

   Expander 249
      Device Number: 249
      Firmware Version: 1.12
      WWID: 50001C1071540000
      Port: 4I
      Box: 1
      Vendor ID: HP

   Enclosure SEP (Vendor ID HP, Model HP14HDD) 247
      Device Number: 247
      Firmware Version: 1.12
      WWID: 50001C1071540013
      Port: 4I
      Box: 1
      Vendor ID: HP
      Model: HP14HDD

   Physical Drives
      physicaldrive 4I:1:1 (port 4I:box 1:bay 1, SAS, 1 TB, OK)
      physicaldrive 4I:1:2 (port 4I:box 1:bay 2, SAS, 1 TB, OK)
      physicaldrive 4I:1:3 (port 4I:box 1:bay 3, SAS, 1 TB, OK)
      physicaldrive 4I:1:4 (port 4I:box 1:bay 4, SAS, 1 TB, OK)
      physicaldrive 4I:1:5 (port 4I:box 1:bay 5, SAS, 1 TB, Predictive Failure)
      physicaldrive 4I:1:6 (port 4I:box 1:bay 6, SAS, 1 TB, OK)
      physicaldrive 4I:1:7 (port 4I:box 1:bay 7, SAS, 1 TB, OK)
      physicaldrive 4I:1:8 (port 4I:box 1:bay 8, SAS, 1 TB, OK)
      physicaldrive 4I:1:9 (port 4I:box 1:bay 9, SAS, 1 TB, OK)
      physicaldrive 4I:1:10 (port 4I:box 1:bay 10, SAS, 1 TB, OK)
      physicaldrive 4I:1:11 (port 4I:box 1:bay 11, SAS, 1 TB, Predictive Failure)
      physicaldrive 4I:1:12 (port 4I:box 1:bay 12, SAS, 1 TB, OK)

   Array: A
      Interface Type: SAS
      Unused Space: 0  MB (0.0%)
      Used Space: 10.9 TB (100.0%)
      Status: OK
      MultiDomain Status: OK
      Array Type: Data

      Logical Drive: 1
         Size: 16.0 GB
         Fault Tolerance: 1+0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 4112
         Strip Size: 64 KB
         Full Stripe Size: 384 KB
         Status: OK
         MultiDomain Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B100104D395358313146390017
         Disk Name: /dev/cciss/c0d0
         Mount Points: /boot 487 MB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: AA846BCCPAFGF0M9SX11F980BB
         Mirror Group 1:
            physicaldrive 4I:1:1 (port 4I:box 1:bay 1, SAS, 1 TB, OK)
            physicaldrive 4I:1:2 (port 4I:box 1:bay 2, SAS, 1 TB, OK)
            physicaldrive 4I:1:3 (port 4I:box 1:bay 3, SAS, 1 TB, OK)
            physicaldrive 4I:1:4 (port 4I:box 1:bay 4, SAS, 1 TB, OK)
            physicaldrive 4I:1:5 (port 4I:box 1:bay 5, SAS, 1 TB, Predictive Failure)
            physicaldrive 4I:1:6 (port 4I:box 1:bay 6, SAS, 1 TB, OK)
         Mirror Group 2:
            physicaldrive 4I:1:7 (port 4I:box 1:bay 7, SAS, 1 TB, OK)
            physicaldrive 4I:1:8 (port 4I:box 1:bay 8, SAS, 1 TB, OK)
            physicaldrive 4I:1:9 (port 4I:box 1:bay 9, SAS, 1 TB, OK)
            physicaldrive 4I:1:10 (port 4I:box 1:bay 10, SAS, 1 TB, OK)
            physicaldrive 4I:1:11 (port 4I:box 1:bay 11, SAS, 1 TB, Predictive Failure)
            physicaldrive 4I:1:12 (port 4I:box 1:bay 12, SAS, 1 TB, OK)
         Drive Type: Data
         LD Acceleration Method: Controller Cache
      Logical Drive: 2
         Size: 5.4 TB
         Fault Tolerance: 1+0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 64 KB
         Full Stripe Size: 384 KB
         Status: OK
         MultiDomain Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B100104D395358313146390018
         Disk Name: /dev/cciss/c0d1
         Mount Points: None
         OS Status: LOCKED
         Logical Drive Label: AA846BDBPAFGF0M9SX11F99988
         Mirror Group 1:
            physicaldrive 4I:1:1 (port 4I:box 1:bay 1, SAS, 1 TB, OK)
            physicaldrive 4I:1:2 (port 4I:box 1:bay 2, SAS, 1 TB, OK)
            physicaldrive 4I:1:3 (port 4I:box 1:bay 3, SAS, 1 TB, OK)
            physicaldrive 4I:1:4 (port 4I:box 1:bay 4, SAS, 1 TB, OK)
            physicaldrive 4I:1:5 (port 4I:box 1:bay 5, SAS, 1 TB, Predictive Failure)
            physicaldrive 4I:1:6 (port 4I:box 1:bay 6, SAS, 1 TB, OK)
         Mirror Group 2:
            physicaldrive 4I:1:7 (port 4I:box 1:bay 7, SAS, 1 TB, OK)
            physicaldrive 4I:1:8 (port 4I:box 1:bay 8, SAS, 1 TB, OK)
            physicaldrive 4I:1:9 (port 4I:box 1:bay 9, SAS, 1 TB, OK)
            physicaldrive 4I:1:10 (port 4I:box 1:bay 10, SAS, 1 TB, OK)
            physicaldrive 4I:1:11 (port 4I:box 1:bay 11, SAS, 1 TB, Predictive Failure)
            physicaldrive 4I:1:12 (port 4I:box 1:bay 12, SAS, 1 TB, OK)
         Drive Type: Data
         LD Acceleration Method: Controller Cache

      physicaldrive 4I:1:1
         Port: 4I
         Box: 1
         Bay: 1
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD4 (FW update is recommended to minimum version: HPD5)
         Serial Number: 9QJ2L8V600009917W2CM
         Model: HP      DB1000BABFF
         Current Temperature (C): 25
         Maximum Temperature (C): 46
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:2
         Port: 4I
         Box: 1
         Bay: 2
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD8
         Serial Number: 9QJ1MP7C00009901ZK4M
         Model: HP      DB1000BABFF
         Current Temperature (C): 25
         Maximum Temperature (C): 50
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:3
         Port: 4I
         Box: 1
         Bay: 3
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD5
         Serial Number: 9QJ4NT9Z000090024E2G
         Model: HP      DB1000BABFF
         Current Temperature (C): 26
         Maximum Temperature (C): 44
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:4
         Port: 4I
         Box: 1
         Bay: 4
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD8
         Serial Number: 9QJ4QSX700009004ZWZS
         Model: HP      DB1000BABFF
         Current Temperature (C): 26
         Maximum Temperature (C): 41
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:5
         Port: 4I
         Box: 1
         Bay: 5
         Status: Predictive Failure
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD4 (FW update is recommended to minimum version: HPD5)
         Serial Number: 9QJ3L8AG00009937XP4L
         Model: HP      DB1000BABFF
         Current Temperature (C): 26
         Maximum Temperature (C): 39
         PHY Count: 2
         PHY Transfer Rate: 1.5Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:6
         Port: 4I
         Box: 1
         Bay: 6
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD5
         Serial Number: 9QJ4QTBD00009004ZK8P
         Model: HP      DB1000BABFF
         Current Temperature (C): 26
         Maximum Temperature (C): 41
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:7
         Port: 4I
         Box: 1
         Bay: 7
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD3 (FW update is recommended to minimum version: HPD5)
         Serial Number: Z1N34R8600009315YX1L
         Model: HP      MB1000FBZPL
         Current Temperature (C): 25
         Maximum Temperature (C): 31
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:8
         Port: 4I
         Box: 1
         Bay: 8
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD8
         Serial Number: 9QJ8JJMR00009041XW1P
         Model: HP      MB1000BAWJP
         Current Temperature (C): 25
         Maximum Temperature (C): 36
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:9
         Port: 4I
         Box: 1
         Bay: 9
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD5
         Serial Number: 9QJ4QT7G00009004ZK46
         Model: HP      DB1000BABFF
         Current Temperature (C): 28
         Maximum Temperature (C): 46
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:10
         Port: 4I
         Box: 1
         Bay: 10
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD1
         Serial Number: Z1N136350000C234BZ7K
         Model: HP      MB1000FBZPL
         Current Temperature (C): 23
         Maximum Temperature (C): 35
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:11
         Port: 4I
         Box: 1
         Bay: 11
         Status: Predictive Failure
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD5
         Serial Number: 9QJ4QTT000009005JLUK
         Model: HP      DB1000BABFF
         Current Temperature (C): 25
         Maximum Temperature (C): 37
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

      physicaldrive 4I:1:12
         Port: 4I
         Box: 1
         Bay: 12
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 1 TB
         Drive exposed to OS: False
         Native Block Size: 512
         Rotational Speed: 7200
         Firmware Revision: HPD5
         Serial Number: 9QJ4NTJK000090023VD8
         Model: HP      DB1000BABFF
         Current Temperature (C): 26
         Maximum Temperature (C): 37
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
         Sanitize Erase Supported: False

   Enclosure SEP (Vendor ID HP, Model HP14HDD) 247
      Device Number: 247
      Firmware Version: 1.12
      WWID: 50001C1071540013
      Port: 4I
      Box: 1
      Vendor ID: HP
      Model: HP14HDD

   Expander 249
      Device Number: 249
      Firmware Version: 1.12
      WWID: 50001C1071540000
      Port: 4I
      Box: 1
      Vendor ID: HP

   Expander 250
      Device Number: 250
      Firmware Version: 1.02
      WWID: 5001438004542DBF
      Vendor ID: HP

   SEP (Vendor ID HP, Model P800) 248
      Device Number: 248
      Firmware Version: 1.02
      WWID: 5001438004542DBE
      Vendor ID: HP
      Model: P800

Sorry, I suck at Perl. It could be implemented via (bash):

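# Collect the IDs of drives flagged "Predictive Failure" in the summary list,
# joined with commas (paste avoids the trailing comma that tr would leave).
MYHDDRESULT=$(hpssacli ctrl all show config detail | awk '/Physical Drives/,/Array/' | grep "Predictive Failure" | awk '{print $2;}' | paste -sd, -)
if [ -z "$MYHDDRESULT" ]
then
    echo "OK - No HDD with predictive failure present"
    exit 0
else
    echo "WARNING - Predictive failure on $MYHDDRESULT"
    exit 1   # Nagios treats exit status 1 as WARNING
fi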
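
The summary list only yields the drive ID. As a rough sketch of an alternative, the per-drive detail blocks in the output above also carry `Status:` and `Serial Number:` lines, so one awk pass could report the serial of each affected disk as well. The function names `predictive_drives` and `check_predictive` are made up for illustration and are not part of check_raid. The sketch assumes that `Status:` comes before `Serial Number:` inside each block, as it does in the output posted here. Both functions read the hpssacli output from stdin, so they can be tried on a saved copy.

```shell
#!/bin/sh
# Hypothetical helper: reads `hpssacli ctrl all show config detail` output on
# stdin and prints "ID (serial)" for each drive whose detail block reports
# "Status: Predictive Failure".
predictive_drives() {
    awk '
        # A detail block starts with a bare "physicaldrive <id>" line (2 fields);
        # the summary and mirror-group lines have more fields and are skipped.
        $1 == "physicaldrive" && NF == 2 { id = $2; bad = 0; next }
        # Only look at Status/Serial lines while inside a detail block.
        id != "" && /^ *Status: Predictive Failure/ { bad = 1; next }
        id != "" && /^ *Serial Number:/ {
            if (bad) print id " (" $3 ")"
            id = ""
        }
    '
}

# Hypothetical wrapper: prints a Nagios-style status line and returns
# 0 (OK) or 1 (WARNING).
check_predictive() {
    drives=$(predictive_drives | paste -sd, -)
    if [ -z "$drives" ]; then
        echo "OK - No HDD with predictive failure present"
        return 0
    fi
    echo "WARNING - Predictive failure on $drives"
    return 1
}
```

It would be called as `hpssacli ctrl all show config detail | check_predictive`, with the return status passed on as the plugin's exit code.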

Ashark commented 3 years ago

Does it still need test data?