Napsty / check_smart

Monitoring Plugin to check hard drives, solid state drives and NVMe drives using SMART
https://www.claudiokuenzler.com/monitoring-plugins/check_smart.php
GNU General Public License v3.0

flag to disable temperature check #82

Closed ggzengel closed 2 years ago

ggzengel commented 2 years ago

Some disks have strange max temp (25°C ???) and build date (year 2002 ???):

# /var/lib/icinga2/checks/check_smart.pl -d /dev/sda -i auto
CRITICAL: Drive  Hitachi HDS722020ALA330 S/N 11111:  Disk temperature is higher than maximum|temperature=30;;25

# /var/lib/icinga2/checks/check_smart.pl -g /dev/sd\? -i auto
CRITICAL: [/dev/sda] - Disk temperature is higher than maximum --- [/dev/sdb] - Disk temperature is higher than maximum --- [/dev/sdc] - Disk temperature is higher than maximum --- [/dev/sdd] - Disk temperature is higher than maximum --- [/dev/sde] - Disk temperature is higher than maximum --- [/dev/sdf] - Disk temperature is higher than maximum --- [/dev/sdg] - Disk temperature is higher than maximum --- [/dev/sdh] - Disk temperature is higher than maximum --- [/dev/sdi] - Disk temperature is higher than maximum --- [/dev/sdj] - Disk temperature is higher than maximum --- [/dev/sdk] - Disk temperature is higher than maximum --- [/dev/sdl] - Disk temperature is higher than maximum --- [/dev/sdm] - Disk temperature is higher than maximum --- [/dev/sdn] - Disk temperature is higher than maximum --- [/dev/sdo] - Disk temperature is higher than maximum --- [/dev/sdp] - Disk temperature is higher than maximum|

# smartctl -a -d auto /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.11.22-3-pve] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               Hitachi
Product:              HDS722020ALA330
Revision:             R001
Compliance:           SPC-3
User Capacity:        2,000,398,934,016 bytes [2.00 TB]
Logical block size:   512 bytes
Rotation Rate:        10000 rpm
Logical Unit id:      
Serial number:        
Device type:          disk
Transport protocol:   Fibre channel (FCP-2)
Local Time is:        Sun Apr 10 22:48:26 2022 UTC
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     30 C
Drive Trip Temperature:        25 C

Manufactured in week 30 of year 2002
Specified cycle count over device lifetime:  4278190080
Accumulated start-stop cycles:  256
Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count:        0

Device does not support Self Test logging

# smartctl -a /dev/sda -A
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.27-1-pve] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               Seagate
Product:              ST4000VN008-2DR1
Revision:             R001
Compliance:           SPC-3
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Rotation Rate:        10000 rpm
Logical Unit id:      
Serial number:        
Device type:          disk
Transport protocol:   Fibre channel (FCP-2)
Local Time is:        Sun Apr 10 22:55:42 2022 UTC
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     30 C
Drive Trip Temperature:        25 C

Manufactured in week 30 of year 2002
Specified cycle count over device lifetime:  4278190080
Accumulated start-stop cycles:  256
Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count:        0

Device does not support Self Test logging

Napsty commented 2 years ago

> Some disks have strange max temp (25°C ???) and build date (year 2002 ???):

It seems that these drives were wrongly identified (as very old drives) and therefore now show different values in the SMART information section. You should create a ticket for this in the smartmontools project: https://www.smartmontools.org/report

However, your idea of a flag to disable the temperature check is valid; I will add that.

Napsty commented 2 years ago

@ggzengel can you please test with:

https://raw.githubusercontent.com/Napsty/check_smart/issue-82/check_smart.pl

Use the newly added parameter --skip-temp-check
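
For readers following along, here is a minimal sketch of what a skip flag like this might do, assuming the temperature check simply compares the "Current Drive Temperature" against the "Drive Trip Temperature" from the smartctl output shown earlier in this issue. It is written in Python for illustration only; the plugin itself is Perl, and the function name `check_temperature`, the parameter `skip_temp_check`, and the strict `>` comparison are assumptions, not the plugin's actual code.

```python
import re

# Hypothetical patterns for the two SCSI-style temperature lines that
# appear in the smartctl output quoted in this issue.
CURRENT_RE = re.compile(r"^Current Drive Temperature:\s+(\d+)\s*C", re.MULTILINE)
TRIP_RE = re.compile(r"^Drive Trip Temperature:\s+(\d+)\s*C", re.MULTILINE)


def check_temperature(smartctl_output, skip_temp_check=False):
    """Return a (status, message) tuple for the temperature part of a check.

    Illustrative only: the real check_smart.pl logic may differ.
    """
    if skip_temp_check:
        # The flag short-circuits the comparison entirely.
        return ("OK", "temperature check skipped")
    current = CURRENT_RE.search(smartctl_output)
    trip = TRIP_RE.search(smartctl_output)
    if current is None or trip is None:
        return ("OK", "no temperature data reported")
    current_c = int(current.group(1))
    trip_c = int(trip.group(1))
    if current_c > trip_c:  # assumed comparison; the plugin may use >=
        return ("CRITICAL", "Disk temperature is higher than maximum")
    return ("OK", f"temperature {current_c} C within limit {trip_c} C")


# The two temperature lines reported for the drives in this issue.
SAMPLE = """\
Current Drive Temperature:     30 C
Drive Trip Temperature:        25 C
"""

print(check_temperature(SAMPLE))
print(check_temperature(SAMPLE, skip_temp_check=True))
```

With the values reported above (30 C current, 25 C trip), the first call returns a CRITICAL status and the second returns OK, which is the behavior the new parameter is meant to provide for drives that report an implausible trip temperature.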