Closed ymartin-ovh closed 1 year ago
Can you show the current Percent_Lifetime_Remain value?
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
...
202 Percent_Lifetime_Remain 0x0030 099 099 001 Old_age Offline - 1
...
Agreed, simply adding the -l parameter should be enough to check the Percent_Lifetime_Remain attribute. I need to check why this didn't work.
@ymartin-ovh can you please try with https://raw.githubusercontent.com/Napsty/check_smart/issue-92/check_smart.pl ? Does it work?
Hello
Your patch fixes the warning threshold when it's not given, but introduces a new bug (as you set the value unconditionally):
ok (threshold set to 90%):
./check_smart.pl -i auto -g '/dev/sda' -w Reallocated_Sector_Ct=250 -l
./check_smart.pl --skip-load-cycles -l -i auto -g '/dev/{sdb,sda}'
ko =>
./check_smart.pl --skip-load-cycles -l -i auto -g '/dev/{sdb,sda}' -w Percent_Lifetime_Remain=85
OK: [/dev/sdb] - Device is clean [/dev/sdb] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90) --- [/dev/sda] - Device is clean [/dev/sda] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)|
Before the patch, I got 85% =>
/usr/lib/nagios/ovh/check_smart --skip-load-cycles -l -i auto -g '/dev/{sdb,sda}' -w Percent_Lifetime_Remain=85
OK: [/dev/sdb] - Device is clean [/dev/sdb] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 85) --- [/dev/sda] - Device is clean [/dev/sda] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 85)|
Can you please run with --debug, as it's easier for me to find out what happens in the background? Thx. You can combine it with --hide-sn to hide sensitive serial numbers.
./check_smart.pl --skip-load-cycles -l -i auto -g '/dev/{sdb,sda}' -w Percent_Lifetime_Remain=85 --debug --hide-sn
Found /dev/sdb
Found /dev/sda
###########################################################
(debug) CHECK 1: getting overall SMART health status for /dev/sdb
###########################################################
(debug) executing:
sudo /usr/sbin/smartctl -d auto -Hi /dev/sdb
(debug) output:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.124-ovh-vps-grsec-zfs-classid] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Micron 5100 Pro / 52x0 / 5300 SSDs
Device Model: Micron_5300_MTFDDAK480TDS
Serial Number: 22263A2BB86F
LU WWN Device Id: 5 00a075 13a2bb86f
Firmware Version: D3MU001
User Capacity: 480,103,981,056 bytes [480 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 (minor revision not indicated)
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Sep 18 11:59:31 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
(debug) parsing line:
Device Model: Micron_5300_MTFDDAK480TDS
(debug) found model: Micron_5300_MTFDDAK480TDS
(debug) parsing line:
Serial Number: 22263A2BB86F
(debug) Hiding serial number
(debug) found serial number <HIDDEN>
(debug) parsing line:
SMART overall-health self-assessment test result: PASSED
(debug) found string 'PASSED'; status OK
###########################################################
(debug) CHECK 2: getting silent SMART health check
###########################################################
(debug) executing:
sudo /usr/sbin/smartctl -d auto -q silent -A /dev/sdb
(debug) exit code:
0
(debug) zero exit code, status OK
###########################################################
(debug) CHECK 3: getting detailed statistics from attributes
(debug) information contains a few more potential trouble spots
(debug) plus, we can also use the information for perfdata/graphing
###########################################################
(debug) executing:
sudo /usr/sbin/smartctl -d auto -A /dev/sdb
(debug) output:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.124-ovh-vps-grsec-zfs-classid] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 050 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 001 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 6549
12 Power_Cycle_Count 0x0032 100 100 001 Old_age Always - 27
170 Reserved_Block_Pct 0x0033 100 100 010 Pre-fail Always - 0
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 001 Old_age Always - 0
173 Avg_Block-Erase_Count 0x0032 098 098 000 Old_age Always - 129
174 Unexpect_Power_Loss_Ct 0x0032 100 100 000 Old_age Always - 26
183 SATA_Int_Downshift_Ct 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 155
194 Temperature_Celsius 0x0022 066 057 000 Old_age Always - 34 (Min/Max 16/43)
195 Hardware_ECC_Recovered 0x0032 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Percent_Lifetime_Remain 0x0030 098 098 001 Old_age Offline - 2
206 Write_Error_Rate 0x000e 100 100 000 Old_age Always - 0
246 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 166575160859
247 Host_Program_Page_Count 0x0032 100 100 000 Old_age Always - 5213407814
248 Bckgnd_Program_Page_Cnt 0x0032 100 100 000 Old_age Always - 373948235
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 100 100 000 Pre-fail Always - 2161
210 RAIN_Success_Recovered 0x0032 100 100 000 Old_age Always - 0
211 Integ_Scan_Complete_Cnt 0x0032 100 100 000 Old_age Always - 63
212 Integ_Scan_Folding_Cnt 0x0032 100 100 000 Old_age Always - 1
(debug) Raw Check List ATA: Current_Pending_Sector,Reallocated_Sector_Ct,Program_Fail_Cnt_Total,Uncorrectable_Error_Cnt,Offline_Uncorrectable,Runtime_Bad_Block,Reported_Uncorrect,Reallocated_Event_Count,Erase_Fail_Count_Total
(debug) Raw Check List NVMe: Media_and_Data_Integrity_Errors
(debug) Exclude List for Checks:
(debug) Exclude List for Perfdata:
(debug) Warning Thresholds:
Percent_Lifetime_Remain=90
(debug) Raw_Read_Error_Rate not in raw check list (raw value: 0)
(debug) Reallocated_Sector_Ct is OK (0)
(debug) Power_On_Hours not in raw check list (raw value: 6549)
(debug) Power_Cycle_Count not in raw check list (raw value: 27)
(debug) Reserved_Block_Pct not in raw check list (raw value: 0)
(debug) Program_Fail_Count not in raw check list (raw value: 0)
(debug) Erase_Fail_Count not in raw check list (raw value: 0)
(debug) Avg_Block-Erase_Count not in raw check list (raw value: 129)
(debug) Unexpect_Power_Loss_Ct not in raw check list (raw value: 26)
(debug) SATA_Int_Downshift_Ct not in raw check list (raw value: 0)
(debug) End-to-End_Error not in raw check list (raw value: 0)
(debug) Reported_Uncorrect is OK (0)
(debug) Command_Timeout not in raw check list (raw value: 155)
(debug) Temperature_Celsius not in raw check list (raw value: 34)
(debug) Hardware_ECC_Recovered not in raw check list (raw value: 0)
(debug) Reallocated_Event_Count is OK (0)
(debug) Current_Pending_Sector is OK (0)
(debug) Offline_Uncorrectable is OK (0)
(debug) UDMA_CRC_Error_Count not in raw check list (raw value: 0)
(debug) Percent_Lifetime_Remain is non-zero (2) but less than 90
(debug) Write_Error_Rate not in raw check list (raw value: 0)
(debug) Total_LBAs_Written not in raw check list (raw value: 166575160859)
(debug) Host_Program_Page_Count not in raw check list (raw value: 5213407814)
(debug) Bckgnd_Program_Page_Cnt not in raw check list (raw value: 373948235)
(debug) Unused_Rsvd_Blk_Cnt_Tot not in raw check list (raw value: 2161)
(debug) RAIN_Success_Recovered not in raw check list (raw value: 0)
(debug) Integ_Scan_Complete_Cnt not in raw check list (raw value: 63)
(debug) Integ_Scan_Folding_Cnt not in raw check list (raw value: 1)
(debug) gathered perfdata:
###########################################################
(debug) LOCAL STATUS: OK, FINAL STATUS: OK
###########################################################
###########################################################
(debug) CHECK 1: getting overall SMART health status for /dev/sda
###########################################################
(debug) executing:
sudo /usr/sbin/smartctl -d auto -Hi /dev/sda
(debug) output:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.124-ovh-vps-grsec-zfs-classid] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Micron 5100 Pro / 52x0 / 5300 SSDs
Device Model: Micron_5300_MTFDDAK480TDS
Serial Number: 22263A2BB83E
LU WWN Device Id: 5 00a075 13a2bb83e
Firmware Version: D3MU001
User Capacity: 480,103,981,056 bytes [480 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 (minor revision not indicated)
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Sep 18 11:59:31 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
(debug) parsing line:
Device Model: Micron_5300_MTFDDAK480TDS
(debug) found model: Micron_5300_MTFDDAK480TDS
(debug) parsing line:
Serial Number: 22263A2BB83E
(debug) Hiding serial number
(debug) found serial number <HIDDEN>
(debug) parsing line:
SMART overall-health self-assessment test result: PASSED
(debug) found string 'PASSED'; status OK
###########################################################
(debug) CHECK 2: getting silent SMART health check
###########################################################
(debug) executing:
sudo /usr/sbin/smartctl -d auto -q silent -A /dev/sda
(debug) exit code:
0
(debug) zero exit code, status OK
###########################################################
(debug) CHECK 3: getting detailed statistics from attributes
(debug) information contains a few more potential trouble spots
(debug) plus, we can also use the information for perfdata/graphing
###########################################################
(debug) executing:
sudo /usr/sbin/smartctl -d auto -A /dev/sda
(debug) output:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.124-ovh-vps-grsec-zfs-classid] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 050 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 001 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 6549
12 Power_Cycle_Count 0x0032 100 100 001 Old_age Always - 27
170 Reserved_Block_Pct 0x0033 100 100 010 Pre-fail Always - 0
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 001 Old_age Always - 0
173 Avg_Block-Erase_Count 0x0032 098 098 000 Old_age Always - 129
174 Unexpect_Power_Loss_Ct 0x0032 100 100 000 Old_age Always - 26
183 SATA_Int_Downshift_Ct 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 155
194 Temperature_Celsius 0x0022 065 057 000 Old_age Always - 35 (Min/Max 16/43)
195 Hardware_ECC_Recovered 0x0032 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Percent_Lifetime_Remain 0x0030 098 098 001 Old_age Offline - 2
206 Write_Error_Rate 0x000e 100 100 000 Old_age Always - 0
246 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 166523290925
247 Host_Program_Page_Count 0x0032 100 100 000 Old_age Always - 5211799331
248 Bckgnd_Program_Page_Cnt 0x0032 100 100 000 Old_age Always - 377450209
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 100 100 000 Pre-fail Always - 2161
210 RAIN_Success_Recovered 0x0032 100 100 000 Old_age Always - 0
211 Integ_Scan_Complete_Cnt 0x0032 100 100 000 Old_age Always - 63
212 Integ_Scan_Folding_Cnt 0x0032 100 100 000 Old_age Always - 0
(debug) Raw Check List ATA: Current_Pending_Sector,Reallocated_Sector_Ct,Program_Fail_Cnt_Total,Uncorrectable_Error_Cnt,Offline_Uncorrectable,Runtime_Bad_Block,Reported_Uncorrect,Reallocated_Event_Count,Erase_Fail_Count_Total
(debug) Raw Check List NVMe: Media_and_Data_Integrity_Errors
(debug) Exclude List for Checks:
(debug) Exclude List for Perfdata:
(debug) Warning Thresholds:
Percent_Lifetime_Remain=90
(debug) Raw_Read_Error_Rate not in raw check list (raw value: 0)
(debug) Reallocated_Sector_Ct is OK (0)
(debug) Power_On_Hours not in raw check list (raw value: 6549)
(debug) Power_Cycle_Count not in raw check list (raw value: 27)
(debug) Reserved_Block_Pct not in raw check list (raw value: 0)
(debug) Program_Fail_Count not in raw check list (raw value: 0)
(debug) Erase_Fail_Count not in raw check list (raw value: 0)
(debug) Avg_Block-Erase_Count not in raw check list (raw value: 129)
(debug) Unexpect_Power_Loss_Ct not in raw check list (raw value: 26)
(debug) SATA_Int_Downshift_Ct not in raw check list (raw value: 0)
(debug) End-to-End_Error not in raw check list (raw value: 0)
(debug) Reported_Uncorrect is OK (0)
(debug) Command_Timeout not in raw check list (raw value: 155)
(debug) Temperature_Celsius not in raw check list (raw value: 35)
(debug) Hardware_ECC_Recovered not in raw check list (raw value: 0)
(debug) Reallocated_Event_Count is OK (0)
(debug) Current_Pending_Sector is OK (0)
(debug) Offline_Uncorrectable is OK (0)
(debug) UDMA_CRC_Error_Count not in raw check list (raw value: 0)
(debug) Percent_Lifetime_Remain is non-zero (2) but less than 90
(debug) Write_Error_Rate not in raw check list (raw value: 0)
(debug) Total_LBAs_Written not in raw check list (raw value: 166523290925)
(debug) Host_Program_Page_Count not in raw check list (raw value: 5211799331)
(debug) Bckgnd_Program_Page_Cnt not in raw check list (raw value: 377450209)
(debug) Unused_Rsvd_Blk_Cnt_Tot not in raw check list (raw value: 2161)
(debug) RAIN_Success_Recovered not in raw check list (raw value: 0)
(debug) Integ_Scan_Complete_Cnt not in raw check list (raw value: 63)
(debug) Integ_Scan_Folding_Cnt not in raw check list (raw value: 0)
(debug) gathered perfdata:
###########################################################
(debug) LOCAL STATUS: OK, FINAL STATUS: OK
###########################################################
(debug) final status/output: OK
(debug) drives ok: [/dev/sdb] - Device is clean [/dev/sdb] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90) [/dev/sda] - Device is clean [/dev/sda] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)
(debug) drives nok:
(debug) msg_list: [/dev/sdb] - Device is clean [/dev/sdb] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)^[/dev/sda] - Device is clean [/dev/sda] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)
OK: [/dev/sdb] - Device is clean [/dev/sdb] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90) --- [/dev/sda] - Device is clean [/dev/sda] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)|
To me it looks like the correct behaviour. Both your drives, sda and sdb, have a Percent_Lifetime_Remain value of 2:
202 Percent_Lifetime_Remain 0x0030 098 098 001 Old_age Offline - 2
202 Percent_Lifetime_Remain 0x0030 098 098 001 Old_age Offline - 2
The attribute list can be seen in the debug output.
So to test the warning threshold, you must set it equal to or lower than 2:
./check_smart.pl --skip-load-cycles -l -i auto -g '/dev/{sdb,sda}' -w Percent_Lifetime_Remain=2 --debug --hide-sn
Please try that and comment here again with your findings.
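The comparison described above can be sketched as follows. This is a minimal Python illustration, not the plugin's Perl code: the function name `evaluate_raw` and the WARNING message wording are hypothetical, and the rule itself (zero is OK, non-zero below the threshold is OK with a note, anything else warns) is an assumption inferred from the debug messages in this thread.

```python
def evaluate_raw(name, raw, thresholds):
    """Return (status, message) for one raw attribute value.

    Assumed rule, inferred from the messages in this thread:
    - raw value 0 is OK
    - non-zero but below the configured threshold is OK, with a note
    - anything else (no threshold, or raw >= threshold) is a WARNING
    """
    if raw == 0:
        return "OK", f"{name} is OK (0)"
    threshold = thresholds.get(name)
    if threshold is not None and raw < threshold:
        return "OK", f"{name} is non-zero ({raw}) (but less than threshold {threshold})"
    # Hypothetical wording; the plugin's real warning text is not shown here.
    return "WARNING", f"{name} is non-zero ({raw})"


# Raw value 2 against the default threshold of 90: stays OK.
print(evaluate_raw("Percent_Lifetime_Remain", 2, {"Percent_Lifetime_Remain": 90}))
# Raw value 2 against a threshold of 2: reaches the threshold, so it warns.
print(evaluate_raw("Percent_Lifetime_Remain", 2, {"Percent_Lifetime_Remain": 2}))
```

Under this assumed rule, a threshold equal to or lower than the raw value of 2 is what it takes to trigger the warning path on these drives.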
PS: I just noticed that --hide-sn didn't properly work. But that's another issue to look at ;-)
No, there is an issue in your patch:
./check_smart.pl --skip-load-cycles -l -i auto -g '/dev/{sdb,sda}' -w Percent_Lifetime_Remain=85
OK: [/dev/sdb] - Device is clean [/dev/sdb] - Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90) --- [/dev/sda] -
I set 85 and the output mentions 90 => but less than threshold 90
Also, in SMART, the raw value of this attribute is inverted relative to the real meaning of the remaining-lifetime percentage. This is explained in the drive datasheet and also in the check_smart Perl code.
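To make the inversion concrete, here is a small Python sketch that parses the attribute row from the debug output. It assumes, based on the comment above, that the raw value counts the percentage of lifetime used; the helper `parse_attribute_row` is a hypothetical name and is not part of check_smart.

```python
def parse_attribute_row(row):
    """Split one `smartctl -A` attribute row into the fields used here."""
    fields = row.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),  # normalized VALUE column ("098" -> 98)
        "raw": int(fields[9]),    # first token of the RAW_VALUE column
    }


row = "202 Percent_Lifetime_Remain 0x0030 098 098 001 Old_age Offline - 2"
attr = parse_attribute_row(row)

# Assumption: the raw value is "percent used", so remaining = 100 - raw.
percent_used = attr["raw"]
percent_remaining = 100 - percent_used
print(attr["name"], "raw:", percent_used, "remaining:", percent_remaining)
```

For this row the result agrees with the normalized VALUE column (098), which is consistent with the assumption but does not prove it for other drive models.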
I set 85 and the output mentions 90 => but less than threshold 90
Ah yes, now I see it.
Let me try to comprehend the issue correctly.
When you enable the Percent_Lifetime_Remain check using -l, the check works and alerts automatically when the value reaches 90. If the value is below 90, the plugin outputs the value, but stays below warning level:
$ ./check_smart.pl -d /dev/sda -i auto --debug -l
[...]
(debug) Warning Thresholds:
Percent_Lifetime_Remain=90
[...]
(debug) Percent_Lifetime_Remain is non-zero (2) but less than 90
[...]
OK: Drive Samsung SSD 850 EVO 500GB S/N XXX: no SMART errors detected. Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)|Reallocated_Sector_Ct=0 Power_On_Hours=26002 Power_Cycle_Count=934 Wear_Leveling_Count=35 Used_Rsvd_Blk_Cnt_Tot=0 Program_Fail_Cnt_Total=0 Erase_Fail_Count_Total=0 Runtime_Bad_Block=0 Uncorrectable_Error_Cnt=0 Airflow_Temperature_Cel=32 ECC_Error_Rate=0 CRC_Error_Count=0 Percent_Lifetime_Remain=2 POR_Recovery_Count=12 Total_LBAs_Written=41523523747
But when you want to override the Percent_Lifetime_Remain threshold (let's say with 50), your own threshold is overwritten again with 90:
$ ./check_smart.pl -d /dev/sda -i auto --debug -l -w "Percent_Lifetime_Remain=50"
[...]
(debug) Warning Thresholds:
Percent_Lifetime_Remain=90
[...]
(debug) Percent_Lifetime_Remain is non-zero (2) but less than 90
[...]
OK: Drive Samsung SSD 850 EVO 500GB S/N XXX: no SMART errors detected. Percent_Lifetime_Remain is non-zero (2) (but less than threshold 90)|Reallocated_Sector_Ct=0 Power_On_Hours=26002 Power_Cycle_Count=934 Wear_Leveling_Count=35 Used_Rsvd_Blk_Cnt_Tot=0 Program_Fail_Cnt_Total=0 Erase_Fail_Count_Total=0 Runtime_Bad_Block=0 Uncorrectable_Error_Cnt=0 Airflow_Temperature_Cel=32 ECC_Error_Rate=0 CRC_Error_Count=0 Percent_Lifetime_Remain=2 POR_Recovery_Count=12 Total_LBAs_Written=41523523747
Is that the problem this issue is about? Or did I misunderstand something?
Note: I faked the smartctl output on this drive, as the Samsung SSDs don't have a Percent_Lifetime_Remain attribute.
Initially, my issue was that when -w is used with another threshold definition like Reallocated_Sector_Ct, Percent_Lifetime_Remain=90 is not pushed into the warn_list (see: https://github.com/Napsty/check_smart/blob/master/check_smart.pl#L231)
when -w is used with another threshold definition like Reallocated_Sector_Ct, Percent_Lifetime_Remain=90 is not pushed in the warn_list
Yep, but this should now work.
$ ./check_smart.pl -d /dev/sda -i auto --debug -l -w "Uncorrectable_Error_Cnt=10,Reallocated_Sector_Ct=10"
[...]
(debug) Warning Thresholds:
Percent_Lifetime_Remain=90
Reallocated_Sector_Ct=10
Uncorrectable_Error_Cnt=10
[...]
Can you confirm with the latest version? -> https://raw.githubusercontent.com/Napsty/check_smart/issue-92/check_smart.pl
No, your patch overwrites the user-given value because the default is pushed at the tail of the warn_list. The default value should be at the head of the list to do this properly. I have provided a fix in #93.
Regards
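Why the position in the list matters can be sketched like this. It is a Python illustration under an assumption: that the list of `NAME=VALUE` entries is turned into a lookup table in order, so a later entry for the same name replaces an earlier one. The names `build_warn_list` and `parse_thresholds` are hypothetical and do not come from check_smart.pl.

```python
DEFAULT_LIFETIME = "Percent_Lifetime_Remain=90"


def build_warn_list(user_arg, default_first=True):
    """Combine the -l default with the user's -w entries (comma separated)."""
    user_items = [item for item in user_arg.split(",") if item]
    if default_first:
        return [DEFAULT_LIFETIME] + user_items  # default at the head
    return user_items + [DEFAULT_LIFETIME]      # default at the tail


def parse_thresholds(warn_list):
    """Build a name -> threshold table; later entries win (assumed)."""
    thresholds = {}
    for item in warn_list:
        name, value = item.split("=", 1)
        thresholds[name] = int(value)
    return thresholds


user = "Percent_Lifetime_Remain=85"
# Default at the tail: it is processed last and replaces the user's 85.
print(parse_thresholds(build_warn_list(user, default_first=False)))
# Default at the head: the user's 85 is processed last and is kept.
print(parse_thresholds(build_warn_list(user, default_first=True)))
# Unrelated threshold only: the default of 90 is still present.
print(parse_thresholds(build_warn_list("Reallocated_Sector_Ct=250")))
```

With the default at the head, all three cases discussed in this thread come out as expected: no -w value, a -w value for another attribute, and a -w value for Percent_Lifetime_Remain itself.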
Thx for the PR. Please set your if condition in line 231: https://github.com/Napsty/check_smart/blob/master/check_smart.pl#L231
This way the Percent_Lifetime_Remain threshold is only set once and added to the warn_list array from the beginning.
The if condition at line 231 is not needed anymore, as it is implemented at line 240 in #93.
Just tested it locally, lgtm:
-l: sets Percent_Lifetime_Remain=90 into warn_list :heavy_check_mark:
-l -w "Percent_Lifetime_Remain=70,CRC_Error_Count=10": works :heavy_check_mark:
-l -w "CRC_Error_Count=10": uses the default threshold of 90 again for Percent_Lifetime_Remain :heavy_check_mark:
Fixed with #93
Hello
It seems there is an issue in the handling of the -w option. When I give a threshold for a particular smartctl item (not lifetime), the Percent_Lifetime_Remain threshold is not set to 90%:
warning => ./check_smart -i auto -g '/dev/sda' -w Reallocated_Sector_Ct=250 -l
ok => ./check_smart -i auto -g '/dev/sda' -w Reallocated_Sector_Ct=250,Percent_Lifetime_Remain=90 -l
ok => ./check_smart -i auto -g '/dev/sda' -l
Before working on a patch, can you tell me if this behaviour is normal or not?
Regards