Open flederohr opened 1 year ago
The Last Test Age column had been displayed correctly for years without any issues. In the latest SMART report, however, I got this output:
+-------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|Device |Serial                  |Temp| Power|Start|Spin |ReAlloc|Current|Offline |Seek  |Total     |High  |    Command|Last|
|       |Number                  |    |    On|Stop |Retry|Sectors|Pending|Uncorrec|Errors|Seeks     |Fly   |    Timeout|Test|
|       |                        |    | Hours|Count|Count|       |Sectors|Sectors |      |          |Writes|    Count  |Age |
+-------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|ada0 ? |WD-************         |39  | 65620|  186|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2732*|
...

########## SATA drive /dev/ada0 Serial: WD-************ ##########
Western Digital Red (WDC ************)
SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   182   173   021    Pre-fail  Always       -       3900
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       188
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   011   011   000    Old_age   Always       -       65620
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       186
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       154
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3283
194 Temperature_Celsius     0x0022   108   094   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

No Errors Logged

Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
Short offline       Completed without error       00%        61         -
On further analysis I found that the S.M.A.R.T. LifeTime(hours) counter in the self-test log seems to have reset itself:
/usr/local/sbin/smartctl -l selftest /dev/ada0
smartctl 7.2 2021-09-14 r5236 [FreeBSD 13.1-RELEASE-p2 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        61         -
# 2  Extended offline    Completed without error       00%     65483         -
# 3  Short offline       Completed without error       00%     65334         -
# 4  Short offline       Completed without error       00%     65214         -
# 5  Extended offline    Completed without error       00%     65099         -
# 6  Short offline       Completed without error       00%     64974         -
# 7  Short offline       Completed without error       00%     64854         -
# 8  Extended offline    Completed without error       00%     64739         -
# 9  Short offline       Completed without error       00%     64590         -
According to the following resource, this counter is normally stored in a 16-bit field, although this can differ between HDD vendors: https://serverfault.com/questions/1041661/s-m-a-r-t-lifetime-hours-resetting-to-zero
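As a rough illustration of that explanation, the wrap can be reproduced with plain shell arithmetic. This is only a sketch: it assumes the self-test log keeps just the low 16 bits of the power-on hours (which may not hold for every vendor), and `wrapped_hours` is a hypothetical helper, not part of smartctl or smart_report.sh.

```shell
# Assumption: the self-test log timestamp is an unsigned 16-bit field,
# so only the low 16 bits of the power-on hours survive.
wrapped_hours() {
    echo $(( $1 & 0xFFFF ))
}

wrapped_hours 65483   # prints 65483 (still fits in 16 bits)
wrapped_hours 65536   # prints 0     (first hour after the wrap)
wrapped_hours 65620   # prints 84    (Power_On_Hours from the report)
```

Under this assumption, the logged value of 61 would correspond to a test run at about 65597 power-on hours, roughly a day before the report was generated.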
I was able to fix the issue on my system by adding a modulo operation to the calculation:
testAge=sprintf("%.0f", ((onHours % 65535) - lastTestHours) / 24);
https://github.com/Spearfoot/FreeNAS-scripts/blob/06ccffb9710b3d372ccefe0de4b093e00cb2a00c/smart_report.sh#L131
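A possible variation on this one-liner, offered as a sketch rather than a tested patch: a 16-bit field wraps at 65536 rather than 65535, and reducing only `onHours` gives a negative age when the last test was logged shortly before the wrap. Taking the difference first and then reducing it modulo 65536 avoids both problems. The function name `last_test_age_days` is hypothetical, and the 16-bit width is an assumption that may not hold for every drive.

```shell
# Hypothetical helper, not part of smart_report.sh.
# Assumption: LifeTime(hours) in the self-test log wraps at 65536 (16 bits).
last_test_age_days() {
    awk -v onHours="$1" -v lastTestHours="$2" 'BEGIN {
        wrap = 65536
        # Hours since the last test, reduced modulo the field width.
        # Adding wrap before the second % guards against a negative remainder.
        diff = ((onHours - lastTestHours) % wrap + wrap) % wrap
        printf "%.0f\n", diff / 24
    }'
}

last_test_age_days 65620 61      # values from the report: prints 1
last_test_age_days 65540 65483   # last test logged before the wrap: prints 2
```

This still cannot tell apart tests that are more than 65536 hours old, since that information is not present in the wrapped field.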
Oh my, thank you! Thought the tests hadn't been running and I was panicking. Will PR this change.