truenas / py-SMART

Wrapper for smartctl (smartmontools)
GNU Lesser General Public License v2.1
Inconsistency in self test capabilities #49

Closed tirolerstefan closed 2 years ago

tirolerstefan commented 2 years ago

I have a disk "LITEONIT LCT-512L9S-11 2.5 7mm 512GB". I first installed pySMART 1.1.0, then the latest master.

Executing a "short" self test with smartctl, directly:

# smartctl -t short /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.4.0-126-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Tue Oct 18 08:00:03 2022

Executing it using pySMART:

>>> from pySMART import Device
>>> dev = Device("/dev/sda")

>>> dev.name 
'sda'

>>> dev.model
'LITEONIT LCT-512L9S-11 2.5 7mm 512GB'

>>> dev.test_capabilities
{'offline': True, 'short': False, 'long': True, 'conveyance': False, 'selective': False}

>>> dev.run_selftest("short")
(2, "Device sda does not support the 'short' test ", None)

Why does test_capabilities report False for "short", although smartctl runs the short self-test without problems? Thanks!

tirolerstefan commented 2 years ago

Additional info of the device:

# smartctl -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.4.0-126-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     LITEONIT LCT-512L9S-11 2.5 7mm 512GB
Serial Number:    TW0HN71H5508549A1220
Firmware Version: HC9110D
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 4a
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Oct 18 08:18:02 2022 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (   10) seconds.
Offline data collection
capabilities:            (0x15) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Abort Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   1) minutes.
Extended self-test routine
recommended polling time:    (  10) minutes.
SCT capabilities:          (0x003d) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0003   100   100   000    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always       -       2376
 12 Power_Cycle_Count       0x0003   100   100   000    Pre-fail  Always       -       1000
175 Program_Fail_Count_Chip 0x0003   100   100   000    Pre-fail  Always       -       0
176 Erase_Fail_Count_Chip   0x0003   100   100   000    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0003   100   100   000    Pre-fail  Always       -       21534
178 Used_Rsvd_Blk_Cnt_Chip  0x0003   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0003   100   100   000    Pre-fail  Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   005    Pre-fail  Always       -       1472
181 Program_Fail_Cnt_Total  0x0003   100   100   000    Pre-fail  Always       -       0
182 Erase_Fail_Count_Total  0x0003   100   100   000    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0003   100   100   000    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x0003   100   100   000    Pre-fail  Always       -       0
241 Total_LBAs_Written      0x0003   100   100   000    Pre-fail  Always       -       176271
242 Total_LBAs_Read         0x0003   100   100   000    Pre-fail  Always       -       64812

SMART Error Log Version: 0
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      2376         -
# 2  Short offline       Completed without error       00%      2376         -
# 3  Short offline       Completed without error       00%      1501         -
# 4  Short offline       Completed without error       00%      1499         -
# 5  Extended offline    Aborted by host               10%      1499         -
# 6  Short offline       Completed without error       00%      1498         -
# 7  Short offline       Completed without error       00%      1498         -
# 8  Short offline       Completed without error       00%      1498         -
# 9  Short offline       Completed without error       00%      1454         -

Selective Self-tests/Logging not supported

# smartctl -d test /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.4.0-126-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/sda: Device of type 'scsi' [SCSI] detected
/dev/sda [SAT]: Device open changed type from 'scsi' to 'sat'
/dev/sda [SAT]: Device of type 'sat' [ATA] opened

# smartctl --scan-open
/dev/sda -d sat # /dev/sda [SAT], ATA device

tirolerstefan commented 2 years ago

Further info and tests (after investigating device.py):

>>> from pySMART.smartctl import Smartctl, SMARTCTL
>>> smartctl = SMARTCTL

>>> from pySMART.utils import smartctl_type
>>> interface = smartctl_type("sata")

>>> interface
'ata'

>>> smartctl.all(interface, "/dev/sda")
['smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.4.0-126-generic] (local build)', 'Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org', '', '=== START OF INFORMATION SECTION ===', 'Device Model:     LITEONIT LCT-512L9S-11 2.5 7mm 512GB', 'Serial Number:    TW0HN71H5508549A1220', 'Firmware Version: HC9110D', 'User Capacity:    512,110,190,592 bytes [512 GB]', 'Sector Size:      512 bytes logical/physical', 'Rotation Rate:    Solid State Device', 'Device is:        Not in smartctl database [for details use: -P showall]', 'ATA Version is:   ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 4a', 'SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)', 'Local Time is:    Tue Oct 18 09:10:02 2022 UTC', 'SMART support is: Available - device has SMART capability.', 'SMART support is: Enabled', '', '=== START OF READ SMART DATA SECTION ===', 'SMART overall-health self-assessment test result: PASSED', '', 'General SMART Values:', 'Offline data collection status:  (0x00)\tOffline data collection activity', '\t\t\t\t\twas never started.', '\t\t\t\t\tAuto Offline Data Collection: Disabled.', 'Self-test execution status:      (   0)\tThe previous self-test routine completed', '\t\t\t\t\twithout error or no self-test has ever ', '\t\t\t\t\tbeen run.', 'Total time to complete Offline ', 'data collection: \t\t(   10) seconds.', 'Offline data collection', 'capabilities: \t\t\t (0x15) SMART execute Offline immediate.', '\t\t\t\t\tNo Auto Offline data collection support.', '\t\t\t\t\tAbort Offline collection upon new', '\t\t\t\t\tcommand.', '\t\t\t\t\tNo Offline surface scan supported.', '\t\t\t\t\tSelf-test supported.', '\t\t\t\t\tNo Conveyance Self-test supported.', '\t\t\t\t\tNo Selective Self-test supported.', 'SMART capabilities:            (0x0003)\tSaves SMART data before entering', '\t\t\t\t\tpower-saving mode.', '\t\t\t\t\tSupports SMART auto save timer.', 'Error logging capability:        (0x01)\tError logging supported.', '\t\t\t\t\tGeneral Purpose Logging 
supported.', 'Short self-test routine ', 'recommended polling time: \t (   1) minutes.', 'Extended self-test routine', 'recommended polling time: \t (  10) minutes.', 'SCT capabilities: \t       (0x003d)\tSCT Status supported.', '\t\t\t\t\tSCT Error Recovery Control supported.', '\t\t\t\t\tSCT Feature Control supported.', '\t\t\t\t\tSCT Data Table supported.', '', 'SMART Attributes Data Structure revision number: 1', 'Vendor Specific SMART Attributes with Thresholds:', 'ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE', '  5 Reallocated_Sector_Ct   0x0003   100   100   000    Pre-fail  Always       -       0', '  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always       -       2376', ' 12 Power_Cycle_Count       0x0003   100   100   000    Pre-fail  Always       -       1000', '175 Program_Fail_Count_Chip 0x0003   100   100   000    Pre-fail  Always       -       0', '176 Erase_Fail_Count_Chip   0x0003   100   100   000    Pre-fail  Always       -       0', '177 Wear_Leveling_Count     0x0003   100   100   000    Pre-fail  Always       -       21534', '178 Used_Rsvd_Blk_Cnt_Chip  0x0003   100   100   000    Pre-fail  Always       -       0', '179 Used_Rsvd_Blk_Cnt_Tot   0x0003   100   100   000    Pre-fail  Always       -       0', '180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   005    Pre-fail  Always       -       1472', '181 Program_Fail_Cnt_Total  0x0003   100   100   000    Pre-fail  Always       -       0', '182 Erase_Fail_Count_Total  0x0003   100   100   000    Pre-fail  Always       -       0', '187 Reported_Uncorrect      0x0003   100   100   000    Pre-fail  Always       -       0', '195 Hardware_ECC_Recovered  0x0003   100   100   000    Pre-fail  Always       -       0', '241 Total_LBAs_Written      0x0003   100   100   000    Pre-fail  Always       -       176272', '242 Total_LBAs_Read         0x0003   100   100   000    Pre-fail  Always       -       64812', '', 'SMART Error Log 
Version: 0', 'No Errors Logged', '', 'SMART Self-test log structure revision number 1', 'Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error', '# 1  Short offline       Completed without error       00%      2376         -', '# 2  Short offline       Completed without error       00%      2376         -', '# 3  Short offline       Completed without error       00%      1501         -', '# 4  Short offline       Completed without error       00%      1499         -', '# 5  Extended offline    Aborted by host               10%      1499         -', '# 6  Short offline       Completed without error       00%      1498         -', '# 7  Short offline       Completed without error       00%      1498         -', '# 8  Short offline       Completed without error       00%      1498         -', '# 9  Short offline       Completed without error       00%      1454         -', '', 'Selective Self-tests/Logging not supported', '', '']

>>> line = '\t\t\t\t\tSelf-test supported.'

>>> test_capabilities = dict()

>>> if 'Self-test supported' in line:
...   test_capabilities['short'] = 'No' not in line
... 
>>> test_capabilities
{'short': True}
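
For reference, the same check can be extended to the whole capabilities block. The following is only a sketch and not pySMART's actual code: the function name `parse_test_capabilities` and the phrase-to-key mapping are assumptions made for illustration, in particular treating "Self-test supported" as covering both 'short' and 'long'. It matches each phrase exactly rather than by substring, because the text 'Self-test supported' also occurs inside the conveyance and selective lines.

```python
from typing import Dict, Iterable, Tuple

# Capability phrases as printed in the smartctl output quoted in this issue.
# The mapping to dictionary keys is an assumption made for this sketch.
_PHRASES: Dict[str, Tuple[str, ...]] = {
    "SMART execute Offline immediate": ("offline",),
    "Self-test supported": ("short", "long"),
    "Conveyance Self-test supported": ("conveyance",),
    "Selective Self-test supported": ("selective",),
}


def parse_test_capabilities(lines: Iterable[str]) -> Dict[str, bool]:
    """Hypothetical helper: derive a capabilities dict from smartctl output lines."""
    capabilities: Dict[str, bool] = {}
    for raw in lines:
        text = raw.strip().rstrip(".")
        # The first capability shares its line with the "(0x15)" flag value,
        # so keep only what follows the closing parenthesis.
        if ")" in text:
            text = text.rsplit(")", 1)[1].strip()
        supported = not text.startswith("No ")
        phrase = text if supported else text[len("No "):]
        # Exact lookup: "Conveyance Self-test supported" never matches the
        # plain "Self-test supported" entry.
        for key in _PHRASES.get(phrase, ()):
            capabilities[key] = supported
    return capabilities


sample = [
    "capabilities: \t\t\t (0x15) SMART execute Offline immediate.",
    "\t\t\t\t\tNo Auto Offline data collection support.",
    "\t\t\t\t\tAbort Offline collection upon new",
    "\t\t\t\t\tcommand.",
    "\t\t\t\t\tNo Offline surface scan supported.",
    "\t\t\t\t\tSelf-test supported.",
    "\t\t\t\t\tNo Conveyance Self-test supported.",
    "\t\t\t\t\tNo Selective Self-test supported.",
]
print(parse_test_capabilities(sample))
```

With the lines from my device this gives 'short' as True and 'conveyance' and 'selective' as False, which is what smartctl itself reports.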

ralequi commented 2 years ago

I think you are correct. It looks like there is an error in device.py. Let me double-check it and add some tests so that it stays verified in the long term.

Thank you for submitting this issue.

ralequi commented 2 years ago

Hi @tirolerstefan,

I've just pushed a fix to the master branch. Please check it and confirm that everything works.

Feel free to reopen this issue if something is still broken, or open a new one if you find something else. Thanks for taking the time to report this.

tirolerstefan commented 2 years ago

Thank you very much for this fast fix!

>>> from pySMART import Device

>>> dev = Device("/dev/sda")

>>> dev.test_capabilities
{'offline': True, 'short': True, 'long': True, 'conveyance': False, 'selective': False}

>>> dev.run_selftest("short")
(0, 'Self-test started successfully', 'Tue Oct 18 11:32:59 2022')
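
In case it helps others, this is a small sketch of how the returned tuple could be handled in a script. It assumes the shape seen in this thread, `(code, message, finish_time)`, with code 0 meaning the test was started and the finish time formatted like 'Tue Oct 18 11:32:59 2022'. The helper name `selftest_finish_time` is made up for illustration and is not part of pySMART. Parsing the day and month names relies on an English (C) locale.

```python
from datetime import datetime
from typing import Optional, Tuple

# Shape of the value returned by run_selftest(), as observed in this issue.
SelfTestResult = Tuple[int, str, Optional[str]]


def selftest_finish_time(result: SelfTestResult) -> Optional[datetime]:
    """Hypothetical helper: return the expected finish time, or raise on failure."""
    code, message, finish = result
    if code != 0:
        # Only codes 0 and 2 appear in this thread; any non-zero code is
        # treated as "test not started" here.
        raise RuntimeError(f"self-test not started (code {code}): {message.strip()}")
    if not finish:
        return None
    # Assumed timestamp format, based on the single example shown above.
    return datetime.strptime(finish, "%a %b %d %H:%M:%S %Y")


started = (0, "Self-test started successfully", "Tue Oct 18 11:32:59 2022")
print(selftest_finish_time(started))
```

The tuple I got before the fix, `(2, "Device sda does not support the 'short' test ", None)`, would raise a RuntimeError with that message.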

tirolerstefan commented 2 years ago

@ralequi, do you think it would be possible to tag this commit as something like 1.1.1?

ralequi commented 2 years ago

Version 1.2.0 is expected to be released when Python 3.11 finally comes out... But I agree that this is a bug whose fix could be released sooner. Let me check whether I can cherry-pick it.

ralequi commented 2 years ago

It's harder than I imagined, and I don't want to mess up the history...

Python 3.11 is due to be released on October 24, so I think we can wait for it.

tirolerstefan commented 2 years ago

OK, no problem. I will just use a copy of master in the meantime. Thanks!