Closed mth309 closed 2 years ago
Just wanted to let you know that, depending on how this weekend goes, I might find time to write the code and send you a pull request to implement the above behavior. It should only be a few lines of code for each drive type (SAS/SATA). I wanted to post the request here first so you could see what I'm thinking and let me know if you disagree. Also, if you have an opinion on colorizing the test type instead of adding a new pass/fail field, or on whether the warning or critical color is the better fit, please let me know. If you want to implement the feature yourself rather than wait on me, by all means go for it. If you're not in a rush, I'll be adding it to my own server at some point and will send it to you afterwards. Thanks!
@mth309 Since I do not have SAS drives and you do, I will let you take the first pass at them.
@dak180 I just submitted a pull request for the SAS version of the code.
Currently the script collects the type of the last self-test run and how long ago it ran. It does not collect whether that self-test passed or failed, and I believe this is an important oversight. You could be running daily self-tests that all report bad LBAs or other failures, and you'd never know it with the current implementation.
I don't think another field needs to be added to the summary table for self-test status. Instead, I would recommend using the warning color or critical color on the 'last test type' field if the last self-test failed. The current script uses the critical color on the 'last test time' field if it exceeds the threshold time period, but it does not colorize the test type for any reason, so this seemed like a good use for color on that field.
Below is example JSON output for a SATA drive with several failing self-tests. I show the non-JSON version as well for ease of human reading, but from a script perspective, parsing the 'status.passed' JSON attribute as true/false seems to be the way to go for SATA.
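To make the SATA idea concrete, here is a minimal Python sketch (not the script's actual code) of reading 'status.passed' for the most recent self-test. It assumes the JSON from something like `smartctl --json -l selftest <device>` nests the entries under `ata_smart_self_test_log.standard.table` with the newest test first; the function name `last_selftest_passed` is made up for illustration.

```python
import json


def last_selftest_passed(smartctl_json: str):
    """Return True/False for the most recent self-test, or None if unknown.

    Assumption: entries live under ata_smart_self_test_log.standard.table
    and the first entry is the most recent test.
    """
    data = json.loads(smartctl_json)
    table = (
        data.get("ata_smart_self_test_log", {})
        .get("standard", {})
        .get("table", [])
    )
    if not table:
        # No self-test has been logged, so there is nothing to flag.
        return None
    # 'status.passed' is the true/false attribute discussed above.
    return table[0].get("status", {}).get("passed")
```

A result of `False` would be the trigger for coloring the 'last test type' field; `None` (no log entries) probably should not be colored as a failure.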
Unfortunately, smartctl does not support JSON output for SCSI drive self-test logs, so parsing test results from the non-JSON format would be a bit more complicated.
From experience, I would tell you that any value other than a hyphen in any of the final 4 positions should be considered a test failure. Usually all 4 fields change to a non-hyphen at the same time, but it's possible the drive might report a Key Code Qualifier (KCQ) in the final three fields without reporting an LBA where the error took place, or vice versa. In any case, if anything shows up in any of those 4 fields, it's worth flagging the last test in a bad color so the user knows to look into it.
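The hyphen rule above could be sketched in Python roughly as follows. This is only an illustration under assumptions: that each log row starts with `# N`, that the LBA is the last token before a bracketed group, and that the bracket holds the three KCQ fields. The function name `scsi_selftest_failed` is hypothetical, and the real row layout should be checked against actual drive output before relying on the regex.

```python
import re

# Assumed row shape: "# N <description and status> ... <LBA> [<SK> <ASC> <ASQ>]"
_ROW = re.compile(r"^#\s*\d+\s+.*\s(\S+)\s+\[\s*(\S+)\s+(\S+)\s+(\S+)\s*\]\s*$")


def scsi_selftest_failed(row: str):
    """Return True if any of the final 4 fields is not a hyphen.

    Returns None when the line does not look like a self-test row
    (for example a header line), so callers can skip it.
    """
    match = _ROW.match(row.strip())
    if match is None:
        return None
    # Groups are: LBA of first error, then the three KCQ fields.
    return any(field != "-" for field in match.groups())
```

Checking all four fields independently covers the case where a drive reports a KCQ without an LBA, or an LBA without a KCQ.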