Closed sempervictus closed 2 years ago
The HBA here is a very common 3008:
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
in a Supermicro chassis. So I think it's the drive firmware producing that output, not the controller curtailing it.
Hm, there might not be a ton to do about that if smartctl doesn't provide much, but those are a few more colon-delimited values I can have the parser look for and at least have it collect enabled status, health status, and temperature correctly. Seems like a rated lifetime calculation could be done, too.
I wonder if "Elements in grown defect list" is akin to reallocated sectors?
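For what it's worth, here is a rough sketch of what collecting those colon-delimited values could look like. It is not the zenpack's actual parser; the function names (`parse_colon_fields`, `summarize`) are made up for illustration, and the labels "SMART Health Status" and "Elements in grown defect list" are assumed to appear the way SCSI/SAS drives commonly print them, which any given drive may not do.

```python
import re


def parse_colon_fields(text):
    """Collect 'Label: value' lines from smartctl output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Skip section banners and lines without a label separator.
        if line.startswith("===") or ":" not in line:
            continue
        label, _, value = line.partition(":")
        # Later duplicates overwrite earlier ones (e.g. repeated
        # "SMART support is" lines), so the last value wins.
        fields[label.strip()] = value.strip()
    return fields


def _leading_int(value):
    """Return the integer at the start of a value such as '31 C', else None."""
    m = re.match(r"\s*(-?\d+)", value or "")
    return int(m.group(1)) if m else None


def summarize(fields):
    """Pull out the handful of values discussed above (labels are assumptions)."""
    return {
        "smart_enabled": fields.get("SMART support is", "").startswith("Enabled"),
        "health": fields.get("SMART Health Status"),
        "temperature_c": _leading_int(fields.get("Current Drive Temperature")),
        "grown_defects": _leading_int(fields.get("Elements in grown defect list")),
    }
```

Missing labels come back as `None` (or `False` for the enabled flag) rather than raising, so a drive that reports only part of the data still yields a usable record.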
Didn't realize the version of smartmontools on EL6 was so long in the tooth.
I don't think it's the version of the tools doing that; I've seen similar on some Arch Linux systems too (and we run tip for most things).
Confirmed it's not the tool, it's the disks - here's one with good output and bad output in the same chassis:
Additionally, QEMU disks produce limited output, which we probably want to "handle quietly" since it's a logical absurdity at this point:
# smartctl -iAH /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.76] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: QEMU
Product: QEMU HARDDISK
Revision: 2.5+
Compliance: SPC-3
User Capacity: 1,099,511,627,776 bytes [1.09 TB]
Logical block size: 512 bytes
LU is thin provisioned, LBPRZ=0
Serial number: 4d7e0d8b-ee66-4aef-ac60-bc52cd560403
Device type: disk
Local Time is: Sun Nov 14 06:57:22 2021 UTC
SMART support is: Unavailable - device lacks SMART capability.
=== START OF READ SMART DATA SECTION ===
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C
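One way to "handle quietly" could be to check the "SMART support is:" line before trying to parse anything else, and skip the device when it says Unavailable. A minimal sketch, assuming that line looks like the one in the QEMU output above; `smart_unavailable` is a hypothetical helper name, not something from the zenpack:

```python
def smart_unavailable(output):
    """Return True if smartctl reports that the device lacks SMART capability."""
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        # Only the "SMART support is" line matters here; all others are ignored.
        if sep and label.strip() == "SMART support is":
            if value.strip().startswith("Unavailable"):
                return True
    return False
```

A caller could use this to drop the device from modeling (or log at debug level) instead of raising a parse error, which also avoids graphing the meaningless 0 C temperature.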
Pull the latest and see if anything's better. Your smartctl examples have been really helpful since I don't have any SCSI/SAS gear around. Also, a friend has sent me a bunch of output from his various servers with SAS and NVMe that I'll be looking over.
Pulled and modeling - seems that everything is reading single-digit overall-health values now, but then again a lot of these systems are heavily used and not exactly brand-new.
Also this is now happening:
The name change is expected; including the -d field reported by --scan seemed to be the best way to keep indexed devices unique.
If the health score values being graphed don't appear to match what smartctl's saying, I'd definitely like to know.
Will keep an eye on those as well.
I think for the /dev/XdY disks, the -d auto can be erased. For the more complex ones, definitely want the param in there.
-d auto will now be hidden from the device title if present - 8d35ae409c564d4da5de3c28b852f4d04b68c235
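Roughly, the title rule described here could look like the sketch below. This is not the code from that commit; `device_title` is a hypothetical name, and the input is assumed to be one line of `smartctl --scan` output in the usual "device -d type # comment" shape.

```python
def device_title(scan_line):
    """Build a display title from one line of `smartctl --scan` output."""
    # Drop the trailing "# ..." comment that --scan appends.
    spec = scan_line.split("#", 1)[0].split()
    if not spec:
        return ""
    name, rest = spec[0], spec[1:]
    # "-d auto" adds nothing for plain /dev/XdY disks, so hide it;
    # any other -d value stays so indexed devices remain unique.
    if rest == ["-d", "auto"]:
        rest = []
    return " ".join([name] + rest)
```

With this rule a plain disk shows up under its device path alone, while something behind a RAID controller keeps its `-d` parameter in the title.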
I think we're good here - output is stable, and the remaining output issues can be tracked in #6
Some drives don't like to report full data with -iAH or the like, which causes the zenpack to fail in acquiring output from something like: