What telegraf version are you using?
I am also having a problem with a disk not appearing in the telegraf output
› sudo smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/sde -d scsi # /dev/sde, SCSI device
/dev/sdf -d scsi # /dev/sdf, SCSI device
/dev/sdg -d scsi # /dev/sdg, SCSI device
/dev/sdh -d scsi # /dev/sdh, SCSI device
/dev/sdi -d scsi # /dev/sdi, SCSI device
/dev/sdj -d scsi # /dev/sdj, SCSI device
/dev/sdk -d scsi # /dev/sdk, SCSI device
/dev/sdl -d scsi # /dev/sdl, SCSI device
/dev/sdm -d scsi # /dev/sdm, SCSI device
/dev/sdn -d scsi # /dev/sdn, SCSI device
› sudo smartctl --info --attributes --health -n standby --format=brief /dev/sdg -d scsi
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-145-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LB provisioning type: unreported, LBPME=0, LBPRZ=0
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca250ec3c9c
Serial number: PK1334PEK49SBS
Device type: disk
Local Time is: Wed Apr 17 12:21:23 2019 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 34 C
› sudo telegraf --test --input-filter smart
2019-04-17T19:21:38Z I! Starting Telegraf 1.10.2
2019-04-17T19:21:38Z I! Using config file: /etc/telegraf/telegraf.conf
> smart_device,capacity=525112713216,device=sdj,enabled=Enabled,host=cortex,model=Crucial_CT525MX300SSD1,serial_no=16431465A85A,wwn=500a07511465a85a exit_status=0i,health_ok=true,read_error_rate=2i,temp_c=36i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=525112713216,device=sdk,enabled=Enabled,host=cortex,model=Crucial_CT525MX300SSD1,serial_no=1651150FA577,wwn=500a0751150fa577 exit_status=0i,health_ok=true,read_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=4000787030016,device=sde,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4EM0WN624,wwn=50014ee2b51b9d7f exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=30i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=4000787030016,device=sdn,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4EECRN58H,wwn=50014ee20a98bd99 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=37i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=4000787030016,device=sdl,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4E4FKJ5DV,wwn=50014ee25fc65114 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=29i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=4000787030016,device=sdb,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4EK8ZSK37,wwn=50014ee2b51c8ebd exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=31i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=4000787030016,device=sdm,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4E4FKJH1X,wwn=50014ee20a70d5a0 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=29i,udma_crc_errors=0i 1555528899000000000
> smart_device,capacity=4000787030016,device=sdf,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK2334PEJM9B3T,wwn=5000cca250e4f530 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=34i,udma_crc_errors=0i 1555528900000000000
> smart_device,capacity=4000787030016,device=sdc,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK2334PEK4AXTT,wwn=5000cca250ec4105 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=34i,udma_crc_errors=0i 1555528900000000000
> smart_device,capacity=4000787030016,device=sda,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEKDXVTS,wwn=5000cca250f02751 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=32i,udma_crc_errors=0i 1555528900000000000
> smart_device,capacity=4000787030016,device=sdh,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEJLL6NS,wwn=5000cca250e4a210 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=31i,udma_crc_errors=0i 1555528900000000000
> smart_device,capacity=4000787030016,device=sdd,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEKDNZ0S,wwn=5000cca250f009ad exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=35i,udma_crc_errors=0i 1555528900000000000
> smart_device,capacity=32017047552,device=sdi,enabled=Enabled,host=cortex,model=SATA\ SSD,serial_no=AF3407621C2400203590 exit_status=0i,health_ok=true,read_error_rate=0i,temp_c=30i 1555528903000000000
Note that /dev/sdg is listed in smartctl --scan and reports data with sudo smartctl --info --attributes --health -n standby --format=brief /dev/sdg -d scsi, but does not appear in the output of sudo telegraf --test --input-filter smart using Telegraf 1.10.2 (git: HEAD 3303f5c3).
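For anyone else comparing the two outputs by eye, here is a minimal Python sketch that lists devices present in the scan but absent from the metrics. It assumes both outputs have been saved as text; the function name `missing_devices` is illustrative and not part of Telegraf or smartmontools.

```python
import re

def missing_devices(scan_output, telegraf_output):
    """Return scanned device names that have no smart_device metric."""
    scanned = []
    for line in scan_output.splitlines():
        # Everything after '#' in a scan line is a comment.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0].startswith("/dev/"):
            scanned.append(fields[0][len("/dev/"):])
    # Collect the device tag from each smart_device line-protocol metric.
    reported = set(
        re.findall(r"^> smart_device,.*?\bdevice=([^,\s]+)", telegraf_output, flags=re.M)
    )
    return [name for name in scanned if name not in reported]
```

Fed the `smartctl --scan` and `telegraf --test` output pasted above, this should report only sdg.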
In my environment, I'm seeing this behavior specifically with SAS drives. SATA drives on the same HBA are fine.
Can you try this linux amd64 build and run it with --debug? I'd love to find where the failure is. Thanks!
Telegraf unknown (git: bugfix/5740 85b8a490)
2019-04-17T22:01:28Z I! Starting Telegraf
2019-04-17T22:01:28Z I! Using config file: /etc/telegraf/telegraf.conf
2019-04-17T22:01:28Z D! [inputs.smart] adding device: []string{"/dev/sda", "-d", "scsi", "#", "/dev/sda,", "SCSI", "device"}
2019-04-17T22:01:28Z D! [inputs.smart] adding device: []string{"/dev/sdb", "-d", "scsi", "#", "/dev/sdb,", "SCSI", "device"}
2019-04-17T22:01:28Z D! [inputs.smart] adding device: []string{"/dev/sdc", "-d", "scsi", "#", "/dev/sdc,", "SCSI", "device"}
2019-04-17T22:01:28Z D! [inputs.smart] skipping device: []string{""}
2019-04-17T22:01:28Z D! [inputs.smart] devices: []string{"/dev/sda", "/dev/sdb", "/dev/sdc"}
2019-04-17T22:01:28Z D! [inputs.smart] gatherDisk '/dev/sdb' output: "smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-12-pve] (local build)\nCopyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org\n\n=== START OF INFORMATION SECTION ===\nVendor: HITACHI\nProduct: HUC103030CSS600\nRevision: J350\nCompliance: SPC-4\nUser Capacity: 300,000,000,000 bytes [300 GB]\nLogical block size: 512 bytes\nRotation Rate: 10020 rpm\nForm Factor: 2.5 inches\nLogical Unit id: 0x5000cca00a4f91bc\nSerial number: PDWDSKNE\nDevice type: disk\nTransport protocol: SAS (SPL-3)\nLocal Time is: Wed Apr 17 15:01:28 2019 PDT\nSMART support is: Available - device has SMART capability.\nSMART support is: Enabled\nTemperature Warning: Disabled or Not Supported\n\n=== START OF READ SMART DATA SECTION ===\nSMART Health Status: OK\n\nCurrent Drive Temperature: 35 C\nDrive Trip Temperature: 85 C\n\nManufactured in week 52 of year 2009\nSpecified cycle count over device lifetime: 50000\nAccumulated start-stop cycles: 47\nElements in grown defect list: 0\n\nVendor (Seagate) cache information\n Blocks sent to initiator= 7601969522802688\n\n"
> smart_device,capacity=300000000000,device=sdb,enabled=Enabled,host=pve-1 exit_status=0i 1555538488000000000
2019-04-17T22:01:28Z D! [inputs.smart] gatherDisk '/dev/sda' output: "smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-12-pve] (local build)\nCopyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org\n\n=== START OF INFORMATION SECTION ===\nVendor: HITACHI\nProduct: HUC103030CSS600\nRevision: J350\nCompliance: SPC-4\nUser Capacity: 300,000,000,000 bytes [300 GB]\nLogical block size: 512 bytes\nRotation Rate: 10020 rpm\nForm Factor: 2.5 inches\nLogical Unit id: 0x5000cca00a4bdbc8\nSerial number: PDWAR9GE\nDevice type: disk\nTransport protocol: SAS (SPL-3)\nLocal Time is: Wed Apr 17 15:01:28 2019 PDT\nSMART support is: Available - device has SMART capability.\nSMART support is: Enabled\nTemperature Warning: Disabled or Not Supported\n\n=== START OF READ SMART DATA SECTION ===\nSMART Health Status: OK\n\nCurrent Drive Temperature: 36 C\nDrive Trip Temperature: 85 C\n\nManufactured in week 52 of year 2009\nSpecified cycle count over device lifetime: 50000\nAccumulated start-stop cycles: 47\nElements in grown defect list: 0\n\nVendor (Seagate) cache information\n Blocks sent to initiator= 7270983270400000\n\n"
> smart_device,capacity=300000000000,device=sda,enabled=Enabled,host=pve-1 exit_status=0i 1555538488000000000
2019-04-17T22:01:28Z D! [inputs.smart] gatherDisk '/dev/sdc' output: "smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-12-pve] (local build)\nCopyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org\n\n=== START OF INFORMATION SECTION ===\nModel Family: Samsung based SSDs\nDevice Model: Samsung SSD 850 PRO 256GB\nSerial Number: S39KNX0J718036J\nLU WWN Device Id: 5 002538 d4218a3df\nFirmware Version: EXM04B6Q\nUser Capacity: 256,060,514,304 bytes [256 GB]\nSector Size: 512 bytes logical/physical\nRotation Rate: Solid State Device\nForm Factor: 2.5 inches\nDevice is: In smartctl database [for details use: -P show]\nATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4c\nSATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)\nLocal Time is: Wed Apr 17 15:01:28 2019 PDT\nSMART support is: Available - device has SMART capability.\nSMART support is: Enabled\nPower mode is: ACTIVE or IDLE\n\n=== START OF READ SMART DATA SECTION ===\nSMART overall-health self-assessment test result: PASSED\n\nSMART Attributes Data Structure revision number: 1\nVendor Specific SMART Attributes with Thresholds:\nID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE\n 5 Reallocated_Sector_Ct PO--CK 100 100 010 - 0\n 9 Power_On_Hours -O--CK 097 097 000 - 14738\n 12 Power_Cycle_Count -O--CK 099 099 000 - 47\n177 Wear_Leveling_Count PO--C- 086 086 000 - 893\n179 Used_Rsvd_Blk_Cnt_Tot PO--C- 100 100 010 - 0\n181 Program_Fail_Cnt_Total -O--CK 100 100 010 - 0\n182 Erase_Fail_Count_Total -O--CK 100 100 010 - 0\n183 Runtime_Bad_Block PO--C- 100 100 010 - 0\n187 Uncorrectable_Error_Cnt -O--CK 100 100 000 - 0\n190 Airflow_Temperature_Cel -O--CK 069 052 000 - 31\n195 ECC_Error_Rate -O-RC- 200 200 000 - 0\n199 CRC_Error_Count -OSRCK 100 100 000 - 0\n235 POR_Recovery_Count -O--C- 099 099 000 - 35\n241 Total_LBAs_Written -O--CK 099 099 000 - 36452846103\n ||||||_ K auto-keep\n |||||__ C event count\n ||||___ R error rate\n |||____ S speed/performance\n ||_____ O updated online\n |______ P prefailure warning\n\n"
> smart_attribute,device=sdc,fail=-,flags=PO--CK,host=pve-1,id=5,name=Reallocated_Sector_Ct,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=10i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=9,name=Power_On_Hours,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=14738i,threshold=0i,value=97i,worst=97i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=12,name=Power_Cycle_Count,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=47i,threshold=0i,value=99i,worst=99i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=PO--C-,host=pve-1,id=177,name=Wear_Leveling_Count,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=893i,threshold=0i,value=86i,worst=86i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=PO--C-,host=pve-1,id=179,name=Used_Rsvd_Blk_Cnt_Tot,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=10i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=181,name=Program_Fail_Cnt_Total,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=10i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=182,name=Erase_Fail_Count_Total,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=10i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=PO--C-,host=pve-1,id=183,name=Runtime_Bad_Block,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=10i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=187,name=Uncorrectable_Error_Cnt,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=0i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=190,name=Airflow_Temperature_Cel,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=31i,threshold=0i,value=69i,worst=52i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O-RC-,host=pve-1,id=195,name=ECC_Error_Rate,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=0i,value=200i,worst=200i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-OSRCK,host=pve-1,id=199,name=CRC_Error_Count,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=0i,threshold=0i,value=100i,worst=100i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--C-,host=pve-1,id=235,name=POR_Recovery_Count,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=35i,threshold=0i,value=99i,worst=99i 1555538488000000000
> smart_attribute,device=sdc,fail=-,flags=-O--CK,host=pve-1,id=241,name=Total_LBAs_Written,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,raw_value=36452846103i,threshold=0i,value=99i,worst=99i 1555538488000000000
> smart_device,capacity=256060514304,device=sdc,enabled=Enabled,host=pve-1,model=Samsung\ SSD\ 850\ PRO\ 256GB,serial_no=S39KNX0J718036J,wwn=5002538d4218a3df exit_status=0i,health_ok=true,udma_crc_errors=0i 1555538488000000000
@chrishoage Can you paste the output of
sudo smartctl --info --health --attributes --tolerance=verypermissive -n standby --format=brief /dev/sdg
@vvershkov Can you paste the output of the same, but with /dev/sdc instead of /dev/sdg?
Thanks @ddimick. I assume a and b are your SAS drives?
› sudo smartctl --info --health --attributes --tolerance=verypermissive -n standby --format=brief /dev/sdg
[sudo] password for chris:
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-145-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: HGST Deskstar NAS
Device Model: HGST HDN724040ALE640
Serial Number: PK1334PEK49SBS
LU WWN Device Id: 5 000cca 250ec3c9c
Firmware Version: MJAOA5E0
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Apr 17 15:14:27 2019 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Power mode is: ACTIVE or IDLE
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 100 100 016 - 0
2 Throughput_Performance P-S--- 135 135 054 - 84
3 Spin_Up_Time POS--- 125 125 024 - 621 (Average 619)
4 Start_Stop_Count -O--C- 100 100 000 - 33
5 Reallocated_Sector_Ct PO--CK 100 100 005 - 0
7 Seek_Error_Rate PO-R-- 100 100 067 - 0
8 Seek_Time_Performance P-S--- 119 119 020 - 35
9 Power_On_Hours -O--C- 098 098 000 - 19371
10 Spin_Retry_Count PO--C- 100 100 060 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 33
192 Power-Off_Retract_Count -O--CK 100 100 000 - 764
193 Load_Cycle_Count -O--C- 100 100 000 - 764
194 Temperature_Celsius -O---- 176 176 000 - 34 (Min/Max 21/53)
196 Reallocated_Event_Count -O--CK 100 100 000 - 0
197 Current_Pending_Sector -O---K 100 100 000 - 0
198 Offline_Uncorrectable ---R-- 100 100 000 - 0
199 UDMA_CRC_Error_Count -O-R-- 200 200 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
I assume a and b are your SAS drives?
Yes, that's correct.
Thanks. @chrishoage, can you also paste the output of the same command, but with a disk that is being collected (anything other than /dev/sdg)?
› sudo smartctl --info --health --attributes --tolerance=verypermissive -n standby --format=brief /dev/sdh
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-145-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: HGST Deskstar NAS
Device Model: HGST HDN724040ALE640
Serial Number: PK1334PEJLL6NS
LU WWN Device Id: 5 000cca 250e4a210
Firmware Version: MJAOA5E0
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Apr 17 16:27:58 2019 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Power mode is: ACTIVE or IDLE
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 100 100 016 - 0
2 Throughput_Performance P-S--- 136 136 054 - 83
3 Spin_Up_Time POS--- 125 125 024 - 621 (Average 617)
4 Start_Stop_Count -O--C- 100 100 000 - 28
5 Reallocated_Sector_Ct PO--CK 100 100 005 - 0
7 Seek_Error_Rate PO-R-- 100 100 067 - 0
8 Seek_Time_Performance P-S--- 124 124 020 - 33
9 Power_On_Hours -O--C- 098 098 000 - 19322
10 Spin_Retry_Count PO--C- 100 100 060 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 28
192 Power-Off_Retract_Count -O--CK 100 100 000 - 30
193 Load_Cycle_Count -O--C- 100 100 000 - 30
194 Temperature_Celsius -O---- 187 187 000 - 32 (Min/Max 23/55)
196 Reallocated_Event_Count -O--CK 100 100 000 - 0
197 Current_Pending_Sector -O---K 100 100 000 - 0
198 Offline_Uncorrectable ---R-- 100 100 000 - 0
199 UDMA_CRC_Error_Count -O-R-- 200 200 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
Hm, I didn't think about it, but yep, those are SAS drives.
My Telegraf version is 1.10.0-1, but I can update it to 1.10.3 (I am using Ubuntu 18.04 and the InfluxData repo).
smartctl output:
# smartctl --info --health --attributes --tolerance=verypermissive -n standby --format=brief /dev/sdg
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-46-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUH721212AL5204
Revision: C3Q1
Compliance: SPC-4
User Capacity: 12,000,138,625,024 bytes [12.0 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca27076bfe8
Serial number: 8HJ39K3H
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Thu Apr 18 13:25:03 2019 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 34 C
Drive Trip Temperature: 85 C
Manufactured in week 35 of year 2018
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 7
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 39
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 544135446528
(The same goes for sdc; this machine has 60 drives, from sdc to sdbj.) sda and sdb are SATA drives, and I can get their status via Telegraf.
# smartctl --info --health --attributes --tolerance=verypermissive -n standby --format=brief /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-46-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi/HGST Travelstar Z7K500
Device Model: HGST HTE725050A7E630
Serial Number: RCE50G20G81S9S
LU WWN Device Id: 5 000cca 90bc3a98b
Firmware Version: GS2OA3E0
User Capacity: 500,107,862,016 bytes [500 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 2.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Apr 18 13:27:51 2019 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Power mode is: ACTIVE or IDLE
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 100 100 062 - 0
2 Throughput_Performance P-S--- 100 100 040 - 0
3 Spin_Up_Time POS--- 100 100 033 - 1
4 Start_Stop_Count -O--C- 100 100 000 - 4
5 Reallocated_Sector_Ct PO--CK 100 100 005 - 0
7 Seek_Error_Rate PO-R-- 100 100 067 - 0
8 Seek_Time_Performance P-S--- 100 100 040 - 0
9 Power_On_Hours -O--C- 099 099 000 - 743
10 Spin_Retry_Count PO--C- 100 100 060 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 4
191 G-Sense_Error_Rate -O-R-- 100 100 000 - 0
192 Power-Off_Retract_Count -O--CK 100 100 000 - 2
193 Load_Cycle_Count -O--C- 100 100 000 - 13
194 Temperature_Celsius -O---- 250 250 000 - 24 (Min/Max 15/29)
196 Reallocated_Event_Count -O--CK 100 100 000 - 0
197 Current_Pending_Sector -O---K 100 100 000 - 0
198 Offline_Uncorrectable ---R-- 100 100 000 - 0
199 UDMA_CRC_Error_Count -O-R-- 200 200 000 - 0
223 Load_Retry_Count -O-R-- 100 100 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
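Comparing the SAS and SATA outputs in this thread, the identity fields use different labels ("Product" vs "Device Model", "Serial number" vs "Serial Number", "Logical Unit id" vs "LU WWN Device Id"), and the debug log shows the SAS drives coming back without model, serial_no, or wwn tags. One plausible reading is that the parser only knows the ATA labels. Below is a minimal Python sketch of a parser that accepts both layouts. It is not the plugin's actual code; `FIELD_LABELS` and `parse_info` are hypothetical names, and the WWN normalization (dropping spaces and a leading 0x) is an assumption chosen to match the wwn tags Telegraf prints for SATA drives.

```python
# Labels as they appear in the ATA and SCSI/SAS outputs quoted in this thread.
FIELD_LABELS = {
    "model": ("Device Model", "Product"),
    "serial_no": ("Serial Number", "Serial number"),
    "wwn": ("LU WWN Device Id", "Logical Unit id"),
}

def parse_info(output):
    """Extract model, serial_no and wwn from smartctl --info output."""
    tags = {}
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a "label: value" line
        label, value = label.strip(), value.strip()
        for tag, labels in FIELD_LABELS.items():
            if label in labels:
                tags[tag] = value
    if "wwn" in tags:
        # Assumed normalization: "5 000cca 250e4a210" and "0x5000cca2..." both
        # become a bare hex string.
        wwn = tags["wwn"].replace(" ", "")
        if wwn.lower().startswith("0x"):
            wwn = wwn[2:]
        tags["wwn"] = wwn
    return tags
```

With the /dev/sdh output above this yields the same model, serial_no, and wwn values that Telegraf already reports for that disk, and with the SAS output it yields the HUH721212AL5204 product and its serial number instead of nothing.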
Feature Request
The SMART input plugin can't read some disks
Proposal:
Use smartctl -H for disk status
Current behavior:
No info about Hitachi disks at all.
Desired behavior:
At a minimum, I want the overall SMART status.
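To illustrate the proposal, here is a minimal Python sketch of reading the overall status from `smartctl -H`. It assumes the two health lines seen in this thread ("SMART overall-health self-assessment test result: PASSED" for SATA and "SMART Health Status: OK" for SAS) are the only formats to handle, which may not hold for every drive. The names `parse_health` and `smart_health` are made up for this example and are not part of Telegraf.

```python
import re
import subprocess

# Matches the ATA and the SCSI/SAS health line as quoted in this thread.
HEALTH_RE = re.compile(
    r"^SMART (?:overall-health self-assessment test result|Health Status):\s*(\S+)",
    re.M,
)

def parse_health(output):
    """Return True/False for a recognized health line, None if there is none."""
    match = HEALTH_RE.search(output)
    if match is None:
        return None
    return match.group(1) in ("PASSED", "OK")

def smart_health(device):
    """Run smartctl -H on a device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

Returning None rather than False when no health line is found keeps "drive unhealthy" distinct from "output not understood", which is the situation described here.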
The SMART output looks like this:
And with -H I get standard output: