nikimaxim / zbx-smartmonitor

Zabbix template for monitoring disk attributes
GNU General Public License v3.0
38 stars 14 forks

Second NVMe doesn't show up #9

Closed lllDez closed 3 years ago

lllDez commented 3 years ago

Hello!

Can you help me understand why the second NVMe disk doesn't show up in the script's output?

When I scan with the --scan-open option, I see this:

C:\Windows\system32>"C:\Program Files\smartmontools\bin\smartctl.exe" --scan-open
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/csmi3,2 -d ata # /dev/csmi3,2, ATA device
/dev/csmi3,3 -d ata # /dev/csmi3,3, ATA device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
/dev/nvme1 -d nvme # /dev/nvme1, NVMe device

But when I scan my disks with the script, nvme1 doesn't show up in the results:

C:\Windows\system32>powershell -NoProfile -ExecutionPolicy Bypass -File "C:\Zabbix\smartctl-storage-discovery.ps1"
{
  "data":[
    {
      "{#STORAGE.SN}":"BTYF91640xxxxxxxx",
      "{#STORAGE.MODEL}":"INTEL SSDSC2KB960G8",
      "{#STORAGE.NAME}":"/dev/csmi3,2",
      "{#STORAGE.CMD}":"/dev/csmi3,2 -data",
      "{#STORAGE.SMART}":"1",
      "{#STORAGE.TYPE}":"1"
    },
    {
      "{#STORAGE.SN}":"BTYF91640xxxxxxx",
      "{#STORAGE.MODEL}":"INTEL SSDSC2KB960G8",
      "{#STORAGE.NAME}":"/dev/csmi3,3",
      "{#STORAGE.CMD}":"/dev/csmi3,3 -data",
      "{#STORAGE.SMART}":"1",
      "{#STORAGE.TYPE}":"1"
    },
    {
      "{#STORAGE.SN}":"69B0Axxxxxxx",
      "{#STORAGE.MODEL}":"KCM51VUG800G",
      "{#STORAGE.NAME}":"/dev/nvme0",
      "{#STORAGE.CMD}":"/dev/nvme0 -dnvme",
      "{#STORAGE.SMART}":"1",
      "{#STORAGE.TYPE}":"2"
    }
  ]
}

nikimaxim commented 3 years ago

@lllDez Hello. Please send the output of these commands:

smartctl.exe -a /dev/nvme0 -d nvme
smartctl.exe -a /dev/nvme1 -d nvme
lllDez commented 3 years ago

Sorry to have bothered you, but it seems that smartctl just can't see the second NVMe drive through Intel VROC and shows the same SSD twice.

Here's the output, just in case:

C:\Program Files\smartmontools\bin>smartctl.exe -a /dev/nvme0 -d nvme
smartctl 7.2 2020-12-30 r5155 [x86_64-w64-mingw32-2016] (sf-7.2-1)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KCM51VUG800G
Serial Number:                      69B0A001TVSE
Firmware Version:                   0105
PCI Vendor/Subsystem ID:            0x1179
IEEE OUI Identifier:                0x8ce38e
Total NVM Capacity:                 800 166 076 416 [800 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               16
Local Time is:                      Wed Aug 11 20:26:02 2021 RTZST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x001f):   Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x007e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Resv Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         8192 Pages
Warning Comp. Temp. Threshold:      71 Celsius
Critical Comp. Temp. Threshold:     77 Celsius

Supported Power States
St  Op  Max     Active  Idle  RL  RT  WL  WT  Ent_Lat  Ex_Lat
0   +   19.80W  18.00W  -     0   0   0   0   0        0
1   +   17.60W  16.00W  -     0   0   1   1   0        0
2   +   15.40W  14.00W  -     0   0   2   2   0        0
3   +   12.10W  11.00W  -     1   1   3   3   0        0
4   +   9.90W   9.00W   -     2   2   4   4   0        0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        43 Celsius
Available Spare:                    100%
Available Spare Threshold:          11%
Percentage Used:                    1%
Data Units Read:                    41 959 968 [21,4 TB]
Data Units Written:                 31 326 749 [16,0 TB]
Host Read Commands:                 226 486 398
Host Write Commands:                438 956 234
Controller Busy Time:               1 647
Power Cycles:                       21
Power On Hours:                     9 194
Unsafe Shutdowns:                   10
Media and Data Integrity Errors:    0
Error Information Log Entries:      437
Warning Comp. Temperature Time:     0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 256 entries)
Num  ErrCount  SQId  CmdId   Status  PELoc  LBA  NSID  VS
0    437       0     0x0022  0xc004  0x004  -    16    -
1    436       0     0x0020  0xc004  0x004  -    15    -
2    435       0     0x001e  0xc004  0x004  -    14    -
3    434       0     0x001c  0xc004  0x004  -    13    -
4    433       0     0x001a  0xc004  0x004  -    12    -
5    432       0     0x0018  0xc004  0x004  -    11    -
6    431       0     0x0016  0xc004  0x004  -    10    -
7    430       0     0x0014  0xc004  0x004  -    9     -
8    429       0     0x0012  0xc004  0x004  -    8     -
9    428       0     0x0010  0xc004  0x004  -    7     -
10   427       0     0x000e  0xc004  0x004  -    6     -
11   426       0     0x000c  0xc004  0x004  -    5     -
12   425       0     0x000a  0xc004  0x004  -    4     -
13   424       0     0x0008  0xc004  0x004  -    3     -
14   423       0     0x0006  0xc004  0x004  -    2     -

C:\Program Files\smartmontools\bin>smartctl.exe -a /dev/nvme1 -d nvme
smartctl 7.2 2020-12-30 r5155 [x86_64-w64-mingw32-2016] (sf-7.2-1)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KCM51VUG800G
Serial Number:                      69B0A001TVSE
Firmware Version:                   0105
PCI Vendor/Subsystem ID:            0x1179
IEEE OUI Identifier:                0x8ce38e
Total NVM Capacity:                 800 166 076 416 [800 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               16
Local Time is:                      Wed Aug 11 20:26:16 2021 RTZST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x001f):   Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x007e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Resv Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         8192 Pages
Warning Comp. Temp. Threshold:      71 Celsius
Critical Comp. Temp. Threshold:     77 Celsius

Supported Power States
St  Op  Max     Active  Idle  RL  RT  WL  WT  Ent_Lat  Ex_Lat
0   +   19.80W  18.00W  -     0   0   0   0   0        0
1   +   17.60W  16.00W  -     0   0   1   1   0        0
2   +   15.40W  14.00W  -     0   0   2   2   0        0
3   +   12.10W  11.00W  -     1   1   3   3   0        0
4   +   9.90W   9.00W   -     2   2   4   4   0        0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        43 Celsius
Available Spare:                    100%
Available Spare Threshold:          11%
Percentage Used:                    1%
Data Units Read:                    41 959 968 [21,4 TB]
Data Units Written:                 31 326 753 [16,0 TB]
Host Read Commands:                 226 486 398
Host Write Commands:                438 956 376
Controller Busy Time:               1 647
Power Cycles:                       21
Power On Hours:                     9 194
Unsafe Shutdowns:                   10
Media and Data Integrity Errors:    0
Error Information Log Entries:      437
Warning Comp. Temperature Time:     0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 256 entries)
Num  ErrCount  SQId  CmdId   Status  PELoc  LBA  NSID  VS
0    437       0     0x0022  0xc004  0x004  -    16    -
1    436       0     0x0020  0xc004  0x004  -    15    -
2    435       0     0x001e  0xc004  0x004  -    14    -
3    434       0     0x001c  0xc004  0x004  -    13    -
4    433       0     0x001a  0xc004  0x004  -    12    -
5    432       0     0x0018  0xc004  0x004  -    11    -
6    431       0     0x0016  0xc004  0x004  -    10    -
7    430       0     0x0014  0xc004  0x004  -    9     -
8    429       0     0x0012  0xc004  0x004  -    8     -
9    428       0     0x0010  0xc004  0x004  -    7     -
10   427       0     0x000e  0xc004  0x004  -    6     -
11   426       0     0x000c  0xc004  0x004  -    5     -
12   425       0     0x000a  0xc004  0x004  -    4     -
13   424       0     0x0008  0xc004  0x004  -    3     -
14   423       0     0x0006  0xc004  0x004  -    2     -

nikimaxim commented 3 years ago

@lllDez Yes. You have two outputs for the same device. The script (smartctl-storage-discovery) removes duplicate devices by SN (Serial Number: 69B0A001TVSE).
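
For readers following along, the behaviour described above can be illustrated with a minimal Python sketch. This is not the code of smartctl-storage-discovery.ps1 (which is PowerShell); the function names, the (name, output) pair shape, and the choice to skip devices without a serial number are all assumptions made for the example. It only shows why /dev/nvme1 disappears when both device names report the same serial number.

```python
import re


def parse_serial(smartctl_output):
    """Return the value of the 'Serial Number:' line, or None if absent."""
    match = re.search(r"^Serial Number:\s*(\S+)", smartctl_output, re.MULTILINE)
    return match.group(1) if match else None


def dedupe_by_serial(devices):
    """Keep only the first device seen for each serial number.

    `devices` is a list of (name, smartctl_output) pairs. This shape is a
    hypothetical choice for the example, not the real script's data model.
    """
    seen = set()
    unique = []
    for name, output in devices:
        serial = parse_serial(output)
        # Assumption: a device with no serial, or a repeated serial, is dropped.
        if serial is None or serial in seen:
            continue
        seen.add(serial)
        unique.append((name, serial))
    return unique


# Trimmed-down versions of the two outputs posted in this thread.
devices = [
    ("/dev/nvme0", "Model Number: KCM51VUG800G\nSerial Number: 69B0A001TVSE\n"),
    ("/dev/nvme1", "Model Number: KCM51VUG800G\nSerial Number: 69B0A001TVSE\n"),
]

print(dedupe_by_serial(devices))  # [('/dev/nvme0', '69B0A001TVSE')]
```

Because both names resolve to serial 69B0A001TVSE, only the first one survives, which matches the discovery result shown earlier in the thread.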