FOGProject / fos

FOG Operating System

Add NVMe drive inventory information #72

Closed Sebastian-Roth closed 11 months ago

Sebastian-Roth commented 1 year ago

So far, inventory information for regular HDD and SSD drives is retrieved properly using hdparm: https://github.com/FOGProject/fos/blob/8b299922a047946b3a33364d12575c763292b85a/Buildroot/board/FOG/FOS/rootfs_overlay/usr/share/fog/lib/funcs.sh#L106

We should be able to get the same information for NVMe drives using the nvme CLI tool or smartctl.

Forum topic: https://forums.fogproject.org/topic/16740/host-hardware-inventory-hard-disk-model-m-2-nvme-not-identify
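
The idea above (hdparm for SATA/ATA disks, the nvme CLI or smartctl for NVMe drives) could be sketched as a small dispatch on the device node name. This is only an illustration: the function name `inventory_tool_for` is hypothetical and not part of funcs.sh, and it assumes NVMe namespaces show up as /dev/nvme* nodes.

```shell
#!/bin/sh
# Hypothetical helper (not from funcs.sh): pick an inventory tool
# based on the device node name.
inventory_tool_for() {
    case "$1" in
        /dev/nvme*) echo "nvme" ;;    # NVMe namespaces, e.g. /dev/nvme0n1
        *)          echo "hdparm" ;;  # everything else, e.g. /dev/sda
    esac
}

inventory_tool_for /dev/nvme0n1   # prints: nvme
inventory_tool_for /dev/sda       # prints: hdparm
```

Matching on the name keeps the check cheap, but it says nothing about whether the chosen tool is actually installed in the image.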

AlexPdx007 commented 1 year ago

hdparm -i /dev/nvme0n1
/dev/nvme0n1: HDIO_GET_IDENTITY failed: Inappropriate ioctl for device

And:

smartctl --info /dev/nvme0n1 | grep Model

returned:

Model Number: KINGSTON SNV2S250G
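
Building on the grep above, a single field value could be pulled out of the smartctl text like this. It is a minimal sketch: `smartctl_field` is a hypothetical helper name, and it assumes each field sits on its own line in `Label: value` form, as in the output shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl --info` text on stdin and print
# the value of the first line that starts with "<label>:".
smartctl_field() {
    awk -v label="$1" '
        index($0, label ":") == 1 {
            value = substr($0, length(label) + 2)   # text after the colon
            gsub(/^[ \t]+|[ \t]+$/, "", value)      # trim padding
            print value
            exit
        }'
}

# Sample input taken from the output posted in this thread.
smartctl_field "Model Number" <<'EOF'
Model Number:                       KINGSTON SNV2S250G
Serial Number:                      50026B7685F13C1E
Firmware Version:                   ELFK0S.4
EOF
# prints: KINGSTON SNV2S250G
```

On a real host the input would come from something like `smartctl --info /dev/nvme0n1 | smartctl_field "Serial Number"`.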

I'm still connected with SSH on the host and await further instructions :-) ...just tell me what to type and your wish is my command :)

mastacontrola commented 1 year ago

Should be able to do something a little complex, but single line:

hdinfo=$(hdparm -i $hd 2>/dev/null || nvme id-ctrl $hd | awk  '/mn[ ]+:/ {split($0, model, ": "); modelno = model[2]} /sn[ ]+:/ {split($0, serial, ": "); serialno = serial[2]} /fr[ ]+:/ {split($0, firmware, ": "); fwrev = firmware[2]} END {gsub("^[[:space:]]+|[[:space:]]+$", "", modelno);gsub("^[[:space:]]+|[[:space:]]+$", "", fwrev);gsub("^[[:space:]]+|[[:space:]]+$","",serialno);print "model="modelno",fwrev="fwrev",serialno="serialno}')

This should present the data in a format similar to hdparm's, though I'd love confirmation if at all possible (from a machine with a SATA/ATA device available).

Just trying to make things a little easier in general and commonize the output where possible.
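
For readability, the awk part of the one-liner could also be written as a function that reads `nvme id-ctrl` text on stdin, which makes it testable without an NVMe drive. This is a sketch, not the code that ended up in FOS: `parse_nvme_id_ctrl` is a hypothetical name, and unlike the one-liner it anchors the field names at the start of the line, assuming the `name : value` layout shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `nvme id-ctrl` text on stdin and print
# model, firmware revision and serial number on one line.
parse_nvme_id_ctrl() {
    awk '
        # Return the text after the first colon, with padding trimmed.
        function value(line) {
            sub(/^[^:]*:[ \t]*/, "", line)
            sub(/[ \t]+$/, "", line)
            return line
        }
        /^[ \t]*mn[ \t]*:/ { model  = value($0) }
        /^[ \t]*sn[ \t]*:/ { serial = value($0) }
        /^[ \t]*fr[ \t]*:/ { fw     = value($0) }
        END { print "model=" model ",fwrev=" fw ",serialno=" serial }
    '
}

# Sample input taken from the id-ctrl output posted in this thread.
parse_nvme_id_ctrl <<'EOF'
vid       : 0x2646
sn        : 50026B7685F13C1E
mn        : KINGSTON SNV2S250G
fr        : ELFK0S.4
EOF
# prints: model=KINGSTON SNV2S250G,fwrev=ELFK0S.4,serialno=50026B7685F13C1E
```

On a real host this would be fed with `nvme id-ctrl "$hd" | parse_nvme_id_ctrl`, where `$hd` must be set to the device node.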

AlexPdx007 commented 1 year ago

I get this:

d-ctrl: Invalid argument
Send an Identify Controller command to the given device and report information about the specified controller in human-readable or binary format. May also return vendor-specific controller attributes in hex-dump if requested.
  [ --vendor-specific, -v ] --- dump binary vendor field
  [ --output-format=, -o ] --- Output format: normal|json|binary
  [ --raw-binary, -b ] --- show identify in binary format
  [ --human-readable, -H ] --- show identify in readable format

mastacontrola commented 1 year ago

Seems you typed the id-ctrl part?

I'm just solutioning on the fly.

What I'm looking for is actual output of hdparm -i /dev/sda (or similar) so I know how to formulate the nvme awk output.

AlexPdx007 commented 1 year ago

I must apologize because I'm a noob at Linux commands :( ...but I'm taking it step by step and doing my best, so:

nvme list

Node          SN                Model               Namespace  Usage                  Format       FW Rev
/dev/nvme0n1  50026B7685F13C1E  KINGSTON SNV2S250G             250.06 GB / 250.06 GB  512 B + 0 B  ELFK0S.4

nvme id-ctrl /dev/nvme0n1

NVME Identify Controller:
vid : 0x2646
ssvid : 0x2646
sn : 50026B7685F13C1E
mn : KINGSTON SNV2S250G
fr : ELFK0S.4
rab : 4
ieee : 0026b7
cmic : 0
mdts : 6
cntlid : 0
ver : 0x10400
rtd3r : 0x186a0
rtd3e : 0x4c4b40
oaes : 0
ctratt : 0x2
rrls : 0
cntrltype : 1
fguid :
crdt1 : 0
crdt2 : 0
crdt3 : 0
oacs : 0x17
acl : 3
aerl : 3
frmw : 0x12
lpa : 0x1e
elpe : 62
npss : 4
avscc : 0x1
apsta : 0x1
wctemp : 350
cctemp : 352
mtfa : 100
hmpre : 16384
hmmin : 16384
tnvmcap : 250059350016
unvmcap : 0
rpmbs : 0
edstt : 30
dsto : 0
fwug : 4
kas : 0
hctma : 0x1
mntmt : 273
mxtmt : 350
sanicap : 0xa0000002
hmminds : 1024
hmmaxd : 16
nsetidmax : 0
endgidmax : 0
anatt : 0
anacap : 0
anagrpmax : 0
nanagrpid : 0
pels : 96
sqes : 0x66
cqes : 0x44
maxcmd : 256
nn : 1
oncs : 0x57
fuses : 0
fna : 0
vwc : 0x7
awun : 255
awupf : 0
nvscc : 1
nwpc : 0
acwu : 0
sgls : 0
mnan : 0
subnqn : nqn.2020-04.com.kingston:nvme:nvm-subsystem-sn-50026B7685F13C1E
ioccsz : 0
iorcsz : 0
icdoff : 0
ctrattr : 0
msdbd : 0
ps 0 : mp:5.00W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:2.40W operational enlat:0 exlat:0 rrt:1 rrl:1 rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:1.90W operational enlat:0 exlat:0 rrt:2 rrl:2 rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0500W non-operational enlat:3000 exlat:2000 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0035W non-operational enlat:10000 exlat:40000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:- active_power:-

nvme id-ctrl /dev/nvme0n1 --human-readable

NVME Identify Controller:
vid : 0x2646
ssvid : 0x2646
sn : 50026B7685F13C1E
mn : KINGSTON SNV2S250G
fr : ELFK0S.4
rab : 4
ieee : 0026b7
cmic : 0
  [3:3] : 0 ANA not supported
  [2:2] : 0 PCI
  [1:1] : 0 Single Controller
  [0:0] : 0 Single Port

mdts : 6
cntlid : 0
ver : 0x10400
rtd3r : 0x186a0
rtd3e : 0x4c4b40
oaes : 0
  [14:14] : 0 Endurance Group Event Aggregate Log Page Change Notice Not Supported
  [13:13] : 0 LBA Status Information Notices Not Supported
  [12:12] : 0 Predictable Latency Event Aggregate Log Change Notices Not Supported
  [11:11] : 0 Asymmetric Namespace Access Change Notices Not Supported
  [9:9] : 0 Firmware Activation Notices Not Supported
  [8:8] : 0 Namespace Attribute Changed Event Not Supported

ctratt : 0x2
  [9:9] : 0 UUID List Not Supported
  [7:7] : 0 Namespace Granularity Not Supported
  [5:5] : 0 Predictable Latency Mode Not Supported
  [4:4] : 0 Endurance Groups Not Supported
  [3:3] : 0 Read Recovery Levels Not Supported
  [2:2] : 0 NVM Sets Not Supported
  [1:1] : 0x1 Non-Operational Power State Permissive Supported
  [0:0] : 0 128-bit Host Identifier Not Supported

rrls : 0
cntrltype : 1
  [7:2] : 0 Reserved
  [1:0] : 0x1 I/O Controller
fguid :
crdt1 : 0
crdt2 : 0
crdt3 : 0
oacs : 0x17
  [9:9] : 0 Get LBA Status Capability Not Supported
  [8:8] : 0 Doorbell Buffer Config Not Supported
  [7:7] : 0 Virtualization Management Not Supported
  [6:6] : 0 NVMe-MI Send and Receive Not Supported
  [5:5] : 0 Directives Not Supported
  [4:4] : 0x1 Device Self-test Supported
  [3:3] : 0 NS Management and Attachment Not Supported
  [2:2] : 0x1 FW Commit and Download Supported
  [1:1] : 0x1 Format NVM Supported
  [0:0] : 0x1 Security Send and Receive Supported

acl : 3
aerl : 3
frmw : 0x12
  [4:4] : 0x1 Firmware Activate Without Reset Supported
  [3:1] : 0x1 Number of Firmware Slots
  [0:0] : 0 Firmware Slot 1 Read/Write

lpa : 0x1e
  [4:4] : 0x1 Persistent Event log Supported
  [3:3] : 0x1 Telemetry host/controller initiated log page Supported
  [2:2] : 0x1 Extended data for Get Log Page Supported
  [1:1] : 0x1 Command Effects Log Page Supported
  [0:0] : 0 SMART/Health Log Page per NS Not Supported

elpe : 62
npss : 4
avscc : 0x1
  [0:0] : 0x1 Admin Vendor Specific Commands uses NVMe Format

apsta : 0x1
  [0:0] : 0x1 Autonomous Power State Transitions Supported

wctemp : 350
cctemp : 352
mtfa : 100
hmpre : 16384
hmmin : 16384
tnvmcap : 250059350016
unvmcap : 0
rpmbs : 0
  [31:24]: 0 Access Size
  [23:16]: 0 Total Size
  [5:3] : 0 Authentication Method
  [2:0] : 0 Number of RPMB Units

edstt : 30
dsto : 0
fwug : 4
kas : 0
hctma : 0x1
  [0:0] : 0x1 Host Controlled Thermal Management Supported

mntmt : 273
mxtmt : 350
sanicap : 0xa0000002
  [31:30] : 0x2 Media is additionally modified after sanitize operation completes successfully
  [29:29] : 0x1 No-Deallocate After Sanitize bit in Sanitize command Not Supported
  [2:2] : 0 Overwrite Sanitize Operation Not Supported
  [1:1] : 0x1 Block Erase Sanitize Operation Supported
  [0:0] : 0 Crypto Erase Sanitize Operation Not Supported

hmminds : 1024
hmmaxd : 16
nsetidmax : 0
endgidmax : 0
anatt : 0
anacap : 0
  [7:7] : 0 Non-zero group ID Not Supported
  [6:6] : 0 Group ID does not change
  [4:4] : 0 ANA Change state Not Supported
  [3:3] : 0 ANA Persistent Loss state Not Supported
  [2:2] : 0 ANA Inaccessible state Not Supported
  [1:1] : 0 ANA Non-optimized state Not Supported
  [0:0] : 0 ANA Optimized state Not Supported

anagrpmax : 0
nanagrpid : 0
pels : 96
sqes : 0x66
  [7:4] : 0x6 Max SQ Entry Size (64)
  [3:0] : 0x6 Min SQ Entry Size (64)

cqes : 0x44
  [7:4] : 0x4 Max CQ Entry Size (16)
  [3:0] : 0x4 Min CQ Entry Size (16)

maxcmd : 256
nn : 1
oncs : 0x57
  [7:7] : 0 Verify Not Supported
  [6:6] : 0x1 Timestamp Supported
  [5:5] : 0 Reservations Not Supported
  [4:4] : 0x1 Save and Select Supported
  [3:3] : 0 Write Zeroes Not Supported
  [2:2] : 0x1 Data Set Management Supported
  [1:1] : 0x1 Write Uncorrectable Supported
  [0:0] : 0x1 Compare Supported

fuses : 0
  [0:0] : 0 Fused Compare and Write Not Supported

fna : 0
  [2:2] : 0 Crypto Erase Not Supported as part of Secure Erase
  [1:1] : 0 Crypto Erase Applies to Single Namespace(s)
  [0:0] : 0 Format Applies to Single Namespace(s)

vwc : 0x7
  [2:1] : 0x3 The Flush command supports NSID set to FFFFFFFFh
  [0:0] : 0x1 Volatile Write Cache Present

awun : 255
awupf : 0
nvscc : 1
  [0:0] : 0x1 NVM Vendor Specific Commands uses NVMe Format

nwpc : 0
  [2:2] : 0 Permanent Write Protect Not Supported
  [1:1] : 0 Write Protect Until Power Supply Not Supported
  [0:0] : 0 No Write Protect and Write Protect Namespace Not Supported

acwu : 0
sgls : 0
  [1:0] : 0 Scatter-Gather Lists Not Supported

mnan : 0
subnqn : nqn.2020-04.com.kingston:nvme:nvm-subsystem-sn-50026B7685F13C1E
ioccsz : 0
iorcsz : 0
icdoff : 0
ctrattr : 0
  [0:0] : 0 Dynamic Controller Model

msdbd : 0
ps 0 : mp:5.00W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:2.40W operational enlat:0 exlat:0 rrt:1 rrl:1 rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:1.90W operational enlat:0 exlat:0 rrt:2 rrl:2 rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0500W non-operational enlat:3000 exlat:2000 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0035W non-operational enlat:10000 exlat:40000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:- active_power:-

AlexPdx007 commented 1 year ago

...and perhaps this helps too in creating the correct command line :) :

smartctl --all /dev/nvme0

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.98] (local build)

=== START OF INFORMATION SECTION ===
Model Number: KINGSTON SNV2S250G
Serial Number: 50026B7685F13C1E
Firmware Version: ELFK0S.4
PCI Vendor/Subsystem ID: 0x2646
IEEE OUI Identifier: 0x0026b7
Total NVM Capacity: 250,059,350,016 [250 GB]
Unallocated NVM Capacity: 0
Controller ID: 0
NVMe Version: 1.4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 0026b7 685f13c1e5
Local Time is: Fri Mar 17 13:42:42 2023 UTC
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057): Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x1e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 77 Celsius
Critical Comp. Temp. Threshold: 79 Celsius

Supported Power States
St  Op  Max      Active  Idle  RL  RT  WL  WT  Ent_Lat  Ex_Lat
0   +   5.00W    -       -     0   0   0   0   0        0
1   +   2.40W    -       -     1   1   1   1   0        0
2   +   1.90W    -       -     2   2   2   2   0        0
3   -   0.0500W  -       -     3   3   3   3   3000     2000
4   -   0.0035W  -       -     4   4   4   4   10000    40000

Supported LBA Sizes (NSID 0x1)
Id  Fmt  Data  Metadt  Rel_Perf
0   +    512   0       1
1   -    4096  0       0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 35 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 904,147 [462 GB]
Data Units Written: 652,254 [333 GB]
Host Read Commands: 9,783,575
Host Write Commands: 10,732,568
Controller Busy Time: 18
Power Cycles: 242
Power On Hours: 20
Unsafe Shutdowns: 21
Media and Data Integrity Errors: 0
Error Information Log Entries: 314
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 2: 49 Celsius

Error Information (NVMe Log 0x01, 16 of 63 entries)
Num  ErrCount  SQId  CmdId   Status  PELoc  LBA  NSID  VS
0    314       0     0x201c  0xc005  0x028  0    0     -
1    313       0     0x4009  0xc005  0x028  0    0     -
2    312       0     0x200f  0xc004  0x028  0    0     -
3    311       0     0x200e  0xc004  0x028  0    0     -
4    310       0     0x8002  0xc005  0x028  0    0     -