Closed Ondjultomte closed 8 months ago
It seems that there is data missing! Perhaps from when I deleted namespaces. I rescanned after each step, though. How can I get my data back?
When I started, it looked like this:
Node Generic SN Model Namespace Usage Format FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme3n1 /dev/ng3n1 PHOC217500MZ058A INTEL SSDPEK1A058GA 1 58,98 GB / 58,98 GB 512 B + 0 B U5110550
/dev/nvme2n1 /dev/ng2n1 210246465103 WDC WDS100T2B0C-00PXH0 1 1,00 TB / 1,00 TB 512 B + 0 B 211210WD
/dev/nvme1n1 /dev/ng1n1 PHLJ2276003G2P0BGN VO002000KWVVC 1 2,00 TB / 2,00 TB 4 KiB + 0 B 4ICRHPK3
/dev/nvme0n1 /dev/ng0n1 PHLJ227600232P0BGN VO002000KWVVC 1 2,00 TB / 2,00 TB 4 KiB + 0 B 4ICRHPK3
pve# nvme id-ctrl /dev/nvme0 | grep mcap
unvmcap : 0
grep: (standard input): binary file matches
pve# nvme id-ctrl /dev/nvme0n1 | grep mcap
unvmcap : 0
grep: (standard input): binary file matches
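(The `binary file matches` message just means grep flagged the piped `id-ctrl` output as binary because some fields contain raw bytes; forcing text mode with `-a`/`--text` should make it print the matching lines instead. A minimal demonstration, with an embedded NUL byte standing in for the device output:)

```shell
# grep treats a stream containing NUL bytes as binary and prints only
# "binary file matches"; -a (--text) makes it print the matching lines.
# On the real device this would be: nvme id-ctrl /dev/nvme0 | grep -a mcap
printf 'tnvmcap : 2 000 398 934 016\n\0unvmcap : 0\n' | grep -a mcap
```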
pve# nvme id-ctrl /dev/nvme0n1
NVME Identify Controller:
vid : 0x8086
ssvid : 0x1590
sn : PHLJ227600232P0BGN
mn : VO002000KWVVC
fr : 4ICRHPK3
rab : 0
ieee : 5cd2e4
cmic : 0
mdts : 5
cntlid : 0
ver : 0x10200
rtd3r : 0x1e8480
rtd3e : 0x2dc6c0
oaes : 0x200
ctratt : 0
rrls : 0
cntrltype : 0
fguid : 00000000-0000-0000-0000-000000000000
crdt1 : 0
crdt2 : 0
crdt3 : 0
nvmsr : 0
vwci : 0
mec : 1
oacs : 0xe
acl : 3
aerl : 3
frmw : 0x18
lpa : 0xe
elpe : 63
npss : 2
avscc : 0
apsta : 0
wctemp : 338
cctemp : 348
mtfa : 0
hmpre : 0
hmmin : 0
tnvmcap : 2 000 398 934 016
unvmcap : 0
rpmbs : 0
edstt : 0
dsto : 0
fwug : 0
kas : 0
hctma : 0
mntmt : 0
mxtmt : 0
sanicap : 0
hmminds : 0
hmmaxd : 0
nsetidmax : 0
endgidmax : 0
anatt : 0
anacap : 0
anagrpmax : 0
nanagrpid : 0
pels : 0
domainid : 0
megcap : 0
sqes : 0x66
cqes : 0x44
maxcmd : 0
nn : 128
oncs : 0x6
fuses : 0
fna : 0
vwc : 0
awun : 0
awupf : 0
icsvscc : 0
nwpc : 0
acwu : 0
ocfs : 0
sgls : 0
mnan : 0
maxdna : 0
maxcna : 0
subnqn :
ioccsz : 0
iorcsz : 0
icdoff : 0
fcatt : 0
msdbd : 0
ofcs : 0
ps 0 : mp:14.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
active_power_workload:-
ps 1 : mp:10.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
active_power_workload:-
ps 2 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
active_power_workload:-
So I tried to troubleshoot this: in case there were unseen namespaces, I tried to delete them.
Oddly, the status message that comes back is not the same for every NSID.
pve# nvme detach-ns /dev/nvme0 -n 3 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme1 -n 3 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme1 -n 4 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 4 -c 0
NVMe status: Namespace Not Attached: The request to detach the controller could not be completed because the controller is not attached to the namespace(0x411a)
pve#
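Those two failures actually decode to different things, which is a useful hint. Per the NVMe base spec, the value nvme-cli prints packs DNR in bit 14, the Status Code Type in bits 10:8, and the Status Code in bits 7:0; a small sketch to pull them apart:

```shell
# Split the status value nvme-cli reports into its spec-defined fields.
decode() {
    printf 'status=0x%04x dnr=%d sct=0x%x sc=0x%02x\n' \
        "$1" "$(( ($1 >> 14) & 1 ))" "$(( ($1 >> 8) & 0x7 ))" "$(( $1 & 0xff ))"
}
decode 0x4002   # generic status (SCT 0), SC 0x02: Invalid Field in Command
decode 0x411a   # command-specific (SCT 1), SC 0x1a: Namespace Not Attached
```

So 0x4002 looks like the controller rejecting the NSID outright, while 0x411a suggests the namespace is allocated but simply not attached.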
and list ns
pve# nvme list-ns /dev/nvme0
[ 0]:0x1
[ 1]:0x2
pve# nvme list-ns /dev/nvme1
[ 0]:0x1
[ 1]:0x2
What can I do? Have I broken the drives?
IIRC, this controller (Cliffdale) requires that namespaces be created and deleted in order; otherwise it won't have a sufficient extent of unallocated space. It's not spec-compliant behavior, just a quirk of the device.
So can I delete all ns and restart?
Any specific order required?
Yah, if you delete everything, then you should have full capacity to make new namespaces.
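A dry-run sketch of that wipe-and-rebuild sequence (the device path, NSIDs, and sizes are placeholders for this thread's setup; the `run` wrapper prints each command instead of executing it, and `delete-ns` is destructive when run for real):

```shell
# Dry run: echo the intended nvme-cli sequence instead of executing it.
run() { echo "+ $*"; }          # drop the echo wrapper to run for real
dev=/dev/nvme0                  # placeholder controller handle

# delete every namespace, highest NSID first
for ns in 5 4 3 2 1; do
    run nvme delete-ns "$dev" -n "$ns"
done

# recreate in ascending NSID order; --nsze/--ncap are in LBAs and
# --flbas picks the LBA-format index from id-ns (values are placeholders;
# 122070312 x 4 KiB LBAs is roughly a 500 GB namespace)
run nvme create-ns "$dev" --nsze=122070312 --ncap=122070312 --flbas=0
run nvme attach-ns "$dev" -n 1 -c 0
run nvme ns-rescan "$dev"
```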
Btw, if I wanted to change the size of two of my four namespaces, do I need to remove all of them and start over, or can I remove one at a time, i.e. first remove the last namespace, and so on? Will that reclaim the space of each removed/deleted namespace so that I can then create new ones?
I'm not sure. My understanding is that capacity from deleted namespaces won't be available for new namespaces until you delete all namespaces created before it. But I don't have any of these anymore so I can't test that.
OK, I'll test it if I have the time and report back with results. But from my testing above it seems likely that you need to start from scratch.
I needed to make some adjustments and started deleting the namespaces from 5 down to 1.
pve# nvme list
Node Generic SN Model Namespace Usage Format FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme3n1 /dev/ng3n1 PHOC217500MZ058A INTEL SSDPEK1A058GA 1 58,98 GB / 58,98 GB 512 B + 0 B U5110550
/dev/nvme2n1 /dev/ng2n1 210246465103 WDC WDS100T2B0C-00PXH0 1 1,00 TB / 1,00 TB 512 B + 0 B 211210WD
pve# smartctl -x /dev/nvme0 | grep Capa
Total NVM Capacity: 2 000 398 934 016 [2,00 TB]
Unallocated NVM Capacity: 255 568 470 016 [255 GB]
pve# smartctl -x /dev/nvme1 | grep Capa
Total NVM Capacity: 2 000 398 934 016 [2,00 TB]
Unallocated NVM Capacity: 2 000 398 934 016 [2,00 TB]
pve#
nvme0 still reports utilization... really buggy.
I tried some resets to see if that helps, but no:
pve# nvme reset /dev/nvme0
pve# nvme subsystem-reset /dev/nvme0
Subsystem-reset: NVM Subsystem Reset not supported.
pve# smartctl -x /dev/nvme0 | grep Capa
Total NVM Capacity: 2 000 398 934 016 [2,00 TB]
Unallocated NVM Capacity: 255 568 470 016 [255 GB]
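As a sanity check on those numbers (smartctl reports capacities in bytes; the bracketed figures are decimal GB), a drive with no namespaces left should be fully unallocated, yet nvme0 is still holding on to most of its capacity:

```shell
# Capacities from the smartctl output above, in bytes.
tnvmcap=2000398934016    # Total NVM Capacity
unvmcap=255568470016     # Unallocated NVM Capacity
echo "free: $(( unvmcap / 1000000000 )) GB"
echo "still allocated: $(( (tnvmcap - unvmcap) / 1000000000 )) GB"
```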
I rebooted, but there is still only 255 GB unallocated on the drive now...
What can I do with this drive? The namespaces seem to be stuck: there are none left, yet capacity is still allocated.
pve# nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid : 0x8086
ssvid : 0x1590
sn : PHLJ227600232P0BGN
mn : VO002000KWVVC
fr : 4ICRHPK3
rab : 0
ieee : 5cd2e4
cmic : 0
mdts : 5
cntlid : 0
ver : 0x10200
rtd3r : 0x1e8480
rtd3e : 0x2dc6c0
oaes : 0x200
ctratt : 0
rrls : 0
cntrltype : 0
fguid : 00000000-0000-0000-0000-000000000000
crdt1 : 0
crdt2 : 0
crdt3 : 0
nvmsr : 0
vwci : 0
mec : 1
oacs : 0xe
acl : 3
aerl : 3
frmw : 0x18
lpa : 0xe
elpe : 63
npss : 2
avscc : 0
apsta : 0
wctemp : 338
cctemp : 348
mtfa : 0
hmpre : 0
hmmin : 0
tnvmcap : 2 000 398 934 016
unvmcap : 255 568 470 016
rpmbs : 0
edstt : 0
dsto : 0
fwug : 0
kas : 0
hctma : 0
mntmt : 0
mxtmt : 0
sanicap : 0
hmminds : 0
hmmaxd : 0
nsetidmax : 0
endgidmax : 0
anatt : 0
anacap : 0
anagrpmax : 0
nanagrpid : 0
pels : 0
domainid : 0
megcap : 0
sqes : 0x66
cqes : 0x44
maxcmd : 0
nn : 128
oncs : 0x6
fuses : 0
fna : 0
vwc : 0
awun : 0
awupf : 0
icsvscc : 0
nwpc : 0
acwu : 0
ocfs : 0
sgls : 0
mnan : 0
maxdna : 0
maxcna : 0
subnqn :
ioccsz : 0
iorcsz : 0
icdoff : 0
fcatt : 0
msdbd : 0
ofcs : 0
ps 0 : mp:14.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
active_power_workload:-
ps 1 : mp:10.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
active_power_workload:-
ps 2 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
active_power_workload:-
Are there some namespaces allocated but detached? There's an identify option to list detached namespaces. I don't know it off the top of my head (writing this from a phone).
any of these?
pve# nvme | grep list
  list                      List all NVMe devices and namespaces on machine
  list-subsys               List nvme subsystems
  list-ns                   Send NVMe Identify List, display structure
  list-ctrl                 Send NVMe Identify Controller List, display structure
  list-secondary            List Secondary Controllers associated with a Primary Controller
  list-endgrp               Send NVMe Identify Endurance Group List, display structure
  changed-ns-list-log       Retrieve Changed Namespace List, show it
  supported-cap-config-log  Retrieve the list of Supported Capacity Configuration Descriptors
pve# nvme list-ns /dev/nvme0
pve# nvme list-ns /dev/nvme2
pve#
> Is there some namespaces allocated but detached? There's an identify option to list detached namespaces. Don't know it off the top of my head (writing this from a phone).
I tried deleting every namespace number that could have been allocated but detached, just in case.
I created five namespaces in total, so it should only be 1-5.
pve# nvme detach-ns /dev/nvme0 -n 5 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 4 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 3 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 2 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 1 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 6 -c 0
NVMe status: Namespace Not Attached: The request to detach the controller could not be completed because the controller is not attached to the namespace(0x411a)
pve# nvme detach-ns /dev/nvme0 -n 7 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 8 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve# nvme detach-ns /dev/nvme0 -n 9 -c 0
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
pve#
It's an option on the list-ns I think (run it with --help). The default just lists attached namespaces, the special argument will list all namespaces.
> It's an option on the list-ns I think (run it with --help). The default just lists attached namespaces, the special argument will list all namespaces.
pve# nvme list-ns
list-ns: Invalid argument
Usage: nvme list-ns
For the specified controller handle, show the namespace list in the associated NVMe subsystem, optionally starting with a given nsid.
Options:
[ --namespace-id=<NUM>, -n <NUM> ] --- first nsid returned list should
start from
[ --csi=<NUM>, -y <NUM> ] --- I/O command set identifier
[ --all, -a ] --- show all namespaces in the
subsystem, whether attached or
inactive
[ --output-format=<FMT>, -o <FMT> ] --- Output format: normal|json
pve# nvme list-ns -all /dev/nvme0
[ 0]:0x6
pve#
Yah, list-ns --all. If that still doesn't show anything, then I'm out of ideas.
It showed this:
[ 0]:0x6
whereas the nvme that has all its free space shows nothing.
Oh, it shows namespace 6 is detached. If you delete namespace 6, does that recover your missing capacity?
Oh man, it was a 6th namespace!
Now all the space is back. Can't thank you enough for your help!
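To summarize the recovery for anyone landing here with the same symptom: list inactive namespaces, delete the orphaned NSID, and rescan. A dry-run sketch (the `run` wrapper prints the commands; drop it to execute, and note that `delete-ns` destroys any data in that namespace):

```shell
# Dry run of the recovery sequence; drop the echo wrapper to run for real.
run() { echo "+ $*"; }
dev=/dev/nvme0

run nvme list-ns --all "$dev"    # shows e.g. "[ 0]:0x6": NSID 6 allocated but inactive
run nvme delete-ns "$dev" -n 6   # deleting it returns the capacity to unvmcap
run nvme ns-rescan "$dev"        # have the kernel re-read the namespace map
```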
I'm trying to add namespaces to my 2 TB drives.
It worked fine for the first two namespaces.
pve# smartctl -x /dev/nvme0
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.5.11-7-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       VO002000KWVVC
Serial Number:                      PHLJ227600232P0BGN
Firmware Version:                   4ICRHPK3
PCI Vendor ID:                      0x8086
PCI Vendor Subsystem ID:            0x1590
IEEE OUI Identifier:                0x5cd2e4
Total NVM Capacity:                 2 000 398 934 016 [2,00 TB]
Unallocated NVM Capacity:           282 412 015 616 [282 GB]
Controller ID:                      0
NVMe Version:                       1.2
Number of Namespaces:               128
Local Time is:                      Sat Jan 27 00:33:20 2024 CET
Firmware Updates (0x18):            4 Slots, no Reset required
Optional Admin Commands (0x000e):   Format Frmw_DL NS_Mngmt
Optional NVM Commands (0x0006):     Wr_Unc DS_Mngmt
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     65 Celsius
Critical Comp. Temp. Threshold:     75 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +    14.00W       -        -    0  0  0  0        0       0
 1 +    10.00W       -        -    0  0  0  0        0       0
 2 +     9.00W       -        -    0  0  0  0        0       0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        37 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    91 [46,5 MB]
Data Units Written:                 0
Host Read Commands:                 1 360
Host Write Commands:                0
Controller Busy Time:               0
Power Cycles:                       10
Power On Hours:                     168
Unsafe Shutdowns:                   7
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
pve#
pve# smartctl -x /dev/nvme1
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.5.11-7-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       VO002000KWVVC
Serial Number:                      PHLJ2276003G2P0BGN
Firmware Version:                   4ICRHPK3
PCI Vendor ID:                      0x8086
PCI Vendor Subsystem ID:            0x1590
IEEE OUI Identifier:                0x5cd2e4
Total NVM Capacity:                 2 000 398 934 016 [2,00 TB]
Unallocated NVM Capacity:           389 786 198 016 [389 GB]
Controller ID:                      0
NVMe Version:                       1.2
Number of Namespaces:               128
Local Time is:                      Sat Jan 27 00:36:30 2024 CET
Firmware Updates (0x18):            4 Slots, no Reset required
Optional Admin Commands (0x000e):   Format Frmw_DL NS_Mngmt
Optional NVM Commands (0x0006):     Wr_Unc DS_Mngmt
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     65 Celsius
Critical Comp. Temp. Threshold:     75 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +    14.00W       -        -    0  0  0  0        0       0
 1 +    10.00W       -        -    0  0  0  0        0       0
 2 +     9.00W       -        -    0  0  0  0        0       0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        35 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    33 [16,8 MB]
Data Units Written:                 0
Host Read Commands:                 618
Host Write Commands:                0
Controller Busy Time:               0
Power Cycles:                       10
Power On Hours:                     168
Unsafe Shutdowns:                   7
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged