linux-nvme / libnvme

C Library for NVM Express on Linux
GNU Lesser General Public License v2.1
174 stars 130 forks source link

"could not alloc memory for discovery log page" and mount failure with Corsair 8TB drive over RDMA nvmet #807

Open davidrohr opened 8 months ago

davidrohr commented 8 months ago

I am posting here as libnvme issue, but I am not really sure if the problem is in libnvme, nvme-cli, the kernel or somewhere else.

I have 2 computers, pc1 and pc2, connected via ConnectX3 Infiniband adapters, with pc1 mounting 3 remote nvme SSDs from pc2: a samsung 990 Pro 2TB and 2 Corsair MP600 Pro 8TB.

I am having 2 problems:

I tried the following things already:

So in summary, everything works when connecting via tcp, but when connecting via rdma the nvme discovery fails since the numrec entry is bogus, and with rdma I can mount filesystems from the 2TB samsung SSD correctly, but filesystems from the 8TB corsair SSDs fail to mount, despite the data I am reading from the disk with dd is fully correct.

I would be thankful for any advise, or recommendation where to ask. I you'd need me to conduct any tests, please let me know.

martin-belanger commented 8 months ago

Not sure this will make any difference for RDMA. I'm more familiar with TCP.

For TCP, the Service ID used for Discovery is typically 8009. And the Service ID for I/O is 4420. You could try nvme discover -t rdma -a [IP] -s 8009 to see if that makes a difference.

igaw commented 8 months ago

This looks like the RMDA transport is having an issue. libnvme/nvme-cli doesn't differentiate between the transports. Also the nvme subsystem core doesn't differentiate between the transport.

What you could do is to dump the discovery log page for TCP and RDMA (nvme get-log). This would give us a clue. Also try to add '-vvv' when doing the nvme discover command. This also sheds some light on nvme_discovery_log

FWIW, Linux's nvmet implementation (aka soft target) doesn't implement the discovery controller on port 8009.

davidrohr commented 8 months ago

For the service ID: I am explicity setting this when I set up the nvmet on the server. I have changed it to 8009, but it doesn't make any difference. Also, I am quite sure it connects to the right service ID, since when the ID differs between client and server, nvme discover / connect complain that they cannot connect.

I ran nvme discover -vvv [...]. The logs are attached, but don't tell anything to me. logs.tar.gz

Could you be more specific which log page to obtain with nvme get-log, i.e. which log-id and how many bytes. I could not find any information which one is the discovery log.

igaw commented 8 months ago

The logs show that the initial get header already is corrupted for RDMA. So it's not that the loop is going wrong.

TCP:

nvme3: get header (try 0/10)
nvme3: get 4 records (genctr 9)
nvme3: get header again

RDMA:

nvme3: get header (try 0/10)
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

I just realized we need a block device in order to run the get-log command. So you need first to connected to the discovery controller and then do a nvme get-log:

# nvme connect -t rdma -a XXX -s 4420 -n nqn.2014-08.org.nvmexpress.discovery
[ get the nvme decvice from dmesg, it wont show up in nvme list]
# nvme get-log /dev/nvme$X -i 0x70 -l 4096
davidrohr commented 8 months ago

The logs show that the initial get header already is corrupted for RDMA. So it's not that the loop is going wrong.

Yes, as I wrote in the original post, the numrec field in the header has a bogus value. Thus it computes a very large size to allocate for the log page, and fails the allocation.

# nvme get-log /dev/nvme$X -i 0x70 -l 4096

thx, the 0x70 was what I was missing. Will get that tonight connecting to the server both via TCP and via RDMA, and will post the logs here.

Thx for the quick support!

igaw commented 8 months ago

I was just making sure the loop works. We had some bugs in this loop thus I am a bit paranoid.

FWIW, in the base spec the output is specified in 5.16.1.23 Discovery Log Page (Log Page Identifier 70h)

davidrohr commented 8 months ago

I cannot get a log for id 70h / 112:

~ # nvme get-log /dev/nvme0n1 --log-id=112 --log-len 4096
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len 4096
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)

While e.g. obtaining the log for id 2 (smart) works, so the nvme get-log is working in general, just not for id 70h.

igaw commented 8 months ago

Are you sure you connected to the discovery controller. A discovery controller doesn't have name spaces. So it's important to use the correct NQN: -n nqn.2014-08.org.nvmexpress.discovery

hreinecke commented 8 months ago

The header is not necessarily corrupted. We never evaluate the CQE status when calling nvme_get_log_page(); it surely will return an errno when the ioctl fails, but unfortunately an ioctl success just means that we got an CQE correctly, not that the CQE does not have an error status ...

hreinecke commented 8 months ago

Check if https://github.com/linux-nvme/libnvme/pull/810 helps here.

davidrohr commented 8 months ago

-n nqn.2014-08.org.nvmexpress.discovery

I am not fully sure what to do :(:

~ # nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            6722de001c9e5e3af9e8 Linux                                    0x1          8.00  TB /   8.00  TB    512   B +  0 B   6.8.2-ge
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len=512 -n 1
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len=512 -n nqn.2014-08.org.nvmexpress.discovery
Expected word argument for 'namespace-id' but got 'nqn.2014-08.org.nvmexpress.discovery'!
~ # nvme get-log /dev/nvme0n1 --log-id=70h --log-len=512
NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x6002)

I have done nvme connect, and the device is there. I can read it with dd, but mounting works only when I connect via tcp, not rdma. Where shall I use -n nqn.2014-08.org.nvmexpress.discovery? As you see above, nvme get-log doesn't accept that option, it expects a namespace. If I put in the namespace 1 of the device, it still doesn't work.

hreinecke commented 8 months ago

'discovery' works without a controller connected. You could try with

nvme discover -t rdma -a [IP] -s 4420
davidrohr commented 8 months ago

'discovery' works without a controller connected. You could try with

nvme discover -t rdma -a [IP] -s 4420

But this is what I did before, see the logs here: https://github.com/linux-nvme/libnvme/issues/807#issuecomment-2033022371 It works if I connect over tcp, but over rdma it fails with

could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory
davidrohr commented 8 months ago

Check if #810 helps here.

I tried with this PR on top of master, but I am still getting:

~ # nvme discover -t rdma -a 192.168.110.9 -s 4421
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

I deployed the new version only on the client so far, do I also need to update it on the server?

hreinecke commented 8 months ago

Can you try with this patch:

diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
index 96ba9d4b..44a8ac62 100644
--- a/src/nvme/fabrics.c
+++ b/src/nvme/fabrics.c
@@ -1106,7 +1106,7 @@ retry_header:
                log = __nvme_alloc(sizeof(*log) + entries_size);
                if (!log) {
                        nvme_msg(r, LOG_ERR,
-                                "could not alloc memory for discovery log page\n");
+                                "could not alloc %" PRIu64 " discovery log records\n", numrec);
                        errno = ENOMEM;
                        return NULL;
                }
davidrohr commented 8 months ago

Can you try with this patch:

This gives

qon /usr/portage/sys-libs/libnvme # nvme discover -t rdma -a 192.168.110.9 -s 4421
could not alloc 7004333495895549496 discovery log records
failed to get discovery log: Cannot allocate memory

As I wrote in the original post, this numrec entry seems always bogus.

igaw commented 7 months ago

Again, here the exact commands you need to execute:

# nvme connect -t rdma -a XXX -s 4420 -n nqn.2014-08.org.nvmexpress.discovery
[ get the nvme decvice from dmesg, it wont show up in nvme list]
# nvme get-log /dev/nvme$X -i 0x70 -l 4096

Note the device is /dev/nvme1 and not /dev/nvme1n1. And you wont see this device listed by nvme list. dmesg shows when you have connected to the discovery controller.

davidrohr commented 7 months ago

Again, here the exact commands you need to execute:

@igaw : OK, thx a lot, now I understand, wasn't clear to me that the discovery controler is an nvme device by itself.

Please find the logs here: logs.tar.gz

I set up nvmet for tcp over service 4420 and over rdma over 4421, both exposing 3 subsystems with 1 NVME device each. The log contains the 0x70 page for both of them. (Note that I tried to have 1 subsystem with 3 devices, or 3 subsystems with 1 device, and I also changed the service numbers, i.e. had both of them on 4420 separately, but this doesn't change anything. I am only including the log for one case.)

igaw commented 7 months ago

───────┬─────────────────────────────────────────────────────────────────────────────────────────
       │ File: tcp.log
───────┼─────────────────────────────────────────────────────────────────────────────────────────
   1   │ Device:nvme3 log-id:112 namespace-id:0xffffffff
   2   │        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
   3   │ 0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
───────┴─────────────────────────────────────────────────────────────────────────────────────────
───────┬─────────────────────────────────────────────────────────────────────────────────────────
       │ File: rdma.log
───────┼─────────────────────────────────────────────────────────────────────────────────────────
   1   │ Device:nvme4 log-id:112 namespace-id:0xffffffff
   2   │        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
   3   │ 0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
───────┴─────────────────────────────────────────────────────────────────────────────────────────

Both transport have the same header, genctr is 9 and numrec is 4. But the command libnvme issues gets something wrong for rdma. Very strange.

igaw commented 7 months ago

@davidrohr could you retry with the latest version from the git trees (libnvme and nvme-cli). @hreinecke updated the status and error handling in the discovery code. There was a bit of confusion in the error handling.

see the readme of nvme-cli to make sure you get the libnvme correct.

davidrohr commented 7 months ago

@igaw : Thx, I rebuild libnvme and nvme-cli from the upstream repo, and now the discovery over rdma is failing with Protocol erro. Also the nvme get-log command fails with the same error now over RDMA, while everything works over TCP.

See the full log here:

qon /usr/portage/sys-apps/nvme-cli # nvme discover -t tcp -a 192.168.110.9 -s 4420

Discovery Log Number of Records 4, Generation counter 9
=====Discovery Log Entry 0======
trtype:  tcp
adrfam:  ipv4
subtype: current discovery subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  nqn.2014-08.org.nvmexpress.discovery
traddr:  192.168.110.9
eflags:  none
sectype: none
=====Discovery Log Entry 1======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  qon1
traddr:  192.168.110.9
eflags:  none
sectype: none
=====Discovery Log Entry 2======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  qon2
traddr:  192.168.110.9
eflags:  none
sectype: none
=====Discovery Log Entry 3======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  qon3
traddr:  192.168.110.9
eflags:  none
sectype: none
qon /usr/portage/sys-apps/nvme-cli # nvme discover -t rdma -a 192.168.110.9 -s 4421
failed to identify controller, error Protocol error
failed to add controller, error Protocol error
qon /usr/portage/sys-apps/nvme-cli # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma -a 192.168.110.9 -s 4421
connecting to device: nvme3
qon /usr/portage/sys-apps/nvme-cli # nvme get-log /dev/nvme3 -i 0x70 -l 8192 > rdma.log
log page: Protocol error
qon /usr/portage/sys-apps/nvme-cli # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t tcp -a 192.168.110.9 -s 4420
connecting to device: nvme4
qon /usr/portage/sys-apps/nvme-cli # nvme get-log /dev/nvme4 -i 0x70 -l 8192 > rdma.log
qon /usr/portage/sys-apps/nvme-cli # cat rdma.log 
Device:nvme4 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
[...]
igaw commented 7 months ago

Alright, this means the error handling catches now the failing commands and we don't blindly continue. This is better but obviously we haven't figured out why commands fail.

The first failing command is already the identify controller. I don't really see how TCP can succeed and RDMA fails. I think it's possible to collect the wire conversation between host and target using wireshark/tcpdump. Could you try to record the RDMA attempt?

davidrohr commented 7 months ago

Hm, I just tried with wireshark, and I can capture the traffic when connecting over TCP (also connecting with TCP over IPoIB on the Infiniband interface). But when discovering with RDMA, wireshark does not catch anything. Also the

qon ~ # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma -a 192.168.110.9 -s 4421
connecting to device: nvme3

which apparently succeeds does not show anything in wireshark.

I can have a look another day if I manage to record something, or if I can trouble-shoot it by adjusting the libnvme code.

igaw commented 7 months ago

Thanks for the experiment. I haven't played with RDMA yet. So it was just a guess it might work. Maybe post the output from the discover command with -vvv. Meanwhile I try to figure out how to proceed.

davidrohr commented 7 months ago

ok, interestingly, trying a few times, I have seen that now sometimes I get the protocol error, but sometimes the memory allocation error I had before. Not sure why yesterday I was getting always the protocol error. Anyhow, logs with -vvv for both cases are below.

qon ~ # nvme discover -t rdma -a 192.168.110.9 -s 4421 -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 path nvme0c0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 path nvme1c1n1
scan controller nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme2
scan controller nvme2 path nvme2c2n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys0 namespace nvme0n1
scan subsystem nvme-subsys1
scan subsystem nvme-subsys1 namespace nvme1n1
scan subsystem nvme-subsys2
scan subsystem nvme-subsys2 namespace nvme2n1
Error opening /etc/nvme/config.json, No such file or directory
kernel supports: instance cntlid transport traddr trsvcid nqn queue_size nr_io_queues reconnect_delay ctrl_loss_tmo keep_alive_tmo hostnqn host_traddr host_iface hostid duplicate_connect disable_sqflow hdr_digest data_digest nr_write_queues nr_poll_queues tos fast_io_fail_tmo discovery 
option "keyring" ignored
connect ctrl, 'nqn=nqn.2014-08.org.nvmexpress.discovery,transport=rdma,traddr=192.168.110.9,trsvcid=4421,hostnqn=nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95,hostid=990e5a47-9734-32f6-9ee6-3497f631db95,ctrl_loss_tmo=600'
connect ctrl, response 'instance=3,cntlid=4'
nvme3: nqn.2014-08.org.nvmexpress.discovery connected
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys3/nvme3
opcode       : 06
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00001000
metadata_len : 00000000
addr         : 7ffd78a76b90
metadata     : 0
cdw10        : 00000001
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 050dff80
err          : 0
latency      : 71 us
failed to identify controller, error Protocol error
nvme3: nqn.2014-08.org.nvmexpress.discovery disconnected
failed to add controller, error Protocol error
qon ~ # nvme discover -t rdma -a 192.168.110.9 -s 4421 -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 path nvme0c0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 path nvme1c1n1
scan controller nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme2
scan controller nvme2 path nvme2c2n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys0 namespace nvme0n1
scan subsystem nvme-subsys1
scan subsystem nvme-subsys1 namespace nvme1n1
scan subsystem nvme-subsys2
scan subsystem nvme-subsys2 namespace nvme2n1
Error opening /etc/nvme/config.json, No such file or directory
kernel supports: instance cntlid transport traddr trsvcid nqn queue_size nr_io_queues reconnect_delay ctrl_loss_tmo keep_alive_tmo hostnqn host_traddr host_iface hostid duplicate_connect disable_sqflow hdr_digest data_digest nr_write_queues nr_poll_queues tos fast_io_fail_tmo discovery 
option "keyring" ignored
connect ctrl, 'nqn=nqn.2014-08.org.nvmexpress.discovery,transport=rdma,traddr=192.168.110.9,trsvcid=4421,hostnqn=nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95,hostid=990e5a47-9734-32f6-9ee6-3497f631db95,ctrl_loss_tmo=600'
connect ctrl, response 'instance=3,cntlid=4'
nvme3: nqn.2014-08.org.nvmexpress.discovery connected
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys3/nvme3
opcode       : 06
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00001000
metadata_len : 00000000
addr         : 7ffcb771cad0
metadata     : 0
cdw10        : 00000001
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 88 us
nvme3: get header (try 0/10)
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000014
metadata_len : 00000000
addr         : 5646f57a7000
metadata     : 0
cdw10        : 00040070
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 80 us
could not alloc 3977915852761149491 discovery log records
failed to get discovery log: Cannot allocate memory
nvme3: nqn.2014-08.org.nvmexpress.discovery disconnected
igaw commented 7 months ago

ahh maybe it's a buffer alignment problem:

addr : 7ffd78a76b90

hreinecke commented 7 months ago

That really looks like a buffer alignment issue. RDMA is working on pages (that's the 'dma' bit in RDMA ...), so I would expect it to require the buffer to be page aligned. Hmm.

igaw commented 7 months ago

Can you try again after updating nvme-cli?

davidrohr commented 7 months ago

I tried with the latest nvme-cli. Attached is a log with several attempts. Most of them fail with the protocol error. But I also still got the memory allocation error once. And once it also failed with a different error which I have never seen before.

log.gz

igaw commented 7 months ago

The first identify command returns

result : 43ee43ee

This is a really funky value. The rest seems to be okay. I don't have a lot experience with RDMA. Are you sure the host and the target can 'talk' to each other correctly? Is there a way to check this?

davidrohr commented 7 months ago

Yes, as I wrote in the original post, I can mount one of the 3 disks via rdma, and that disk works normally, and I also get descent 1gb/s transfer rate with that disk.

For the other 2 disks, mounting them fails if I connect via rdma, but I can also read them correctly via dd.

igaw commented 7 months ago

I'm hitting a bug which looks very similar as hit. I get it when running a blktests test case with RDMA transport but not with the others transports. This seems to be a kernel thing. Maybe I can shed some light on it with the reproducer in my test bench.

igaw commented 7 months ago

Until now the funky result values have not correctly propagated. This is way I seem these now as well. Still searching were we do not set the result field correctly.

davidrohr commented 7 months ago

Are there any updates on this, can I do anything to help debugging?

igaw commented 7 months ago

I've got a kernel patch for rdma but haven't found time to post it yet. If you could test this, it would really help.

From 9f2480e883d12541289fe6187057991c42b4c589 Mon Sep 17 00:00:00 2001
From: Daniel Wagner <dwagner@suse.de>
Date: Fri, 12 Apr 2024 14:10:02 +0200
Subject: [PATCH] nvmet: always initialize cqe.result

The spec doesn't mandate that the first two double words (aka results)
for the command queue entry need to be set to 0 when they are not
used (not specified). Though, the target implemention returns for all
transports but RDMA always 0 when a command is not using it.

Let's make RDMA behave the same by always explicitly initializing it.
This also prevents leaking any stale memory data.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/core.c             | 1 +
 drivers/nvme/target/fabrics-cmd-auth.c | 3 ---
 drivers/nvme/target/fabrics-cmd.c      | 6 ------
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 20b138ed571f..3c43d9612947 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -946,6 +946,7 @@ bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq,
    req->metadata_sg_cnt = 0;
    req->transfer_len = 0;
    req->metadata_len = 0;
+   req->cqe->result.u64 = 0;
    req->cqe->status = 0;
    req->cqe->sq_head = 0;
    req->ns = NULL;
diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c
index eb7785be0ca7..ee76491e8b12 100644
--- a/drivers/nvme/target/fabrics-cmd-auth.c
+++ b/drivers/nvme/target/fabrics-cmd-auth.c
@@ -332,7 +332,6 @@ void nvmet_execute_auth_send(struct nvmet_req *req)
        pr_debug("%s: ctrl %d qid %d nvme status %x error loc %d\n",
             __func__, ctrl->cntlid, req->sq->qid,
             status, req->error_loc);
-   req->cqe->result.u64 = 0;
    if (req->sq->dhchap_step != NVME_AUTH_DHCHAP_MESSAGE_SUCCESS2 &&
        req->sq->dhchap_step != NVME_AUTH_DHCHAP_MESSAGE_FAILURE2) {
        unsigned long auth_expire_secs = ctrl->kato ? ctrl->kato : 120;
@@ -515,8 +514,6 @@ void nvmet_execute_auth_receive(struct nvmet_req *req)
    status = nvmet_copy_to_sgl(req, 0, d, al);
    kfree(d);
 done:
-   req->cqe->result.u64 = 0;
-
    if (req->sq->dhchap_step == NVME_AUTH_DHCHAP_MESSAGE_SUCCESS2)
        nvmet_auth_sq_free(req->sq);
    else if (req->sq->dhchap_step == NVME_AUTH_DHCHAP_MESSAGE_FAILURE1) {
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index b23f4cf840bd..f6714453b8bb 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -226,9 +226,6 @@ static void nvmet_execute_admin_connect(struct nvmet_req *req)
    if (status)
        goto out;

-   /* zero out initial completion result, assign values as needed */
-   req->cqe->result.u32 = 0;
-
    if (c->recfmt != 0) {
        pr_warn("invalid connect version (%d).\n",
            le16_to_cpu(c->recfmt));
@@ -304,9 +301,6 @@ static void nvmet_execute_io_connect(struct nvmet_req *req)
    if (status)
        goto out;

-   /* zero out initial completion result, assign values as needed */
-   req->cqe->result.u32 = 0;
-
    if (c->recfmt != 0) {
        pr_warn("invalid connect version (%d).\n",
            le16_to_cpu(c->recfmt));
-- 
2.44.0
davidrohr commented 7 months ago

I just tested with your patch (applied to 6.8 kernel / gentoo linux kernel 6.8.7), and used libvnme and nvme-cli from git with master of today. It still doesn't work and fails in the same way before. Here are the logs:

qon ~ # nvme discover -t rdma -a 192.168.110.9 -s 4421 -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 path nvme0c0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 path nvme1c1n1
scan controller nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme2
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme2
scan controller nvme2 path nvme2c2n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys0 namespace nvme0n1
scan subsystem nvme-subsys1
scan subsystem nvme-subsys1 namespace nvme1n1
scan subsystem nvme-subsys2
scan subsystem nvme-subsys2 namespace nvme2n1
Error opening /etc/nvme/config.json, No such file or directory
kernel supports: instance cntlid transport traddr trsvcid nqn queue_size nr_io_queues reconnect_delay ctrl_loss_tmo keep_alive_tmo hostnqn host_traddr host_iface hostid duplicate_connect disable_sqflow hdr_digest data_digest nr_write_queues nr_poll_queues tos fast_io_fail_tmo discovery 
option "keyring" ignored
connect ctrl, 'nqn=nqn.2014-08.org.nvmexpress.discovery,transport=rdma,traddr=192.168.110.9,trsvcid=4421,hostnqn=nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95,hostid=990e5a47-9734-32f6-9ee6-3497f631db95,ctrl_loss_tmo=600'
connect ctrl, response 'instance=3,cntlid=4'
nvme3: nqn.2014-08.org.nvmexpress.discovery connected
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys2/nvme3
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys3/nvme3
opcode       : 06
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00001000
metadata_len : 00000000
addr         : 55d46e2aa000
metadata     : 0
cdw10        : 00000001
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 70 us
nvme3: get header (try 0/10)
opcode       : 02
flags        : 00
rsvd1        : 0000
nsid         : 00000000
cdw2         : 00000000
cdw3         : 00000000
data_len     : 00000014
metadata_len : 00000000
addr         : 55d46e2aa000
metadata     : 0
cdw10        : 00040070
cdw11        : 00000000
cdw12        : 00000000
cdw13        : 00000000
cdw14        : 00000000
cdw15        : 00000000
timeout_ms   : 00000000
result       : 00000000
err          : 0
latency      : 76 us
could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory
nvme3: nqn.2014-08.org.nvmexpress.discovery disconnected
qon ~ # nvme connect -n nqn.2014-08.org.nvmexpress.discovery -t rdma -a 192.168.110.9 -s 4421
connecting to device: nvme3
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 8192
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
02f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
03f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0400: 01 01 03 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
0410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0420: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
0430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
04f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0500: 6e 71 6e 2e 32 30 31 34 2d 30 38 2e 6f 72 67 2e "nqn.2014-08.org."
0510: 6e 76 6d 65 78 70 72 65 73 73 2e 64 69 73 63 6f "nvmexpress.disco"
0520: 76 65 72 79 00 00 00 00 00 00 00 00 00 00 00 00 "very............"
0530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
05f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0600: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
0610: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0630: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0640: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0650: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0660: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0670: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0690: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
06f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0700: 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0720: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
07f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0800: 01 01 02 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
0810: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0820: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
0830: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0890: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
08f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0900: 71 6f 6e 31 00 00 00 00 00 00 00 00 00 00 00 00 "qon1............"
0910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
09f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a00: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
0a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b00: 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c00: 01 01 02 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
0c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c20: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
0c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ce0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0cf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d00: 71 6f 6e 32 00 00 00 00 00 00 00 00 00 00 00 00 "qon2............"
0d10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e00: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
0e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0e90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f00: 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
0ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1000: 01 01 02 04 02 00 ff ff 20 00 00 00 00 00 00 00 "................"
1010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1020: 34 34 32 31 00 00 00 00 00 00 00 00 00 00 00 00 "4421............"
1030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
10f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1100: 71 6f 6e 33 00 00 00 00 00 00 00 00 00 00 00 00 "qon3............"
1110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
11f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1200: 31 39 32 2e 31 36 38 2e 31 31 30 2e 39 00 00 00 "192.168.110.9..."
1210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
1240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "................"
igaw commented 7 months ago

Alright, so the results garbage is not really making the host confused. Slowly running out of idea. The only difference between what libnvme is doing and the shell command is the transfer length. Could you limit it to 20 bytes and see if this changes anything?

# nvme get-log /dev/nvme3 -i 0x70 -l 20
davidrohr commented 7 months ago

I executed the command a couple of times. Most times the result seems correct, but once apparently I got garbage:

qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 00 00 00 00 35 66 36 65 38 30 33 62 33 39 32 64 "....5f6e803b392d"
0010: 39 37 36 37                                      "9767"
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
qon ~ # nvme get-log /dev/nvme3 -i 0x70 -l 20
Device:nvme3 log-id:112 namespace-id:0xffffffff
       0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0000: 09 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 "................"
0010: 00 00 00 00                                      "...."
igaw commented 7 months ago

Either the transport is having troubles or the target implementation is at fault. I think it's more likely that the target is having some sort of race and libnvme is able to hit it very reliable.

libnvme is sending two commands back to back and the second gets corrupted data. I suspect from your last test if we would wait a bit between the commands the second command could succeed.

Maybe we see something on the ftrace (on the target):

 # cd /sys/kernel/debug/tracing/
 # echo 1 > events/nvmet/enable
 # echo 1 > tracing_on

[wait for the failure]

 # echo 0 > tracing_on
 # cat trace > ~/nvmet-trace.txt

and also could you enable the debug logs on the target side?

 # echo 8 > /proc/sys/kernel/printk
 # echo 'file drivers/nvme/target/*.c +p' > /sys/kernel/debug/dynamic_debug/control
davidrohr commented 7 months ago

Sure, will do so. But I'll be traveling the next days, so only have access to the system in ~1 week.

igaw commented 5 months ago

Still interested to get this working? In the meantime the kernel patch has been accepted. So the random results values should be gone. But unfortunately, this was the cause of the problem. Still need the logs from above.

davidrohr commented 5 months ago

Sure, I'd like to have this fixed. I tried the trace command, but didn't work. I assume since something was not enabled in the kernel. Need to check. But then I was again traveling the last 4 weeks and couldn't work on this. Sorry for the slow process. I'll report back when I have something, and will try with latest kernel / sw when I am back.

igaw commented 5 months ago

Okay, no worries, I am busy too.

I've double checked the commands and on my system they work. So it really sounds like your missing the kernel config options. Are you running a distro kernel? In any case the CONFIG_FTRACE and CONFIG_DYNAMIC_DEBUG options would be necessary.

davidrohr commented 3 months ago

I finally found some time to get the traces with the CONFIG_DYNAMIC_DEBUG trace file from the target is attached, taken during the nvme discover ... operations, which fails. For reference, here is the dmesg during this time:

[Mon Aug 12 20:55:16 2024] nvmet_rdma: connect request (4): status 0 id 00000000ff1fce59
[Mon Aug 12 20:55:16 2024] nvmet_rdma: added mlx4_0.
[Mon Aug 12 20:55:16 2024] nvmet_rdma: nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 509 cm_id= 00000000ff1fce59
[Mon Aug 12 20:55:16 2024] nvmet_rdma: established (9): status 0 id 00000000ff1fce59
[Mon Aug 12 20:55:16 2024] nvmet: ctrl 1 start keep-alive timer for 120 secs
[Mon Aug 12 20:55:16 2024] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:990e5a47-9734-32f6-9ee6-3497f631db95.
[Mon Aug 12 20:55:17 2024] nvmet_rdma: disconnected (10): status 0 id 00000000ff1fce59
[Mon Aug 12 20:55:17 2024] nvmet_rdma: cm_id= 00000000ff1fce59 queue->state= 1
[Mon Aug 12 20:55:17 2024] nvmet_rdma: freeing queue 0
[Mon Aug 12 20:55:17 2024] nvmet: ctrl 1 stop keep-alive

trace.gz

igaw commented 3 months ago

So the disconnected (10): status 0 id 00000000ff1fce59 is from nvmet_rdma_cm_handler function. This really looks like that the RDMA driver is unhappy with something. I can't really tell, I don't have access to the spec. I suggest you bring this up on the nvme/rdma mailing list. Sorry about that.

davidrohr commented 1 month ago

Just for reference, I had the chance to test with ConnectX6 NICs instead of ConnectX3, and I got the very same problem. Otherwise, didn't have time to investigate further / report.

davidrohr commented 2 weeks ago

Some more info for reference: As I said, I can do nvme connect despite the nvme discover failure, and then I see the disk as block device. I tried reading from the block device with dd with various block sizes, and every ready returns the correct result (I was thinking perhaps it depends on the read size, and small reads as for the discover might fail).

I have also ran ib_write_bw, ib_read_bw and ib_send_bw benchmarks with all transfer sizes with both systems both as client and server, all worked nicely.

davidrohr commented 2 weeks ago

What I am wondering as a side note: Didn't you say the error propagation should be fixed now? Since for me, when it fails the nvme discover still reports

could not alloc memory for discovery log page
failed to get discovery log: Cannot allocate memory

Shouldn't it produce a proper error message meanwhile? Just tried with nvme-cli and libnvme master of today.

igaw commented 2 weeks ago

I had to revert those error propagation changes. I understood it wrongly how the error path works. Those changes broke a bunch of stuff.

I have to look into this once again, maybe this time I see something. There must be something be very weird about the get log page command if connect just works.

Maybe some other admin commands would fail too, e.g. nvme id-ctrl or nvme id-ns.

One thing which could be wrong with the get-log command is that it doesn't set the namespace id to 0. To really match with what the discover code in libnvme does, would be

nvme get-log /dev/nvme3 -i 0x70 -l 20 -n 0 -vv

But again, we know that the discover command setup by nvme-cli also fails and sees garbage in the return buffer.

I wonder if any of the other admin commands also fail? This could point to the admin handling code in nvmet. Could you try to trigger an error with nvme id-ctrl /dev/nvme3, e.g. while nvme id-ctrl /dev/nvme3 ; do true ; done

igaw commented 2 weeks ago

Also can you do a test if nvme get-log /dev/nvme3 -i 0x70 -l 4096 -n 0 -vv ever returns garbage? Trying to figure out if the SGL inlining doesn't always work.