ceph / ceph-nvmeof

Service to provide Ceph storage over NVMe-oF/TCP protocol
GNU Lesser General Public License v3.0

NVMe `connect-all` command fails to connect to a specific subsystem. #552

Open sunilkumarn417 opened 7 months ago

sunilkumarn417 commented 7 months ago

The NVMe connect-all command fails when a specific subsystem NQN is passed with the --nqn option. Currently, a user who wants to connect to all listeners of a single subsystem can only use nvme connect, which has to be run once for every listener.

[root@ceph-sunilkumar-00-w2h431-node8 cephuser]# nvme connect-all --nqn nqn.2016-06.io.spdk:cnode2 -t tcp -a 10.0.209.24 -s 8009 -l 3600
Failed to write to /dev/nvme-fabrics: Invalid argument
failed to add controller, error invalid arguments/configuration

dmesg log
----------
[587096.953415] nvme nvme0: Subsystem nqn.2016-06.io.spdk:cnode2 is not a discovery controller

[root@ceph-sunilkumar-00-w2h431-node8 cephuser]# nvme connect-all  -t tcp -a 10.0.209.24 -s 8009 -l 3600
[root@ceph-sunilkumar-00-w2h431-node8 cephuser]# echo $?
0

dmesg log
------------
[587271.003842] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.209.24:8009
[587271.027795] nvme nvme1: creating 7 I/O queues.
[587271.095729] nvme nvme1: mapped 7/0/0 default/read/poll queues.
[587271.099494] nvme nvme1: new ctrl: NQN "nqn.2016-06.io.spdk:cnode1", addr 10.0.209.24:5001
[587271.115125] nvme nvme2: creating 7 I/O queues.
[587271.183748] nvme nvme2: mapped 7/0/0 default/read/poll queues.
[587271.186883] nvme nvme2: new ctrl: NQN "nqn.2016-06.io.spdk:cnode1", addr 10.0.210.119:5001
[587271.196593] nvme nvme3: creating 7 I/O queues.
[587271.265173] nvme nvme3: mapped 7/0/0 default/read/poll queues.
[587271.268088] nvme nvme3: new ctrl: NQN "nqn.2016-06.io.spdk:cnode2", addr 10.0.209.24:5001
[587271.275324] nvme nvme4: creating 7 I/O queues.
[587271.343465] nvme nvme4: mapped 7/0/0 default/read/poll queues.
[587271.347535] nvme nvme4: new ctrl: NQN "nqn.2016-06.io.spdk:cnode2", addr 10.0.210.119:5001
[587271.348133] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

gbregman commented 6 months ago

@sunilkumarn417 are you sure that the nqn parameter in connect-all does what you think it does? Looking at the source code of the nvme CLI, it seems that the nqn parameter is the NQN of a discovery controller, not of an I/O one. Since in our system the discovery controllers' NQN is predefined and identical for all of them, the parameter doesn't seem to offer any advantage. Also notice that the nqn parameter for connect-all is new; it's not found in version 1.x of the nvme CLI.

sunilkumarn417 commented 6 months ago

Thanks @gbregman. I was thinking of a customer use case: connecting to multiple subsystems from multiple clients with a single command on each node, rather than running the connect command for every listener endpoint. It looks like connect is then the only option for users who want to connect to one subsystem on all of its listener endpoints, at the cost of multiple CLI calls.
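
Until something like that exists, the per-listener connect calls can at least be generated from the discovery log instead of typed by hand. Below is a minimal sketch of that idea, not something the gateway or nvme-cli provides. It assumes the JSON printed by `nvme discover -t tcp -a 10.0.209.24 -s 8009 -o json` has a top-level `records` list whose entries carry `trtype`, `traddr`, `trsvcid` and `subnqn`, so check the field names against the nvme-cli version in use. The helper name `connect_commands` and the sample records (built from the addresses in the dmesg log above) are illustrative only.

```python
import json
import shlex
from typing import List, Optional


def connect_commands(discovery_json: str,
                     subsystem_nqn: str,
                     ctrl_loss_tmo: Optional[int] = None) -> List[str]:
    """Return one `nvme connect` command line per listener of subsystem_nqn.

    discovery_json is assumed to be the output of `nvme discover ... -o json`.
    """
    log = json.loads(discovery_json)
    commands = []
    for record in log.get("records", []):
        # Skip records for other subsystems (and for the discovery service).
        if record.get("subnqn") != subsystem_nqn:
            continue
        args = [
            "nvme", "connect",
            "-t", record["trtype"],
            "-n", record["subnqn"],
            "-a", record["traddr"],
            "-s", str(record["trsvcid"]),
        ]
        if ctrl_loss_tmo is not None:
            args += ["-l", str(ctrl_loss_tmo)]
        commands.append(shlex.join(args))
    return commands


# Illustrative discovery log, shaped after the listeners seen in dmesg above.
SAMPLE = json.dumps({
    "genctr": 4,
    "records": [
        {"trtype": "tcp", "subnqn": "nqn.2016-06.io.spdk:cnode1",
         "traddr": "10.0.209.24", "trsvcid": "5001"},
        {"trtype": "tcp", "subnqn": "nqn.2016-06.io.spdk:cnode1",
         "traddr": "10.0.210.119", "trsvcid": "5001"},
        {"trtype": "tcp", "subnqn": "nqn.2016-06.io.spdk:cnode2",
         "traddr": "10.0.209.24", "trsvcid": "5001"},
        {"trtype": "tcp", "subnqn": "nqn.2016-06.io.spdk:cnode2",
         "traddr": "10.0.210.119", "trsvcid": "5001"},
    ],
})

if __name__ == "__main__":
    for command in connect_commands(SAMPLE, "nqn.2016-06.io.spdk:cnode2", 3600):
        print(command)
```

The function only builds the command lines. Running them (for example through `subprocess.run(shlex.split(command), check=True)`) still needs root on the client node, and it is still one connect call per listener; the script just removes the manual bookkeeping.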