Open sunilkumarn417 opened 6 months ago
```
quay.io/barakda1/ceph:47ea673ae9ebf51b2ebc505093bd7272422045e4
quay.io/barakda1/nvmeof:8677ba3
quay.io/barakda1/nvmeof-cli:8677ba3
```
@sunilkumarn417 can you please add the output of `host list -n`, for both subsystems.
```
[root@ceph-1sunilkumar-z1afhw-node6 cephuser]# podman run --rm -it quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.208.213 --server-port 5500 host list -n nqn.2016-06.io.spdk:sub2
Hosts allowed to access nqn.2016-06.io.spdk:sub2:
╒════════════╕
│ Host NQN   │
╞════════════╡
│ Any host   │
╘════════════╛
[root@ceph-1sunilkumar-z1afhw-node6 cephuser]# podman run --rm -it quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.208.213 --server-port 5500 host list -n nqn.2016-06.io.spdk:sub1
Hosts allowed to access nqn.2016-06.io.spdk:sub1:
╒════════════╕
│ Host NQN   │
╞════════════╡
│ Any host   │
╘════════════╛
```
This happens because, for some reason, 10.0.208.213 is optimized only on grp 3, and 10.0.209.68 is inaccessible on all groups. At least that's what I see when I log into the systems now. The 2 namespaces that belong to sub1 are on grp 1, so they're currently not optimized on any listener, which is why we cannot see them. We need to understand how it got to the situation where grp 1 is not optimized on any gw. I suspect it might be related to a known issue we have, that we don't reassign the same grp id to the same gw after removing a gw. Not sure. @sunilkumarn417, can you describe the steps you did to cause failover? Did you also run any cephadm command, or any other command, to remove/add a gw? Also, can you enable logging the mon logs to file on this setup?
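The reasoning above can be checked mechanically: a namespace is reachable only if its ANA (load balancing) group is `optimized` on at least one listener. A minimal sketch that scans `nvmf_subsystem_get_listeners`-style JSON; the sample data mirrors the state described above (grp 3 optimized on 10.0.208.213, everything inaccessible on 10.0.209.68), and the helper is illustrative, not part of any Ceph/SPDK tooling:

```python
import json

# Trimmed sample of `rpc.py nvmf_subsystem_get_listeners` output, mirroring
# the states reported in this thread (values copied from the discussion;
# only groups 1 and 3 are shown for brevity).
LISTENERS = json.loads("""
[
  {"address": {"traddr": "10.0.208.213", "trsvcid": "4420"},
   "ana_states": [{"ana_group": 1, "ana_state": "inaccessible"},
                  {"ana_group": 3, "ana_state": "optimized"}]},
  {"address": {"traddr": "10.0.209.68", "trsvcid": "4420"},
   "ana_states": [{"ana_group": 1, "ana_state": "inaccessible"},
                  {"ana_group": 3, "ana_state": "inaccessible"}]}
]
""")

def optimized_groups(listeners):
    """Return the ANA group ids that are optimized on at least one listener."""
    return {state["ana_group"]
            for listener in listeners
            for state in listener["ana_states"]
            if state["ana_state"] == "optimized"}

print(optimized_groups(LISTENERS))       # {3}
print(1 in optimized_groups(LISTENERS))  # False: sub1 (grp 1) is unreachable
```

Running this against real dumps from both gateways would show immediately that no listener is optimized for grp 1, matching the missing sub1 namespaces.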
@caroav These are the steps I followed:
Ceph Nodes Inventory

```
10.0.210.141  ceph-1sunilkumar-z1afhw-node1-installer  - MON, MGR
10.0.211.144  ceph-1sunilkumar-z1afhw-node2            - MON, MGR
10.0.208.216  ceph-1sunilkumar-z1afhw-node3            - MON, OSD Node
10.0.211.89   ceph-1sunilkumar-z1afhw-node4            - OSD Node
10.0.211.212  ceph-1sunilkumar-z1afhw-node5            - OSD Node
10.0.208.213  ceph-1sunilkumar-z1afhw-node6            - NVMeoF GW
10.0.209.68   ceph-1sunilkumar-z1afhw-node7            - NVMeoF GW
10.0.210.4    ceph-1sunilkumar-z1afhw-node8            - Client
10.0.208.67   ceph-1sunilkumar-z1afhw-node9
```
1. NVMeoF gateways deployed on node6 and node7.
2. Triggered failover with `ceph orch daemon rm nvmeofgw.node6` (daemon removal).
3. Subsystems `nqn.2016-06.io.spdk:sub1` and `nqn.2016-06.io.spdk:sub2` are configured with `host *` (any host allowed).
4. `sub1_image1` and `sub1_image2` are attached with load balancing group Id 1 under Subsystem1; `sub2_image1` and `sub2_image2` are attached with load balancing group Id 3 under Subsystem2.
5. Ran `nvme connect-all` on the client and noticed only images from sub2 are visible.
6. Able to hit the issue again.
I was able to reproduce this issue in another cluster that I created as well.
GW1
```
[root@ceph-rbd1-mytest-rxmvqg-node4 ~]# podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.210.179 --server-port 5500 gw info
CLI's version: 1.0.0
Gateway's version: 1.0.0
Gateway's name: client.nvmeof.nvmeof.ceph-rbd1-mytest-rxmvqg-node4.bxkxze
Gateway's load balancing group: 2
Gateway's address: 10.0.210.179
Gateway's port: 5500
SPDK version: 23.01.1
```
```
[root@ceph-rbd1-mytest-rxmvqg-node4 src]# /usr/libexec/spdk/scripts/rpc.py nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.210.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
```
```
[root@ceph-rbd1-mytest-rxmvqg-node4 src]# /usr/libexec/spdk/scripts/rpc.py nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.210.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
```
GW2
```
[root@ceph-rbd1-mytest-rxmvqg-node5 ~]# podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.208.28 --server-port 5500 gw info
CLI's version: 1.0.0
Gateway's version: 1.0.0
Gateway's name: client.nvmeof.nvmeof.ceph-rbd1-mytest-rxmvqg-node5.yovvcu
Gateway's load balancing group: 1
Gateway's address: 10.0.208.28
Gateway's port: 5500
SPDK version: 23.01.1
```
```
[root@ceph-rbd1-mytest-rxmvqg-node5 src]# /usr/libexec/spdk/scripts/rpc.py nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.28",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
```
```
[root@ceph-rbd1-mytest-rxmvqg-node5 src]# /usr/libexec/spdk/scripts/rpc.py nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.28",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
```
On client:
```
[root@ceph-rbd1-mytest-rxmvqg-node6 ~]# nvme list-subsys
nvme-subsys3 - NQN=nqn.2016-06.io.spdk:cnode2
\
 +- nvme3 tcp traddr=10.0.210.179,trsvcid=4420,src_addr=10.0.208.169 live
 +- nvme4 tcp traddr=10.0.208.28,trsvcid=4420,src_addr=10.0.208.169 live
nvme-subsys1 - NQN=nqn.2016-06.io.spdk:cnode1
\
 +- nvme2 tcp traddr=10.0.208.28,trsvcid=4420,src_addr=10.0.208.169 live
 +- nvme1 tcp traddr=10.0.210.179,trsvcid=4420,src_addr=10.0.208.169 live
[root@ceph-rbd1-mytest-rxmvqg-node6 ~]# nvme list-subsys /dev/nvme3n1
nvme-subsys3 - NQN=nqn.2016-06.io.spdk:cnode2
\
 +- nvme3 tcp traddr=10.0.210.179,trsvcid=4420,src_addr=10.0.208.169 live inaccessible
 +- nvme4 tcp traddr=10.0.208.28,trsvcid=4420,src_addr=10.0.208.169 live optimized
[root@ceph-rbd1-mytest-rxmvqg-node6 ~]# nvme list-subsys /dev/nvme1n1
nvme-subsys1 - NQN=nqn.2016-06.io.spdk:cnode1
\
 +- nvme2 tcp traddr=10.0.208.28,trsvcid=4420,src_addr=10.0.208.169 live
 +- nvme1 tcp traddr=10.0.210.179,trsvcid=4420,src_addr=10.0.208.169 live
```
```
[root@ceph-rbd1-mytest-rxmvqg-node6 ~]# nvme list
Node          Generic     SN  Model                 Namespace  Usage                  Format       FW Rev
/dev/nvme3n5  /dev/ng3n5  2   Ceph bdev Controller  0x5        536.87 GB / 536.87 GB  512 B + 0 B  23.01.1
/dev/nvme3n4  /dev/ng3n4  2   Ceph bdev Controller  0x4        536.87 GB / 536.87 GB  512 B + 0 B  23.01.1
/dev/nvme3n3  /dev/ng3n3  2   Ceph bdev Controller  0x3        536.87 GB / 536.87 GB  512 B + 0 B  23.01.1
/dev/nvme3n2  /dev/ng3n2  2   Ceph bdev Controller  0x2        536.87 GB / 536.87 GB  512 B + 0 B  23.01.1
/dev/nvme3n1  /dev/ng3n1  2   Ceph bdev Controller  0x1        536.87 GB / 536.87 GB  512 B + 0 B  23.01.1
```
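Reading the two gateways' listener dumps together: a namespace can appear in `nvme list` only when its load balancing group is optimized on at least one path. A small sketch of that cross-check (states transcribed from the `rpc.py` outputs above, where both cnode1 and cnode2 report identical states per gateway; the helper name is my own):

```python
# ANA state per gateway and group, transcribed from the two rpc.py dumps:
# GW1 (10.0.210.179) reports every group inaccessible, GW2 (10.0.208.28)
# reports only group 1 optimized.
gw_ana_states = {
    "10.0.210.179": {1: "inaccessible", 2: "inaccessible",
                     3: "inaccessible", 4: "inaccessible"},
    "10.0.208.28":  {1: "optimized",    2: "inaccessible",
                     3: "inaccessible", 4: "inaccessible"},
}

def reachable_groups(states):
    """An ANA group is reachable iff some gateway reports it optimized."""
    return {group
            for per_gw in states.values()
            for group, state in per_gw.items()
            if state == "optimized"}

# Only group 1 has an optimized path anywhere; namespaces pinned to any
# other group cannot show up on the client.
print(reachable_groups(gw_ana_states))  # {1}
```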
The issue is that the nvmeof monitor DB has zombie gws. It is known and is being taken care of. For now, the only way to avoid this issue is:
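The zombie-gw state can be illustrated with a toy model (illustrative only, not the actual monitor logic; names invented): if a removed gateway's group id is never reassigned, a re-added gateway gets a fresh id and the old group is left with no owner, so no listener ever reports it optimized:

```python
# Toy model only: group ids are handed out monotonically and never reused,
# so removing a gateway leaves an orphaned ("zombie") group entry behind.
class ToyMonitorDB:
    def __init__(self):
        self.next_id = 1
        self.group_owner = {}            # group id -> gateway name or None

    def add_gw(self, name):
        gid = self.next_id
        self.next_id += 1
        self.group_owner[gid] = name     # always a fresh id, old ids not recycled
        return gid

    def rm_gw(self, name):
        for gid, owner in self.group_owner.items():
            if owner == name:
                self.group_owner[gid] = None   # entry lingers with no owner

db = ToyMonitorDB()
db.add_gw("gw-node6")                    # gets group 1
db.add_gw("gw-node7")                    # gets group 2
db.rm_gw("gw-node6")
db.add_gw("gw-node6")                    # re-added: gets group 3, not group 1
zombies = [g for g, owner in db.group_owner.items() if owner is None]
print(zombies)   # [1] -> namespaces pinned to grp 1 are optimized nowhere
```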
Unable to list namespaces hosted from one of the subsystems which is directly associated to a Gateway (say GW1) via load-balancing-group id (say 1).
At Client Side
As we can notice below, the namespaces from subsystem1 are not connected.
ANA States from both Gateways