ceph / ceph-nvmeof

Service to provide Ceph storage over NVMe-oF/TCP protocol
GNU Lesser General Public License v3.0

Redeploying nvme daemon makes the ana-group for the corresponding gateway inaccessible #516

Closed · manasagowri closed this 5 months ago

manasagowri commented 6 months ago

Redeploying the nvmeof daemon makes the ANA group for the corresponding gateway inaccessible, and the corresponding NVMe disks disappear from the client.

Before failover

[root@ceph-rbd2-mytest-bgzmwr-node4 ~]# podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.103.241 --server-port 5500 namespace get_io_stats -n nqn.2016-06.io.spdk:cnode1 --nsid 1
IO statistics for namespace 1 in nqn.2016-06.io.spdk:cnode1, bdev bdev_fef1765d-e570-4149-b218-37c5542c0a7b:
╒═════════════════════════╤═════════════════╕
│ Stat                    │ Value           │
╞═════════════════════════╪═════════════════╡
│ Tick Rate               │ 2190000000      │
├─────────────────────────┼─────────────────┤
│ Ticks                   │ 348797824638231 │
├─────────────────────────┼─────────────────┤
│ Bytes Read              │ 85156864        │
├─────────────────────────┼─────────────────┤
│ Num Read Ops            │ 1519            │
├─────────────────────────┼─────────────────┤
│ Bytes Written           │ 3951312896      │
├─────────────────────────┼─────────────────┤
│ Num Write Ops           │ 33541           │
├─────────────────────────┼─────────────────┤
│ Bytes Unmapped          │ 0               │
├─────────────────────────┼─────────────────┤
│ Num Unmap Ops           │ 0               │
├─────────────────────────┼─────────────────┤
│ Read Latency Ticks      │ 39579511032     │
├─────────────────────────┼─────────────────┤
│ Max Read Latency Ticks  │ 545029982       │
├─────────────────────────┼─────────────────┤
│ Min Read Latency Ticks  │ 394656          │
├─────────────────────────┼─────────────────┤
│ Write Latency Ticks     │ 956735127508    │
├─────────────────────────┼─────────────────┤
│ Max Write Latency Ticks │ 287864370       │
├─────────────────────────┼─────────────────┤
│ Min Write Latency Ticks │ 38962           │
├─────────────────────────┼─────────────────┤
│ Unmap Latency Ticks     │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Unmap Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Unmap Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Copy Latency Ticks      │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Copy Latency Ticks  │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Copy Latency Ticks  │ 0               │
├─────────────────────────┼─────────────────┤
│ IO Error                │ []              │
╘═════════════════════════╧═════════════════╛
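As an aside, these tick counters convert to wall-clock latency as latency_ticks / num_ops / tick_rate. A minimal sketch with the values from the table above:

awk 'BEGIN {
  printf "avg read latency:  %.1f ms\n", 39579511032 / 1519 / 2190000000 * 1000
  printf "avg write latency: %.1f ms\n", 956735127508 / 33541 / 2190000000 * 1000
}'

which works out to roughly 11.9 ms per read and 13.0 ms per write before the redeploy.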

GW Info

[root@ceph-rbd2-mytest-bgzmwr-node4 ~]# podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.103.241 --server-port 5500 gw info
CLI's version: 1.0.0
Gateway's version: 1.0.0
Gateway's name: client.nvmeof.nvmeof.ceph-rbd2-mytest-bgzmwr-node4.cuatjv
Gateway's load balancing group: 2
Gateway's address: 10.0.103.241
Gateway's port: 5500
SPDK version: 23.01.1
[root@ceph-rbd2-mytest-bgzmwr-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.103.241",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "optimized"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-rbd2-mytest-bgzmwr-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.103.241",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "optimized"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

Redeployed GW1's nvmeof daemon using ceph orch daemon redeploy; IOs stop executing even after the daemon is back up.

[root@ceph-rbd2-mytest-bgzmwr-node6 ~]# ceph orch daemon redeploy nvmeof.nvmeof.ceph-rbd2-mytest-bgzmwr-node4.cuatjv
Scheduled to redeploy nvmeof.nvmeof.ceph-rbd2-mytest-bgzmwr-node4.cuatjv on host 'ceph-rbd2-mytest-bgzmwr-node4'

[root@ceph-rbd2-mytest-bgzmwr-node6 ~]# ceph orch ps --daemon-type nvmeof
NAME                                                HOST                           PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID  
nvmeof.nvmeof.ceph-rbd2-mytest-bgzmwr-node4.cuatjv  ceph-rbd2-mytest-bgzmwr-node4  *:5500,4420,8009  running (48s)    45s ago   2h    52.7M        -  1.0.0    a647a0311a69  d18b3036f3f1  
nvmeof.nvmeof.ceph-rbd2-mytest-bgzmwr-node5.gqqfhf  ceph-rbd2-mytest-bgzmwr-node5  *:5500,4420,8009  running (2h)     10m ago   2h     143M        -  1.0.0    a647a0311a69  1878bdf5d0f2
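To tell when the redeployed daemon is back, one option is to poll ceph orch ps (a rough sketch; adjust the grep pattern to the daemon name):

until ceph orch ps --daemon-type nvmeof | grep -q 'node4.*running'; do sleep 5; done; echo 'nvmeof daemon on node4 is running again'

Note that "running" in orch ps only means the container is up; as shown below, IO does not resume.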

Disks served by GW1 disappear from nvme list on the client.

[root@ceph-rbd2-mytest-bgzmwr-node6 ~]# nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme3n5          /dev/ng3n5            2                    Ceph bdev Controller                     0x5        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme3n4          /dev/ng3n4            2                    Ceph bdev Controller                     0x4        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme3n3          /dev/ng3n3            2                    Ceph bdev Controller                     0x3        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme3n2          /dev/ng3n2            2                    Ceph bdev Controller                     0x2        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme3n1          /dev/ng3n1            2                    Ceph bdev Controller                     0x1        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
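On the client, nvme list-subsys can be used to inspect the remaining controller paths and their state, confirming which gateway's paths were dropped (output format varies across nvme-cli versions):

nvme list-subsys /dev/nvme3n1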

ANA group 2 becomes inaccessible on both GW1 and GW2, so namespaces in GW1's load balancing group (group 2) are left with no optimized path.

GW1

[root@ceph-rbd2-mytest-bgzmwr-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.103.241",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-rbd2-mytest-bgzmwr-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.103.241",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

GW2

[root@ceph-rbd2-mytest-bgzmwr-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.100.215",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-rbd2-mytest-bgzmwr-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.100.215",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

rbd perf image iostat and namespace get_io_stats counters drop to zero on both GWs.

[root@ceph-rbd2-mytest-bgzmwr-node6 ~]# rbd perf image iostat nvme_test
rbd: waiting for initial image stats
NAME      WR    RD   WR_BYTES    RD_BYTES    WR_LAT      RD_LAT 
image_0  0/s   0/s   26 KiB/s   7.2 KiB/s   3.87 ms   579.68 us 
image_2  0/s   0/s   26 KiB/s   7.2 KiB/s   3.35 ms   579.51 us 
image_1  0/s   0/s      0 B/s   7.2 KiB/s   0.00 ns   704.69 us 
image_3  0/s   0/s      0 B/s   7.2 KiB/s   0.00 ns     1.01 ms 
image_4  0/s   0/s      0 B/s   7.2 KiB/s   0.00 ns   689.36 us 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
NAME  WR   RD   WR_BYTES   RD_BYTES   WR_LAT   RD_LAT 
[root@ceph-rbd2-mytest-bgzmwr-node4 ~]# podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.103.241 --server-port 5500 namespace get_io_stats -n nqn.2016-06.io.spdk:cnode1 --nsid 1
IO statistics for namespace 1 in nqn.2016-06.io.spdk:cnode1, bdev bdev_fef1765d-e570-4149-b218-37c5542c0a7b:
╒═════════════════════════╤═════════════════╕
│ Stat                    │ Value           │
╞═════════════════════════╪═════════════════╡
│ Tick Rate               │ 2190000000      │
├─────────────────────────┼─────────────────┤
│ Ticks                   │ 349745906967349 │
├─────────────────────────┼─────────────────┤
│ Bytes Read              │ 36864           │
├─────────────────────────┼─────────────────┤
│ Num Read Ops            │ 2               │
├─────────────────────────┼─────────────────┤
│ Bytes Written           │ 0               │
├─────────────────────────┼─────────────────┤
│ Num Write Ops           │ 0               │
├─────────────────────────┼─────────────────┤
│ Bytes Unmapped          │ 0               │
├─────────────────────────┼─────────────────┤
│ Num Unmap Ops           │ 0               │
├─────────────────────────┼─────────────────┤
│ Read Latency Ticks      │ 12475202        │
├─────────────────────────┼─────────────────┤
│ Max Read Latency Ticks  │ 6445636         │
├─────────────────────────┼─────────────────┤
│ Min Read Latency Ticks  │ 6029566         │
├─────────────────────────┼─────────────────┤
│ Write Latency Ticks     │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Write Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Write Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Unmap Latency Ticks     │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Unmap Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Unmap Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Copy Latency Ticks      │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Copy Latency Ticks  │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Copy Latency Ticks  │ 0               │
├─────────────────────────┼─────────────────┤
│ IO Error                │ []              │
╘═════════════════════════╧═════════════════╛

GW2

[root@ceph-rbd2-mytest-bgzmwr-node5 ~]# podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.100.215 --server-port 5500 namespace get_io_stats -n nqn.2016-06.io.spdk:cnode1 --nsid 1
IO statistics for namespace 1 in nqn.2016-06.io.spdk:cnode1, bdev bdev_fef1765d-e570-4149-b218-37c5542c0a7b:
╒═════════════════════════╤═════════════════╕
│ Stat                    │ Value           │
╞═════════════════════════╪═════════════════╡
│ Tick Rate               │ 2190000000      │
├─────────────────────────┼─────────────────┤
│ Ticks                   │ 349647129599371 │
├─────────────────────────┼─────────────────┤
│ Bytes Read              │ 36864           │
├─────────────────────────┼─────────────────┤
│ Num Read Ops            │ 2               │
├─────────────────────────┼─────────────────┤
│ Bytes Written           │ 0               │
├─────────────────────────┼─────────────────┤
│ Num Write Ops           │ 0               │
├─────────────────────────┼─────────────────┤
│ Bytes Unmapped          │ 0               │
├─────────────────────────┼─────────────────┤
│ Num Unmap Ops           │ 0               │
├─────────────────────────┼─────────────────┤
│ Read Latency Ticks      │ 3961130         │
├─────────────────────────┼─────────────────┤
│ Max Read Latency Ticks  │ 2102196         │
├─────────────────────────┼─────────────────┤
│ Min Read Latency Ticks  │ 1858934         │
├─────────────────────────┼─────────────────┤
│ Write Latency Ticks     │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Write Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Write Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Unmap Latency Ticks     │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Unmap Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Unmap Latency Ticks │ 0               │
├─────────────────────────┼─────────────────┤
│ Copy Latency Ticks      │ 0               │
├─────────────────────────┼─────────────────┤
│ Max Copy Latency Ticks  │ 0               │
├─────────────────────────┼─────────────────┤
│ Min Copy Latency Ticks  │ 0               │
├─────────────────────────┼─────────────────┤
│ IO Error                │ []              │
╘═════════════════════════╧═════════════════╛
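One way to show the stall numerically is to sample the write-op counter twice and compare (a brittle sketch that scrapes the CLI's table output with awk; the column split may break if the table format changes):

CLI="podman run quay.io/barakda1/nvmeof-cli:8677ba3 --server-address 10.0.103.241 --server-port 5500"
a=$($CLI namespace get_io_stats -n nqn.2016-06.io.spdk:cnode1 --nsid 1 | awk -F'│' '/Num Write Ops/ {gsub(/ /, "", $3); print $3}')
sleep 30
b=$($CLI namespace get_io_stats -n nqn.2016-06.io.spdk:cnode1 --nsid 1 | awk -F'│' '/Num Write Ops/ {gsub(/ /, "", $3); print $3}')
[ "$a" = "$b" ] && echo "IO stalled: Num Write Ops unchanged at $a"

With healthy IO the counter keeps climbing; after the redeploy it stays flat on both gateways.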
caroav commented 6 months ago

This is also related to the known RM issues; there are already a few issues open on that. It is a known issue, and a fix will be provided soon.

caroav commented 5 months ago

Fixed in 1.2.1.