ceph / ceph-nvmeof

Service to provide Ceph storage over NVMe-oF/TCP protocol
GNU Lesser General Public License v3.0

Removing nvmeof service doesn't delete OMAP entries #377

Open sunilkumarn417 opened 10 months ago

sunilkumarn417 commented 10 months ago

Noticed that OMAP entries for GW entities, such as subsystems and namespaces, still exist even after removing the entire service from the cluster.

Steps to reproduce:

[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# ceph orch ls
NAME                       PORTS        RUNNING  REFRESHED  AGE  PLACEMENT  
alertmanager               ?:9093,9094      1/1  8m ago     5d   count:1    
ceph-exporter                               6/6  8m ago     5d   *          
crash                                       6/6  8m ago     5d   *          
grafana                    ?:3000           1/1  8m ago     5d   count:1    
mgr                                         2/2  8m ago     5d   label:mgr  
mon                                         3/3  8m ago     5d   label:mon  
node-exporter              ?:9100           6/6  8m ago     5d   *          
osd.all-available-devices                    16  5m ago     5d   *          
prometheus                 ?:9095           1/1  8m ago     5d   count:1 

[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# ceph orch ps | grep nvme
[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# 

[ceph: root@ceph-1sunilkumar-ol18l6-node1-installer /]# rados -p rbd listomapkeys nvmeof.None.state
host_nqn.2016-06.io.spdk:test_cli_*
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node5.mnoqha_TCP_10.0.211.131_4420
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node5.mnoqha_TCP_10.0.211.32_4420
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node5.mnoqha_TCP_10.0.211.32_4421
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node6.ueawqa_TCP_10.0.211.22_4420
listener_nqn.2016-06.io.spdk:test_cli_client.nvmeof.rbd.ceph-1sunilkumar-ol18l6-node6.ueawqa_TCP_10.0.213.158_4420
namespace_nqn.2016-06.io.spdk:test_cli_2
omap_version
qos_nqn.2016-06.io.spdk:test_cli_2
subsystem_nqn.2016-06.io.spdk:test_cli
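
For reference, the same listing can be done programmatically. A minimal python-rados sketch (pool rbd and object nvmeof.None.state taken from the output above; everything else is a plain librados read):

import rados

# connect with the local cluster configuration
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")  # pool holding the gateway state object
try:
    with rados.ReadOpCtx() as read_op:
        # fetch up to 1000 OMAP key/value pairs from the state object
        it, _ret = ioctx.get_omap_vals(read_op, "", "", 1000)
        ioctx.operate_read_op(read_op, "nvmeof.None.state")
        for key, _val in it:
            # keys may come back as bytes depending on the binding version
            print(key.decode() if isinstance(key, bytes) else key)
finally:
    ioctx.close()
    cluster.shutdown()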
PepperJo commented 10 months ago

I agree that when we remove a GW through cephadm we should remove all the GW-specific state in the OMAP. I would not remove the entire OMAP when the last GW of a GW group is deleted, but would instead introduce another command that explicitly allows removing a GW group.
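
To illustrate that distinction, a sketch of removing only one gateway's keys with python-rados while leaving the rest of the OMAP (omap_version, subsystem_* and so on) intact. The substring match on the daemon name is only an assumption about how a gateway's keys can be identified, not the project's actual scheme:

import rados

GW_NAME = "ceph-1sunilkumar-ol18l6-node5.mnoqha"  # example daemon from the listing above
STATE_OBJ = "nvmeof.None.state"

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")
try:
    # collect the keys that name this gateway, e.g. its listener entries
    with rados.ReadOpCtx() as read_op:
        it, _ret = ioctx.get_omap_vals(read_op, "", "", 1000)
        ioctx.operate_read_op(read_op, STATE_OBJ)
        stale = []
        for k, _v in it:
            key = k.decode() if isinstance(k, bytes) else k
            if GW_NAME in key:
                stale.append(key)
    # delete just those keys; everything else stays in place
    if stale:
        with rados.WriteOpCtx() as write_op:
            ioctx.remove_omap_keys(write_op, tuple(stale))
            ioctx.operate_write_op(write_op, STATE_OBJ)
finally:
    ioctx.close()
    cluster.shutdown()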

pcuzner commented 10 months ago

Isn't the way the service is deleted a behaviour determined by cephadm? If so, this issue needs to be raised under ceph/ceph for discussion with the cephadm maintainers, right? For example, cephadm implements the nvmeof service via a class which already has a post_remove method ... but it's empty :)
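
For context, a rough sketch of how that hook could slot the cleanup in. This is hypothetical: the method signature, the self.mgr.rados handle, and the remove_gateway_omap_keys helper (imagine it wrapping the key-removal snippet above) are assumptions about the mgr-module environment, not cephadm's actual interface:

# Hypothetical sketch only -- not the real cephadm service-class code.
class NvmeofService(CephService):
    def post_remove(self, daemon):
        # drop the removed daemon's entries (listeners etc.) from the
        # state object's OMAP instead of leaving them behind
        ioctx = self.mgr.rados.open_ioctx("rbd")
        try:
            remove_gateway_omap_keys(ioctx, "nvmeof.None.state", daemon.daemon_id)
        finally:
            ioctx.close()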

caroav commented 10 months ago

Yes @pcuzner, we do need to involve cephadm, but in this discussion we are trying to agree on the expected behavior. Also, I think that post_remove, for example, will need some kind of CLI to perform the required cleanup.

pcuzner commented 10 months ago

I'm not clear on the CLI requirement for cleanup. For example, if the service is removed with --force (i.e. ceph orch rm nvmeof.gw1 --force), the mgr could just delete the rados objects (the class has both post_remove and purge methods).
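
If purge went that route, the cleanup could be as small as deleting the state object itself, which drops all of its OMAP entries with it. A minimal python-rados sketch (pool and object names as in the report above):

import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")
try:
    # removing the object removes every OMAP entry stored on it
    ioctx.remove_object("nvmeof.None.state")
except rados.ObjectNotFound:
    pass  # already cleaned up
finally:
    ioctx.close()
    cluster.shutdown()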