ceph / ceph-csi

CSI driver for Ceph

rpc error: code = InvalidArgument desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted #4563

Open wangchao732 opened 7 months ago

wangchao732 commented 7 months ago

cephcsi: v3.11.0
csi-provisioner: v4.0.0
kubernetes: v1.22.12

[csi-cephfs-secret]
adminID: client.admin (base64)
adminKey: xxx (base64)

ceph fs ls
name: cephfs, metadata pool: store_metadata, data pools: [store-file]

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: csi-cephfs-sc
  labels:
    app: ceph-csi-cephfs
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-cephfs-3-canary
    heritage: Helm
    release: ceph-csi-cephfs
  annotations:
    kubesphere.io/creator: admin
    meta.helm.sh/release-name: ceph-csi-cephfs
    meta.helm.sh/release-namespace: ceph-csi-cephfs
    storageclass.kubesphere.io/allow-clone: 'true'
    storageclass.kubesphere.io/allow-snapshot: 'true'
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: xxx
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-cephfs
  fsName: cephfs
  pool: store-file
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

ConfigMap

apiVersion: v1
data:
  config.json: '[{"clusterID": "80a8efd7-8ed5-4e53-bc5b-xxxx","monitors": ["192.168.13.180:6789","192.168.13.181:6789","192.168.13.182:6789"]}]'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-cephfs
    meta.helm.sh/release-namespace: ceph-csi-cephfs
  creationTimestamp: "2024-04-17T02:58:01Z"
  labels:
    app: ceph-csi-cephfs
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-cephfs-3.11.0
    component: provisioner
    heritage: Helm
    release: ceph-csi-cephfs
  name: ceph-csi-config
  namespace: ceph-csi-cephfs
  resourceVersion: "100022091"
  selfLink: /api/v1/namespaces/ceph-csi-cephfs/configmaps/ceph-csi-config
  uid: 40ae7717-c85a-44eb-b0a1-3652b3d4dfe0

Creating a PVC fails with:

Name: "bytebase-pvc", UID: "91dc9df3-e611-44e5-8191-b18841edabf1", APIVersion: "v1", ResourceVersion: "100120810", FieldPath: ""
type: 'Warning' reason: 'ProvisioningFailed'
failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = InvalidArgument desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

Madhu-1 commented 7 months ago

@wangchao732 does the Ceph user you created have the required access as per https://github.com/ceph/ceph-csi/blob/devel/docs/capabilities.md#cephfs?

wangchao732 commented 7 months ago

@wangchao732 does the Ceph user you created have the required access as per https://github.com/ceph/ceph-csi/blob/devel/docs/capabilities.md#cephfs?

Thanks, but I get this error message:

Error EINVAL: mds capability parse failed, stopped at 'fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi' of 'allow r fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi'

Madhu-1 commented 7 months ago

Have you created the csi subvolumegroup in the filesystem? If not, please create it and then try to create the user.
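
For reference, a subvolumegroup can be created with ceph fs subvolumegroup (a sketch, assuming the default group name csi that ceph-csi uses and the filesystem name cephfs from the ceph fs ls output above):

# create the csi subvolumegroup in the cephfs filesystem, then list groups to confirm
ceph fs subvolumegroup create cephfs csi
ceph fs subvolumegroup ls cephfs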

wangchao732 commented 7 months ago

Have you created the csi subvolumegroup in the filesystem? If not, please create it and then try to create the user.

After I run the ceph auth command, it seems to have been created by default, because I can see the directory:

mount -t ceph 192.168.13.180:6789:/ /tmpdata -o name=admin,secret=xxx
pwd
/tmpdata/volumes/csi

wangchao732 commented 7 months ago

ceph fs subvolume ls cephfs
[
    {
        "name": "volumes"
    }
]

wangchao732 commented 7 months ago

ceph fs subvolume info cephfs volume1 csi
{
    "atime": "2024-04-17 20:25:14",
    "bytes_pcent": "0.00",
    "bytes_quota": 50000000000,
    "bytes_used": 0,
    "created_at": "2024-04-17 20:25:14",
    "ctime": "2024-04-17 20:25:14",
    "data_pool": "store-file",
    "features": [
        "snapshot-clone",
        "snapshot-autoprotect",
        "snapshot-retention"
    ],
    "gid": 0,
    "mode": 16877,
    "mon_addrs": [
        "192.168.13.180:6789",
        "192.168.13.181:6789",
        "192.168.13.182:6789"
    ],
    "mtime": "2024-04-17 20:25:14",
    "path": "/volumes/csi/volume1/5362bce4-2dc0-44b5-99d2-07aaf023b052",
    "pool_namespace": "",
    "state": "complete",
    "type": "subvolume",
    "uid": 0
}

[root@Bj13-Ceph01-Dev ceph]# ceph auth get-or-create client.$USER mgr "allow rw" osd "allow rw tag cephfs metadata=$FS_NAME, allow rw tag cephfs data=$FS_NAME" mds "allow r fsname=$FS_NAME path=/volumes, allow rws fsname=$FS_NAME path=/volumes/$SUB_VOL" mon "allow r fsname=$FS_NAME"
Error EINVAL: mds capability parse failed, stopped at 'fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi' of 'allow r fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi'
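
The parser stops exactly at the fsname= clause. One possible explanation (an assumption, not confirmed in this thread) is that this Ceph 14.2.22 cluster does not understand fsname= in MDS caps, a form the capabilities doc uses and that may only be accepted by newer releases. As a hedged sketch only, a path-only variant of the same command would be:

# assumption: older MDS cap syntax without fsname=; substitute $USER, $FS_NAME and $SUB_VOL as before
ceph auth get-or-create client.$USER \
  mgr "allow rw" \
  osd "allow rw tag cephfs metadata=$FS_NAME, allow rw tag cephfs data=$FS_NAME" \
  mds "allow r path=/volumes, allow rws path=/volumes/$SUB_VOL" \
  mon "allow r"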

wangchao732 commented 7 months ago

The RBD side of ceph-csi hits the same issue:

failed to provision volume with StorageClass "csi-rbd-sc": rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

wangchao732 commented 7 months ago

I0418 10:12:53.258063 1 utils.go:198] ID: 30 GRPC call: /csi.v1.Identity/Probe

I0418 10:12:53.258141 1 utils.go:199] ID: 30 GRPC request: {}

I0418 10:12:53.258175 1 utils.go:205] ID: 30 GRPC response: {}

I0418 10:13:46.083975 1 utils.go:198] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e GRPC call: /csi.v1.Controller/CreateVolume

I0418 10:13:46.085127 1 utils.go:199] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-051787a2-cdad-4e98-9562-01b43ce55a8e","parameters":{"clusterID":"80a8efd7-8ed5-4e53-bc5b-f91c56300e99","csi.storage.k8s.io/pv/name":"pvc-051787a2-cdad-4e98-9562-01b43ce55a8e","csi.storage.k8s.io/pvc/name":"bytebase-pvc","csi.storage.k8s.io/pvc/namespace":"bytebase","imageFeatures":"layering","pool":"k8s-store"},"secrets":"stripped","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["discard"]}},"access_mode":{"mode":1}}]}

I0418 10:13:46.085678 1 rbd_util.go:1315] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e setting disableInUseChecks: false image features: [layering] mounter: rbd

E0418 10:13:46.106773 1 controllerserver.go:232] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e failed to connect to volume : failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

E0418 10:13:46.106850 1 utils.go:203] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e GRPC error: rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

Madhu-1 commented 7 months ago

@wangchao732 can you do rados operations with the above Ceph user? rados ls etc.?

wangchao732 commented 7 months ago

Have you created the csi subvolumegroup in the filesystem? If not, please create it and then try to create the user.

I don't know what's going on; no matter what I try, I get the same error. cephcsi: v3.11.0, csi-provisioner: v4.0.0, kubernetes: v1.22.12, ceph: 14.2.22

2024-04-18 18:27:52.867588 [INF] from='client.? 192.168.13.180:0/1251224851' entity='client.admin' cmd='[{"prefix": "auth get-or-create", "entity": "client.csi-rbd", "caps": ["mon", "profile rbd", "osd", "profile rbd pool=k8s-store", "mgr", "profile rbd"]}]': finished

2024-04-18 18:27:52.861771 [INF] from='client.? 192.168.13.180:0/1251224851' entity='client.admin' cmd=[{"prefix": "auth get-or-create", "entity": "client.csi-rbd", "caps": ["mon", "profile rbd", "osd", "profile rbd pool=k8s-store", "mgr", "profile rbd"]}]: dispatch

2024-04-18 18:27:01.821352 [INF] from='client.? 192.168.13.180:0/3439531549' entity='client.admin' cmd='[{"prefix": "auth rm", "entity": "client.csi-rbd"}]': finished

ceph osd lspools
1 k8s-store
2 .rgw.root
3 default.rgw.control
4 default.rgw.meta
5 default.rgw.log
6 store-file
7 store_metadata
8 default.rgw.buckets.index

wangchao732 commented 7 months ago

@wangchao732 can you do rados operations with the above Ceph user? rados ls etc.?

rados lspools
k8s-store
.rgw.root
default.rgw.control
default.rgw.meta
default.rgw.log
store-file
store_metadata
default.rgw.buckets.index

wangchao732 commented 7 months ago

rados ls -p k8s-store
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e

wangchao732 commented 7 months ago

Executed in a k8s cluster:

rados ls -p k8s-store
2024-04-18 19:22:42.161 7f306b5b29c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-18 19:22:42.161 7f306b5b29c0 -1 AuthRegistry(0x55f3b1d9e288) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
2024-04-18 19:22:42.163 7f306b5b29c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-18 19:22:42.164 7f306b5b29c0 -1 AuthRegistry(0x7ffe5cc837b8) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
failed to fetch mon config (--no-mon-config to skip)

Madhu-1 commented 7 months ago

Please pass --key, --user, and -m from the Kubernetes cluster.
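
For example, something along these lines from a node in the Kubernetes cluster (a sketch; the monitor address and user are the ones mentioned in this thread, and the key placeholder must be the plain cephx key, not a base64-encoded value):

# hypothetical invocation with explicit pool, monitor, user and key
rados ls -p k8s-store -m 192.168.13.180:6789 --id csi-rbd --key '<plain cephx key>'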

wangchao732 commented 7 months ago

rados ls -p k8s-store --keyring /etc/ceph/ceph.client.csi-rbd.keyring --name client.csi-rbd
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e

Madhu-1 commented 7 months ago

Can you also check whether you are able to do a write operation in the pool you are planning to use? See https://docs.ceph.com/en/latest/man/8/rados/#examples

wangchao732 commented 7 months ago

The test was successful.

rados -p k8s-store put testfile test.json --keyring /etc/ceph/ceph.client.csi-rbd.keyring --name client.csi-rbd
[root@bj11-bcld-k8s01 k8s]# rados ls -p k8s-store --keyring /etc/ceph/ceph.client.csi-rbd.keyring --name client.csi-rbd
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
testfile
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e

kind: Secret
apiVersion: v1
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
  annotations:
    kubesphere.io/creator: admin
data:
  userID: Y3NpLXJiZAo=
  userKey: QVFDbzlTQm1SR1pnTXhBQTFiYXhsNjVyYTZENjRSUVErbUVuZmc9PQo=
type: Opaque

wangchao732 commented 7 months ago

[root@bj11-bcld-k8s01 k8s]# cat /etc/ceph/ceph.client.csi-rbd.keyring
[client.csi-rbd]
        key = AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==
        caps mgr = "profile rbd"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=k8s-store"
[root@bj11-bcld-k8s01 k8s]# echo "AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==" | base64
QVFDbzlTQm1SR1pnTXhBQTFiYXhsNjVyYTZENjRSUVErbUVuZmc9PQo=

Using the base64 encoding of client.csi-rbd as the userID gives the same error.

wangchao732 commented 7 months ago

csi-rbdplugin log:

I0418 12:04:35.713123 1 utils.go:198] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 GRPC call: /csi.v1.Controller/CreateVolume

I0418 12:04:35.713486 1 utils.go:199] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9","parameters":{"clusterID":"80a8efd7-8ed5-4e53-bc5b-f91c56300e99","csi.storage.k8s.io/pv/name":"pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9","csi.storage.k8s.io/pvc/name":"bytebase-pvc","csi.storage.k8s.io/pvc/namespace":"bytebase","imageFeatures":"layering","pool":"k8s-store"},"secrets":"stripped","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["discard"]}},"access_mode":{"mode":1}}]}

I0418 12:04:35.713771 1 rbd_util.go:1315] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 setting disableInUseChecks: false image features: [layering] mounter: rbd

E0418 12:04:35.750268 1 controllerserver.go:232] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 failed to connect to volume : failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

E0418 12:04:35.750355 1 utils.go:203] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 GRPC error: rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

nixpanic commented 7 months ago

Is there a way you can check the Ceph MON/OSD logs for rejected connection requests? Maybe the problem is not with the credentials, but with the network configuration of the pods/nodes?

You can also try the manual commands from within the csi-rbdplugin container of a csi-rbdplugin-provisioner pod.
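
A hedged way to do that (the pod and container names below are assumptions based on the default Helm chart naming; check kubectl get pods for the actual names):

# list the provisioner pods, then open a shell in the csi-rbdplugin container of one of them
kubectl -n ceph-csi-rbd get pods
kubectl -n ceph-csi-rbd exec -it <csi-rbdplugin-provisioner-pod> -c csi-rbdplugin -- sh
# inside the container, try a manual connection with the same credentials the driver uses
rados ls -p k8s-store -m 192.168.13.180:6789 --id csi-rbd --key '<plain cephx key>'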

wangchao732 commented 7 months ago

Is there a way to check the Ceph MON/OSD logs for denied connection requests? Maybe the problem isn't with the credentials, but with the network configuration of the pod/node?

Oh, ceph-mon.log shows an error, but the client.csi-rbd entry exists.

2024-04-19 17:25:30.090 7f6b31027700 0 cephx server client.csi-rbd : couldn't find entity name: client.csi-rbd

client.csi-rbd
        key: AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==
        caps: [mgr] profile rbd
        caps: [mon] profile rbd
        caps: [osd] profile rbd pool=k8s-store

ceph auth list | grep client.csi-rbd
installed auth entries:

client.csi-rbd
client.csi-rbd-node
client.csi-rbd-provisioner

Madhu-1 commented 7 months ago

adminID : client.admin(base64) adminKey: xxx(base64)

Sorry, I missed it: the adminID should contain only the base64 encoding of admin, without the client. prefix.
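
For example (a minimal sketch; note that echo without -n appends a newline, which would end up inside the decoded value):

# base64-encode just the user name, without "client." and without a trailing newline
echo -n admin | base64        # YWRtaW4=
echo -n '<adminKey>' | base64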

wangchao732 commented 7 months ago

In the container:

sh-4.4# ceph
[errno 1] RADOS permission error (error connecting to the cluster)
sh-4.4# rados ls - -p k8s-store
2024-04-19T09:45:32.968+0000 7f7f2dba2700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
failed to fetch mon config (--no-mon-config to skip)
sh-4.4# ls
bin csi dev etc home lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var
sh-4.4# cd /etc/ceph
sh-4.4# ls
ceph.conf keyring
sh-4.4# cat keyring
sh-4.4# cat ceph.conf
[global]
fsid = 80a8efd7-8ed5-4e53-bc5b-f91c56300e99
mon initial members = 192.168.13.180,192.168.13.181,192.168.13.182
mon host = 192.168.13.180,192.168.13.181,192.168.13.182
mon addr = 192.168.13.180:6789,192.168.13.181:6789,192.168.13.182:6789
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

wangchao732 commented 7 months ago

client.

Yes, the base64 value has been changed so that it does not include client., but the issue still occurs.


Madhu-1 commented 7 months ago

You need to provide --id and a keyring.

wangchao732 commented 7 months ago

Where is it configured?

Madhu-1 commented 7 months ago

rados ls - -p k8s-store

rados ls -p=k8s-store -m=192.168.13.180:6789 --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==

wangchao732 commented 7 months ago

sh-4.4# rados ls -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config
2024-04-19T10:31:18.839+0000 7f5767fff700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2024-04-19T10:31:18.839+0000 7f576fb47700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
couldn't connect to cluster: (1) Operation not permitted
sh-4.4# rados ls -c /etc/ceph/ceph.conf -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config
2024-04-19T10:31:44.122+0000 7f9fc9ca5700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
couldn't connect to cluster: (1) Operation not permitted
sh-4.4#

cephx server client.csi-rbd: unexpected key: req.key=a128c0dd1c3cd379 expected_key=c3c8ea50fbc36e10
2024-04-19 18:35:09.605 7f6b31027700 0 cephx server client.csi-rbd : couldn't find entity name: client.csi-rbd

Madhu-1 commented 7 months ago

couldn't connect to cluster: (1) Operation not permitted

This is the exact problem; please check it on the Ceph side. I'm not able to help with debugging this issue, as nothing seems to be wrong with csi.

wangchao732 commented 7 months ago

cephx server client.csi-rbd: unexpected key: req.key=a128c0dd1c3cd379 expected_key=c3c8ea50fbc36e10 — the key does not match, but as far as I can tell what I configured is consistent.

cephx server client.csi-rbd: unexpected key: req.key=a128c0dd1c3cd379 expected_key=c3c8ea50fbc36e10
2024-04-19 18:35:09.605 7f6b31027700 0 cephx server client.csi-rbd : couldn't find entity name: client.csi-rbd

Madhu-1 commented 7 months ago

@wangchao732 I think you don't need to pass the encrypted key. Can you pass the key without base64 encoding? It should be the key you get from the ceph auth ls output.
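
For reference, the raw key for an entity can be printed directly (a minimal sketch using commands that ship with Ceph):

# print only the cephx key for the csi-rbd user
ceph auth get-key client.csi-rbd
# or show the full entry including caps
ceph auth get client.csi-rbd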

wangchao732 commented 7 months ago

Currently using KubeSphere; the configuration file requires base64:

[root@bj11-bcld-k8s01 opt]# kubectl apply -f ceph-sc-rbd.yml
Warning: resource secrets/csi-rbd-secret is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
The request is invalid: patch: Invalid value: "map[data:map[userID:csi-rbd userKey:AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==] metadata:map[annotations:map[kubectl.kubernetes.io/last-applied-configuration:{\"apiVersion\":\"v1\",\"data\":{\"userID\":\"csi-rbd\",\"userKey\":\"AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==\"},\"kind\":\"Secret\",\"metadata\":{\"annotations\":{\"kubesphere.io/creator\":\"admin\"},\"name\":\"csi-rbd-secret\",\"namespace\":\"ceph-csi-rbd\"},\"type\":\"Opaque\"}\n]]]": error decoding from json: illegal base64 data at input byte 3
[root@bj11-bcld-k8s01 opt]# cat ceph-sc-rbd.yml
kind: Secret
apiVersion: v1
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
  annotations:
    kubesphere.io/creator: admin
data:
  userID: csi-rbd
  userKey: AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
type: Opaque

wangchao732 commented 7 months ago

rados ls -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config

That doesn't seem to be the problem; the command fails the same way when executed on the Ceph cluster itself.

[root@Bj13-Ceph01-Dev ceph]# rados ls -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config
2024-04-19 18:54:07.940 7fa8130169c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.csi-rbd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-19 18:54:07.941 7fa8130169c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.csi-rbd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-19 18:54:07.941 7fa8130169c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.csi-rbd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-19 18:54:07.942 7fa8028ad700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
couldn't connect to cluster: (1) Operation not permitted

wangchao732 commented 7 months ago

I think you don't need to pass the encrypted key. Can you pass the key without base64 encoding? It should be the key you get from the ceph auth ls output.

With the keyring configured next to ceph.conf, the connection succeeds.

sh-4.4# rados ls - -p k8s-store --keyring /etc/ceph/keyring --name client.csi-rbd
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
testfile
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e
sh-4.4# cat /etc/ceph/keyring
[client.csi-rbd]
        key = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
        caps mgr = "profile rbd"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=k8s-store"

wangchao732 commented 7 months ago

@Madhu-1 Can you help me take a look?

Madhu-1 commented 7 months ago

The above looks good. Can you try to create the PVC now, and also paste the kubectl YAML output of the cephfs and rbd secrets?

wangchao732 commented 7 months ago

The above looks good. Can you try to create the PVC now, and also paste the kubectl YAML output of the cephfs and rbd secrets?

The problem remains: failed to get connection: connecting failed: rados: ret=-1, Operation not permitted.

kubectl get secret csi-rbd-secret -n ceph-csi-rbd -o yaml

apiVersion: v1
data:
  userID: Y3NpLXJiZAo=
  userKey: QVFBL3RTaG03WGtySHhBQWV4M2xVdkFGMEtkeE1aQkMxZEd1SWc9PQo=
kind: Secret
metadata:
  annotations:
    kubesphere.io/creator: admin
  creationTimestamp: "2024-04-18T06:41:33Z"
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
  resourceVersion: "104329994"
  selfLink: /api/v1/namespaces/ceph-csi-rbd/secrets/csi-rbd-secret
  uid: 1a66d008-b05c-4c33-8d8e-da69b79b8115
type: Opaque

kubectl get configmap ceph-config -n ceph-csi-rbd -o yaml

apiVersion: v1
data:
  ceph.conf: |
    [global]
    fsid = 80a8efd7-8ed5-4e53-bc5b-f91c56300e99
    mon_initial_members = 192.168.13.180,192.168.13.181,192.168.13.182
    mon_host = 192.168.13.180,192.168.13.181,192.168.13.182
    mon_addr = 192.168.13.180:6789,192.168.13.181:6789,192.168.13.182:6789
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
  keyring: "[client.csi-rbd]\n\tkey = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==\n\tcaps mgr = \"profile rbd\"\n\tcaps mon = \"profile rbd\"\n\tcaps osd = \"profile rbd pool=k8s-store\"\n"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-rbd
    meta.helm.sh/release-namespace: ceph-csi-rbd
  creationTimestamp: "2024-04-18T06:17:24Z"
  labels:
    app: ceph-csi-rbd
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-rbd-3.11.0
    component: nodeplugin
    heritage: Helm
    release: ceph-csi-rbd
  name: ceph-config
  namespace: ceph-csi-rbd
  resourceVersion: "101316266"
  selfLink: /api/v1/namespaces/ceph-csi-rbd/configmaps/ceph-config
  uid: ab2689c9-0b42-4f11-8c23-93c11a10e61c

kubectl get configmap ceph-csi-config -n ceph-csi-rbd -o yaml

apiVersion: v1
data:
  cluster-mapping.json: '[]'
  config.json: '[{"clusterID": "80a8efd7-8ed5-4e53-bc5b-f91c56300e99","monitors": ["192.168.13.180:6789","192.168.13.181:6789","192.168.13.182:6789"]}]'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-rbd
    meta.helm.sh/release-namespace: ceph-csi-rbd
  creationTimestamp: "2024-04-18T06:17:24Z"
  labels:
    app: ceph-csi-rbd
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-rbd-3.11.0
    component: nodeplugin
    heritage: Helm
    release: ceph-csi-rbd
  name: ceph-csi-config
  namespace: ceph-csi-rbd
  resourceVersion: "100574540"
  selfLink: /api/v1/namespaces/ceph-csi-rbd/configmaps/ceph-csi-config
  uid: 6cb85b3a-a885-4160-8eb5-acbd2c4baf0d

Madhu-1 commented 7 months ago

sh-4.4# cat /etc/ceph/keyring
[client.csi-rbd]
        key = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
        caps mgr = "profile rbd"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=k8s-store"

AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg== is the key in the above keyring file, but when I decode the secret I get a different one:

$ echo QVFBL3RTaG03WGtySHhBQWV4M2xVdkFGMEtkeE1aQkMxZEd1SWc9PQo= | base64 -d
AQA/tShm7XkrHxAAex3lUvAF0KdxMZBC1dGuIg==

$ echo Y3NpLXJiZAo= | base64 -d
csi-rbd

Is this the right key for the csi-rbd user?

wangchao732 commented 7 months ago

On kubernetes 1.22.2, to create a Secret you must base64-encode the values; otherwise creation fails with error decoding from json: illegal base64 data at input byte 3.
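
One hedged way to avoid hand-encoding (and the trailing newline that echo adds to the encoded value) is to let kubectl do the encoding; a sketch, assuming the csi-rbd key shown earlier in this thread:

# kubectl base64-encodes --from-literal values itself
kubectl -n ceph-csi-rbd create secret generic csi-rbd-secret \
  --from-literal=userID=csi-rbd \
  --from-literal=userKey='AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg=='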

wangchao732 commented 7 months ago

Is this the right key for the csi-rbd user?

Yes, I rebuilt it later.

wangchao732 commented 7 months ago

(screenshot attached in the original issue)

Madhu-1 commented 7 months ago

caps mgr = "allow rw"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"

Can you give permission to the mgr, or use the csi-rbd-provisioner user in the secret, and see if it works?
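
For reference, caps on an existing user can be adjusted with ceph auth caps; note that it replaces all caps, so everything must be restated (a sketch only):

# grant mgr rw while keeping the existing mon/osd profiles
ceph auth caps client.csi-rbd \
  mgr 'allow rw' \
  mon 'profile rbd' \
  osd 'profile rbd pool=k8s-store'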

wangchao732 commented 7 months ago

(screenshots attached in the original issue)

Can you give permission to the mgr, or use the csi-rbd-provisioner user in the secret, and see if it works?

The problem remains. (screenshot attached in the original issue)

Madhu-1 commented 7 months ago

That key is expected to work, since the same setup works on other clusters. I'm out of ideas, sorry.

wangchao732 commented 7 months ago

That key is expected to work, since the same setup works on other clusters. I'm out of ideas, sorry.

It's okay, I'm also puzzled. My guess is that something goes wrong when the userID/userKey are picked up to connect to the cluster through the Ceph API, but I don't have any proof for now.