pascalhobus closed this issue 2 years ago.
@hoba84 can you please paste the configmap output which contains the clusterID and mon mapping? Can you also try running commands like ceph -s or rbd ls -p mycluster-hdd from the csi-rbdplugin container of the provisioner pod?
Note: you need to pass the monitor, user, and key parameters when executing the above ceph/rbd commands in the rbd-plugin container.
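For example, a command of roughly this shape can be run inside the provisioner's csi-rbdplugin container (a sketch only: the monitor address is one of those from the configmap pasted further down, and <username>/<key> are placeholders for credentials valid for your cluster):
ceph -s -m 10.1.30.1:6789 --user <username> --key=<key>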
Hi @Madhu-1, these are the two config maps for ceph-config and ceph-csi-config:
❯ kubectl get configmap ceph-config -n default -o yaml
apiVersion: v1
data:
  ceph.conf: |
    [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
  keyring: ""
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"ceph.conf":"[global]\nauth_cluster_required = cephx\nauth_service_required = cephx\nauth_client_required = cephx\n","keyring":""},"kind":"ConfigMap","metadata":{"annotations":{},"name":"ceph-config","namespace":"default"}}
  creationTimestamp: "2021-12-14T08:31:58Z"
  name: ceph-config
  namespace: default
  resourceVersion: "17148057"
  selfLink: /api/v1/namespaces/default/configmaps/ceph-config
  uid: d398daaf-276f-405d-9738-9f63588243cd
❯ kubectl get configmap ceph-csi-config -n default -o yaml
apiVersion: v1
data:
  config.json: |-
    [
      {
        "clusterID": "d2383c4d-1fad-4701-90d9-b5079b403737",
        "monitors": [
          "10.1.30.1:6789",
          "10.1.30.2:6789",
          "10.1.30.3:6789"
        ]
      }
    ]
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"config.json":"[\n {\n \"clusterID\": \"d2383c4d-1fad-4701-90d9-b5079b403737\",\n \"monitors\": [\n \"10.1.30.1:6789\",\n \"10.1.30.2:6789\",\n \"10.1.30.3:6789\"\n ]\n }\n]"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"ceph-csi-config","namespace":"default"}}
  creationTimestamp: "2021-12-14T08:31:58Z"
  name: ceph-csi-config
  namespace: default
  resourceVersion: "17148055"
  selfLink: /api/v1/namespaces/default/configmaps/ceph-csi-config
  uid: c68d6bdb-bbc0-43f3-8910-c0d0452f552f
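(Side note: the clusterID in ceph-csi-config is normally the fsid of the Ceph cluster, so a quick sanity check is to compare it with the output of the following command on one of the Ceph nodes:)
ceph fsid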
For your suggested debugging I run
❯ kubectl exec -i -t csi-rbdplugin-provisioner-6bbfdc7c78-5mrgq --container csi-rbdplugin -- /bin/bash
but in that container I cannot even ping any monitor (e.g. 10.1.30.1). Also, I do not know how to pass the monitor, user and key to the commands in that container.
The command goes like this; you need to pass the user and key as per your cluster:
rbd ls -p mycluster-hdd -m 10.1.30.1:6789 --user <username> --key <key>
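(If it is unclear which user and key to use: they are typically the same credentials stored in the Kubernetes secret referenced by the StorageClass. Assuming the secret is called csi-rbd-secret and lives in the default namespace, as in the ceph-csi example manifests, they could be read like this; the secret name, namespace, and data keys may differ in your deployment:)
kubectl get secret csi-rbd-secret -n default -o jsonpath='{.data.userID}' | base64 -d
kubectl get secret csi-rbd-secret -n default -o jsonpath='{.data.userKey}' | base64 -d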
To check network connectivity you can run the below command in the csi-rbdplugin container:
cat < /dev/tcp/10.1.30.1/6789
If the above command does not return output like ceph v027���'e����'^C, it could be a network issue.
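(Note that /dev/tcp/<host>/<port> is a bash redirection feature rather than a real device, so it only works when the command is executed by bash. If it is not available, a similar reachability check can be done with python3, assuming it is present in the image:)
python3 -c 'import socket; socket.create_connection(("10.1.30.1", 6789), timeout=5); print("monitor reachable")'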
Hi @Madhu-1, the above command rbd ls -p [...] does not return anything in my case (it gets stuck). Also cat < /dev/tcp/10.1.30.1/6789 does not return anything. There is no /dev/tcp* device listed when I do ls -la /dev. As the container comes from ceph-csi, what prerequisites does it need for networking?
The only special thing about my setup is that the Kubernetes nodes have multiple network interfaces, one of which (10.1.30.0/24) is dedicated to ceph access. In the other container (pod csi-rbdplugin with container csi-rbdplugin) I can run the above commands and access the ceph cluster.
@hoba84 we don't impose any requirements on networking; the prerequisite is that the ceph cluster is reachable from the csi pods.
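(One way to compare the two pods is to check how each of them would route traffic to a monitor; a sketch assuming the iproute2 tools are available in the containers. In the ceph-csi manifests the node-plugin DaemonSet pods use host networking, so they see the node's interfaces, while the provisioner pod only has its pod-network interface:)
kubectl exec -it csi-rbdplugin-provisioner-6bbfdc7c78-5mrgq -c csi-rbdplugin -- ip route get 10.1.30.1
kubectl exec -it csi-rbdplugin-cfmfh -c csi-rbdplugin -- ip route get 10.1.30.1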
Hi @Madhu-1, I could solve the problem, and indeed it was missing network connectivity inside the provisioner pod. I resolved it by configuring an additional network interface with the help of Multus; now I am able to use the PVC. Maybe the documentation and/or logging could be improved. Thanks again for your valuable help!
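(For anyone hitting the same problem: the Multus setup boils down to a NetworkAttachmentDefinition for the dedicated ceph network plus an annotation on the provisioner pod template. The following is only a minimal sketch; the attachment name ceph-public-net, the master interface ens19, the macvlan CNI type, and the IP range are assumptions that have to be adapted to the actual environment:)
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ceph-public-net
  namespace: default
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens19",
      "mode": "bridge",
      "ipam": {
        "type": "host-local",
        "subnet": "10.1.30.0/24",
        "rangeStart": "10.1.30.200",
        "rangeEnd": "10.1.30.220"
      }
    }
The provisioner deployment's pod template then gets the annotation k8s.v1.cni.cncf.io/networks: ceph-public-net so that Multus attaches the extra interface.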
Describe the bug
Following this guide, the PVC I want to create gets stuck in status Pending with the following detail message: Aborted desc = an operation with the given Volume ID pvc-55b22d71-055e-4f23-9e61-7386e4cde6c5 already exists. I don't know how to debug this further. One thing I found out is that from the csi-rbdplugin-provisioner/csi-rbdplugin containers I cannot reach the ceph cluster. From the csi-rbdplugin/csi-rbdplugin containers I can reach the ceph cluster (see section Additional context). I don't know if this is how it should be (?).
Environment details
Mounter used for mounting PVC (for cephfs its fuse or kernel, for rbd its krbd or rbd-nbd): ceph-common (?)
The Ceph cluster is hosted on a Proxmox 7 environment. The MicroK8s nodes are running on Ubuntu 20.04 VMs on Proxmox. MicroK8s is running in HA mode (3 nodes), which means that the Calico CNI is used. Enabled add-ons for MicroK8s are: dns, ha-cluster, ingress, metallb, metrics-server, rbac.
Steps to reproduce
Same as in this guide:
Actual results
The PVC is stuck in status pending.
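(For reference, the Aborted message quoted above typically appears in the PVC's events, which can be inspected with kubectl describe; <pvc-name> and <namespace> are placeholders:)
kubectl describe pvc <pvc-name> -n <namespace>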
Expected behavior
The PVC should be created successfully.
Additional context
When I run
kubectl exec -i -t csi-rbdplugin-provisioner-6bbfdc7c78-5mrgq --container csi-rbdplugin -- /bin/bash
I cannot reach my ceph cluster from this container (ping does not work). My network interfaces in this container are the following:
When I run
kubectl exec -i -t csi-rbdplugin-cfmfh --container csi-rbdplugin -- /bin/bash
I can reach my ceph cluster from this container. The network interfaces are the following (which are the same as on the host machines k8s-node-[1..3]):
Logs
If the issue is in PVC creation, deletion, or cloning, please attach complete logs of the below containers.
csi-rbdplugin-provisioner/csi-rbdplugin
[...]
csi-rbdplugin-provisioner/csi-provisioner
If the issue is in PVC resize, please attach complete logs of the below containers.
If the issue is in snapshot creation and deletion, please attach complete logs of the below containers.
If the issue is in PVC mounting, please attach complete logs of the below containers.
csi-rbdplugin/csi-cephfsplugin and driver-registrar container logs from the plugin pod on the node where the mount is failing. No issue with mounting (we do not get to that point).
If required, attach dmesg logs. dmesg of csi-rbdplugin-cfmfh/csi-rbdplugin:
dmesg of csi-rbdplugin-provisioner-6bbfdc7c78-5mrgq/csi-rbdplugin