kubernetes-retired / external-storage

[EOL] External storage plugins, provisioners, and helper libraries

CEPH RBD Provisioner Creates PV that fails to attach #1256

Closed davesargrad closed 4 years ago

davesargrad commented 4 years ago

I have set up a Ceph cluster. Independently of that, I have followed processes found online to set up RBD provisioning.

The process I've followed is found here: https://medium.com/velotio-perspectives/an-innovators-guide-to-kubernetes-storage-using-ceph-a4b919f4e469

This has largely worked. I have a "fast-rbd" storage class as follows (screenshot omitted).
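For reference, an RBD StorageClass along the lines of that guide looks roughly like the sketch below; the monitor address, namespaces, and secret names are placeholders, not the reporter's actual values:

```yaml
# Hypothetical sketch of a "fast-rbd" StorageClass; all parameter values are placeholders.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast-rbd
provisioner: ceph.com/rbd              # external RBD provisioner from this repo
parameters:
  monitors: 192.168.1.10:6789          # Ceph monitor address (placeholder)
  adminId: admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: kube-system
  pool: kube                           # the reporter mentions "pool: kube" later in the thread
  userId: kube
  userSecretName: ceph-secret-user
  imageFormat: "2"
  imageFeatures: layering
```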

I've created the various resources required for provisioning (screenshot omitted).

I've created the secrets needed to access Ceph (screenshot omitted).
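As a point of reference, the admin secret consumed by the RBD provisioner is typically shaped like the sketch below (namespace and key are placeholders; the user secret is analogous):

```yaml
# Hypothetical admin secret for the RBD provisioner; the key is a placeholder, not a real credential.
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin
  namespace: kube-system
type: kubernetes.io/rbd
data:
  key: QVFBcGxhY2Vob2xkZXIvS0VZPT0=   # base64 of the output of `ceph auth get-key client.admin` (placeholder)
```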

I am able to create PVCs that successfully bind to the PV created by the provisioner (screenshot omitted).
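A claim against that class would look roughly like this minimal sketch (claim name and size are placeholders):

```yaml
# Hypothetical PVC bound by the fast-rbd StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-claim
spec:
  storageClassName: fast-rbd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```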

However, the pod that I create fails to attach to the volume (screenshot omitted).
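For context, a pod consuming such a claim is wired up roughly as in the sketch below (pod name, image, and mount path are placeholders):

```yaml
# Hypothetical pod mounting the RBD-backed claim.
apiVersion: v1
kind: Pod
metadata:
  name: rbd-test-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: rbd-claim
```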

Looking at the pod in more detail, I see the following (screenshot omitted).

Googling the warning:

`fail to check rbd image status with (executable file not found in $PATH)`

I find various hits online, including this one.

It would seem that others have struggled with this. I don't fully understand the resolution described there, and I am looking for guidance on getting past this problem.

I believe the problem is that the running container does not have "rbd" on its PATH. It's not clear to me how this is properly resolved.

Guidance/Advice would be appreciated. Dave

davesargrad commented 4 years ago

Since I don't know how to solve the above problem yet, I am trying a CephFS provisioner instead of an RBD provisioner.

The PVC is not even binding (screenshot omitted).

Here is my CephFS StorageClass (screenshot omitted; its YAML is reproduced below).

The corresponding claim (screenshot omitted).

The provisioner and other resources (screenshot omitted).

And the secrets (screenshot omitted).

I'm just not sure how to debug this.


How do I determine why the ceph PV is not being provisioned?

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: cephfs
provisioner: ceph.com/cephfs
parameters:
  monitors: togo.corp.sensis.com:6789
  adminId: admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: cephfs
  claimRoot: /pvc-volumes
```
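Since the claim screenshot did not survive, here is a minimal sketch of a claim against this class, assuming a hypothetical name, namespace, and size:

```yaml
# Hypothetical PVC against the cephfs StorageClass; name, namespace, and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-claim
  namespace: cephfs
spec:
  storageClassName: cephfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
```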

davesargrad commented 4 years ago

The YAML I use to create the RBAC resources (truncated):

```yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cephfs-provisioner
  namespace: cephfs
rules:
```
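The rules were cut off above; as a rough sketch (not a verbatim copy of this project's deploy manifests), the ClusterRole for an external provisioner typically grants something like:

```yaml
# Hypothetical sketch of the permissions an external provisioner usually needs.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cephfs-provisioner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
```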

wongma7 commented 4 years ago

Your node needs to have the rbd binary installed and in $PATH.

davesargrad commented 4 years ago

> Your node needs to have the rbd binary installed and in $PATH.

Hi @wongma7 Thanks for the reply.

Which node are you referring to? Are you saying that the K8s worker node needs this? That seems a bit odd, since it places an RBD-specific configuration burden on every worker node.

On a different note, can you take a look at my comment above ("Since I don't know how to solve the above problem yet, I am trying a CephFS provisioner instead of an RBD provisioner")?

I am trying CephFS as an alternative to RBD. For some reason the CephFS provisioner fails to create a persistent volume.

I was wondering how the provisioner knows which pool to use. With the RBD provisioner, I explicitly specify a pool ("pool: kube"). However, with the CephFS provisioner I only specify a claimRoot of "/pvc-volumes". It's not clear to me how this root is mapped to a Ceph resource.

davesargrad commented 4 years ago

I'll write up my question about CephFS as a separate issue; I don't want it to get lost here. I'll keep this one focused on RBD.

davesargrad commented 4 years ago

Wow. I got RBD working. On CentOS, all I needed to do was `yum install ceph-common`.

This placed the rbd binary onto the k8s worker node.

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

hellofuturecyj commented 4 years ago

Do not close this issue; it has not been fixed yet.

kifeo commented 4 years ago

Hi, I had the same issue. On Debian, installing the ceph-common package on the node resolved the issue.