Infinidat / infinibox-csi-driver

Apache License 2.0

infinibox-csi-driver not found in the list of registered CSI drivers #107

Closed: cotjoey closed this issue 3 years ago

cotjoey commented 3 years ago

Hello,

I am using Kubernetes 1.20. I create a PersistentVolume (PV) and PersistentVolumeClaim (PVC). I import an existing volume using the steps at https://support.infinidat.com/hc/en-us/articles/360008917097 (Importing an existing NFS PV).

This PV is for a Postgres DB to persist data.

When I reboot my node to test a "failure" scenario, I get the following error in the events for my Postgres pod:

Warning  FailedMount       45m (x10 over 45m)    kubelet  MountVolume.SetUp failed for volume "csi-78ce24e2a4" : kubernetes.io/csi: mounter.SetUpAt failed to get CSI client: driver name infinibox-csi-driver not found in the list of registered CSI drivers

When I look at the CSI Driver list, it does exist:

$ sudo kubectl get csidriver
NAME                   ATTACHREQUIRED   PODINFOONMOUNT   MODES        AGE
infinibox-csi-driver   true             true             Persistent   12h
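
The CSIDriver object listed above is cluster-wide, while the kubelet error refers to registration on the individual node. Per-node registration is normally recorded in the CSINode object for that node, although that object can lag behind the kubelet's in-memory list right after a reboot, so this is only an approximate check. A minimal sketch, assuming the node name is held in a hypothetical NODE variable; driver_registered is an illustrative helper, not part of kubectl or of the driver:

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the driver name ($1) appears in the
# whitespace-separated list of driver names ($2).
driver_registered() {
  wanted="$1"
  for name in $2; do
    if [ "$name" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# Example use against a live cluster (NODE is an assumed variable):
#   names=$(kubectl get csinode "$NODE" -o jsonpath='{.spec.drivers[*].name}')
#   driver_registered infinibox-csi-driver "$names" || echo "not registered on $NODE"
```

If the driver name is missing for the rebooted node, the node pod on that node has probably not finished re-registering with the kubelet yet.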

I also have one storage class:

$ sudo kubectl get storageclass
NAME                           PROVISIONER            RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ibox-nfs-storageclass-retain   infinibox-csi-driver   Retain          Immediate           true                   12h

There is also a DaemonSet that the helm install command installs and runs on each node:

NAMESPACE                NAME                               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                             AGE
infi                     csi-infinibox-node                 8         8         8       8            8           <none>                                                                    12h

From the message in the error log, it seems that the CSI Driver needs to be a DaemonSet running on all nodes in the cluster, but it's not what I see from the pod listing (I have 8 nodes):

infi                     pod/csi-infinibox-driver-0                     5/5     Running       10         127m
infi                     pod/csi-infinibox-node-2j7ff                   2/2     Running       6          127m
infi                     pod/csi-infinibox-node-4g9tv                   2/2     Running       4          127m
infi                     pod/csi-infinibox-node-4n5kt                   2/2     Running       4          127m
infi                     pod/csi-infinibox-node-fq9fz                   2/2     Running       4          127m
infi                     pod/csi-infinibox-node-fqzc8                   2/2     Running       4          127m
infi                     pod/csi-infinibox-node-hx9hj                   2/2     Running       4          127m
infi                     pod/csi-infinibox-node-kpgjm                   2/2     Running       4          127m
infi                     pod/csi-infinibox-node-tx5zd                   2/2     Running       4          127m

I am wondering if I am set up correctly or need additional configuration for my pods to recover after a failure using the CSI Driver.

Thank you for any help you can provide.

Joey

gadekarnitesh commented 3 years ago

Hi @cotjoey, could you please share the PV YAML file and the output of kubectl describe csidriver?

cotjoey commented 3 years ago

Hello @gadekarnitesh ,

Here is the YAML file I use to import the existing PV (I use envsubst to replace the $VAR values):

---
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: infinibox-csi-driver
  name: $ARGO_POSTGRES_PV_NAME
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: $ARGO_POSTGRES_PV_STORAGE
  csi:
    controllerExpandSecretRef:
      name: infinibox-creds
      namespace: infi
    controllerPublishSecretRef:
      name: infinibox-creds
      namespace: infi
    driver: infinibox-csi-driver
    nodePublishSecretRef:
      name: infinibox-creds
      namespace: infi
    nodeStageSecretRef:
      name: infinibox-creds
      namespace: infi
    volumeAttributes:
      ipAddress: $ARGO_POSTGRES_PV_IP
      volPathd: $ARGO_POSTGRES_PV_VOLPATHD
      storage_protocol: $ARGO_POSTGRES_PV_STORAGEPROTOCOL
      exportID: "$ARGO_POSTGRES_PV_EXPORTID"
    volumeHandle: $ARGO_POSTGRES_PV_VOLUMEHANDLE
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ibox-nfs-storageclass-retain
  volumeMode: Filesystem
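
Because envsubst silently replaces an unset variable with an empty string, it can help to confirm that every variable used in the template is set before rendering it. A minimal sketch, assuming the template is saved under the hypothetical name pv.yaml; require_vars is an illustrative helper, and the variable names are the ones used in the manifest above:

```shell
#!/bin/sh
# Hypothetical helper: returns non-zero and reports each name whose
# variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use (pv.yaml is an assumed file name):
#   require_vars ARGO_POSTGRES_PV_NAME ARGO_POSTGRES_PV_STORAGE \
#     ARGO_POSTGRES_PV_IP ARGO_POSTGRES_PV_VOLPATHD \
#     ARGO_POSTGRES_PV_STORAGEPROTOCOL ARGO_POSTGRES_PV_EXPORTID \
#     ARGO_POSTGRES_PV_VOLUMEHANDLE \
#     && envsubst < pv.yaml | kubectl apply -f -
```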

Output of kubectl describe on the CSIDriver:

$ sudo kubectl describe csidriver infinibox-csi-driver
Warning: storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
Name:         infinibox-csi-driver
Namespace:    
Labels:       app.kubernetes.io/managed-by=Helm
Annotations:  meta.helm.sh/release-name: csi-infinibox
              meta.helm.sh/release-namespace: infi
API Version:  storage.k8s.io/v1beta1
Kind:         CSIDriver
Metadata:
  Creation Timestamp:  2021-10-19T13:57:54Z
  Managed Fields:
    API Version:  storage.k8s.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:meta.helm.sh/release-name:
          f:meta.helm.sh/release-namespace:
        f:labels:
          .:
          f:app.kubernetes.io/managed-by:
      f:spec:
        f:attachRequired:
        f:fsGroupPolicy:
        f:podInfoOnMount:
        f:volumeLifecycleModes:
    Manager:         helm
    Operation:       Update
    Time:            2021-10-19T13:57:54Z
  Resource Version:  254888
  UID:               4686b99e-2cbd-431f-945a-cd6e5f037648
Spec:
  Attach Required:    true
  Fs Group Policy:    ReadWriteOnceWithFSType
  Pod Info On Mount:  true
  Volume Lifecycle Modes:
    Persistent
Events:  <none>

Thank you, Joey

cotjoey commented 3 years ago

I tried to change my Deployment to a StatefulSet and I get the same error after I reboot the node that hosted the pod bound to the PersistentVolume (csi-fca490290c).

  Warning  FailedMount             112s (x2 over 113s)   kubelet                  MountVolume.SetUp failed for volume "csi-fca490290c" : kubernetes.io/csi: mounter.SetUpAt failed to get CSI client: driver name infinibox-csi-driver not found in the list of registered CSI drivers
  Warning  FailedMount             37s (x9 over 110s)    kubelet                  MountVolume.SetUp failed for volume "csi-fca490290c" : rpc error: code = Internal desc = Failed to mount target path '/var/lib/kubelet/pods/661737e8-1167-41e3-9a33-e895909f6f1b/volumes/kubernetes.io~csi/csi-fca490290c/mount': mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o hard,rsize=1024,wsize=1024 172.25.10.242:/csi-fca490290c /var/lib/kubelet/pods/661737e8-1167-41e3-9a33-e895909f6f1b/volumes/kubernetes.io~csi/csi-fca490290c/mount
Output: mount.nfs: Protocol not supported

cotjoey commented 3 years ago

Additional outputs:

$ sudo kubectl logs csi-infinibox-node-pqpp2 -n infi registrar
I1020 20:02:55.066211       1 main.go:110] Version: v1.3.0-0-g6e9fff3e
I1020 20:02:55.066274       1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I1020 20:02:55.066292       1 connection.go:151] Connecting to unix:///csi/csi.sock
I1020 20:02:55.066678       1 main.go:127] Calling CSI driver to discover driver name
I1020 20:02:55.066692       1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I1020 20:02:55.066696       1 connection.go:181] GRPC request: {}
I1020 20:02:55.071178       1 connection.go:183] GRPC response: {"name":"infinibox-csi-driver"}
I1020 20:02:55.071561       1 connection.go:184] GRPC error: <nil>
I1020 20:02:55.071568       1 main.go:137] CSI driver name: "infinibox-csi-driver"
I1020 20:02:55.071623       1 node_register.go:51] Starting Registration Server at: /registration/infinibox-csi-driver-reg.sock
I1020 20:02:55.071766       1 node_register.go:60] Registration Server started at: /registration/infinibox-csi-driver-reg.sock
I1020 20:02:55.377013       1 main.go:77] Received GetInfo call: &InfoRequest{}
I1020 20:02:56.424857       1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
I1020 20:02:59.670758       1 main.go:77] Received GetInfo call: &InfoRequest{}
I1020 20:03:02.836785       1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
$ sudo kubectl logs csi-infinibox-node-pqpp2 -n infi driver
I1020 20:02:54.474347       1 main.go:51] Infinidat CSI Driver is Starting
I1020 20:02:54.474382       1 main.go:52] Log level: info
I1020 20:02:54.474388       1 main.go:60] Configuration:
I1020 20:02:54.474391       1 main.go:61]   ALLOW_XFS_UUID_REGENERATION: yes
time="2021-10-20T20:02:54Z" level=info msg="identity service registered"
time="2021-10-20T20:02:54Z" level=info msg="node service registered"
time="2021-10-20T20:02:54Z" level=info msg=serving endpoint="unix:///var/lib/kubelet/plugins/infinibox.infinidat.com/csi.sock"
I1020 20:02:55.378194       1 node.go:100] Setting NodeId 172.27.21.66
I1020 20:02:59.671648       1 node.go:100] Setting NodeId 172.27.21.66
I1020 20:03:01.424860       1 node.go:33] NodePublishVolume called with volume ID '415591479$$nfs'
I1020 20:03:01.424891       1 storageservice.go:152] verifying api client
I1020 20:03:01.424904       1 storageservice.go:159] api client is verified.
time="2021-10-20T20:03:01Z" level=info msg="Log level set to info"
time="2021-10-20T20:03:01Z" level=info msg="buildCommonService commonservice configuration done."
E1020 20:03:01.509684       1 mount_linux.go:150] Mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o hard,rsize=1024,wsize=1024 172.25.10.241:/csi-78ce24e2a4 /var/lib/kubelet/pods/7186262e-42c0-475a-b9b5-76b1c9e39f67/volumes/kubernetes.io~csi/csi-78ce24e2a4/mount
Output: mount.nfs: Protocol not supported

E1020 20:03:01.509756       1 nfsnode.go:85] Failed to mount source path '172.25.10.241:/csi-78ce24e2a4' : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o hard,rsize=1024,wsize=1024 172.25.10.241:/csi-78ce24e2a4 /var/lib/kubelet/pods/7186262e-42c0-475a-b9b5-76b1c9e39f67/volumes/kubernetes.io~csi/csi-78ce24e2a4/mount
Output: mount.nfs: Protocol not supported
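
The logged mount options do not pin an NFS version, so one thing that may be worth ruling out is version negotiation between the node and the export. This is a guess at a diagnostic step, not a confirmed cause. A minimal sketch that pulls the -o value out of a "Mounting arguments:" log line and reports whether a version option is present; mount_options and has_nfs_version are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and prints the -o value
# from the first "Mounting arguments:" line.
mount_options() {
  sed -n 's/^Mounting arguments: .*-o \([^ ]*\) .*$/\1/p' | head -n 1
}

# Hypothetical helper: succeeds when the comma-separated option list ($1)
# contains a vers= or nfsvers= entry.
has_nfs_version() {
  case ",$1," in
    *,vers=*|*,nfsvers=*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (pod name taken from the logs above):
#   opts=$(kubectl logs csi-infinibox-node-pqpp2 -n infi driver | mount_options)
#   has_nfs_version "$opts" || echo "no NFS version pinned in: $opts"
```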

ekaulberg commented 3 years ago

Hey there @cotjoey, since I assume you are an Infinidat customer, let's take this through our standard support channels, if you don't mind, rather than handling it via community support. Can you please contact us at support@infinidat.com or open a ticket on our support site, referencing this issue?

cotjoey commented 3 years ago

@ekaulberg , yes. I just did. Thank you for the reminder.

ekaulberg commented 3 years ago

For future reference: during the Infinidat support process, @cotjoey switched to RHEL 8.0 + Docker 20.10.8 and the issue disappeared. We took an action item to investigate whether there are any underlying issues within our control to address in a future release, but the current issue has been mitigated.