SynologyOpenSource / synology-csi

Apache License 2.0

Volume directory on host quickly disappears after mount #4

Open phihos opened 2 years ago

phihos commented 2 years ago

Hi,

The volume on my DiskStation (2x 920+ in HA) is provisioned correctly and attached as a block device (/dev/sdd in my case) on the host system. But it is expected to be mounted at /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d1e722ba-35e7-4222-a797-1e66eb40c755/globalmount, which does not exist. Only /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/pv exists.

On further investigation I found out that this directory briefly appears and then disappears (it exists for roughly 1 s) during this log entry:

[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:52Z [INFO] [driver/utils.go:104] GRPC call: /csi.v1.Node/NodeGetCapabilities 
[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:52Z [INFO] [driver/utils.go:105] GRPC request: {} 
[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:52Z [INFO] [driver/utils.go:110] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":3}}}]} 
[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:52Z [INFO] [driver/utils.go:104] GRPC call: /csi.v1.Node/NodeStageVolume 
[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:52Z [INFO] [driver/utils.go:105] GRPC request: {"staging_target_path":"/var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d1e722ba-35e7-4222-a797-1e66eb40c755/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"dsm":"192.168.2.210","storage.kubernetes.io/csiProvisionerIdentity":"1633291144215-8081-csi.san.synology.com"},"volume_id":"080af020-d433-4ea3-aa2a-1773a9132e3f"} 
[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:52Z [INFO] [driver/initiator.go:109] Session[iqn.2000-01.com.synology:Hossnercloud-HA.pvc-d1e722ba-35e7-4222-a797-1e66eb40c755] already exists. 
[synology-csi-node-ppzwq csi-plugin] 2021-10-03T21:32:53Z [ERROR] [driver/utils.go:108] GRPC error: rpc error: code = Internal desc = stat /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d1e722ba-35e7-4222-a797-1e66eb40c755/globalmount: no such file or directory 

At this point I do not know how to debug this further. Possibly a hidden error during mounting.

I am on MicroK8s 1.21, by the way.

Edit: The block device should already be formatted with ext4 at this point, but it is not. This is probably the cause of the mount failing. What might have skipped formatting during provisioning?

Edit 2: Manually formatting /dev/sdd with ext4 did not help the csi-plugin with mounting, but I was able to mount it manually. Kubernetes does not seem to recognize this mount, though.
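A quick way to confirm whether the LUN ever received a filesystem is to probe the device directly. This is a sketch; /dev/sdd is the device name from the report above and must be adjusted to your system:

```shell
# Inspect the attached block device before blaming the mount step.
lsblk -f /dev/sdd                  # FSTYPE column stays empty on an unformatted LUN
blkid -o value -s TYPE /dev/sdd    # prints "ext4" once formatted, nothing otherwise
```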

chihyuwu commented 2 years ago

Hi,

On further investigation I found out that this directory briefly appears and then disappears (it exists for roughly 1 s) during this log entry

Actually, this directory /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d1e722ba-35e7-4222-a797-1e66eb40c755/globalmount never appears during NodeStageVolume. That is odd, because this directory should be created by Kubernetes before the CSI driver performs NodeStageVolume.

Before formatting and mounting, the CSI driver checks whether staging_target_path is already a mount point, but in your case it could not find the path at all. So the error you encountered might be caused not by mounting, but by the mount-point check: https://github.com/SynologyOpenSource/synology-csi/blob/dc05a795b79b911ec5882c3c837a7779cf3576a8/pkg/driver/nodeserver.go#L172

I'm not sure whether these will help with further debugging:

  1. Check the status of the pod you created, using kubectl describe.
  2. Check the logs of csi-attacher and the other containers in the csi-controller pod.
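Those checks could look like the following. The namespace and pod/container names are assumptions based on a default deployment; substitute the names from your cluster:

```shell
# "my-app-pod" is a placeholder for the pod that uses the PVC.
kubectl describe pod my-app-pod
# Find the actual controller pod name, then read its sidecar logs.
kubectl -n synology-csi get pods
kubectl -n synology-csi logs synology-csi-controller-0 -c csi-attacher
kubectl -n synology-csi logs synology-csi-controller-0 -c csi-provisioner
```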

jhueppauff commented 2 years ago

Did you get this working? I am facing the same issue with microk8s.

chihyuwu commented 2 years ago

Hi! Try changing the value of the kubelet-dir mountPath from /var/lib/kubelet to /var/snap/microk8s/common/var/lib/kubelet, then reinstall the CSI driver (./scripts/uninstall.sh; ./scripts/deploy.sh install -b). https://github.com/SynologyOpenSource/synology-csi/blob/fc3359223fe51a13bcfa5a7cabbf59611bbeb901/deploy/kubernetes/v1.20/node.yml#L103-L106
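If you'd rather not edit every occurrence by hand, a one-liner like this rewrites all kubelet paths in the node manifest (a sketch; the file path is from the repo layout, and you should review the resulting diff before redeploying):

```shell
# Rewrite every /var/lib/kubelet path in the stock manifest to the MicroK8s
# location; sed keeps a .bak copy of the original file.
sed -i.bak 's|/var/lib/kubelet|/var/snap/microk8s/common/var/lib/kubelet|g' \
  deploy/kubernetes/v1.20/node.yml
```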

We're still working on providing an easier configuration method to make the CSI driver compatible with more Kubernetes distributions and other container orchestrators.

jhueppauff commented 2 years ago

This is my working node.yml file:

---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: synology-csi-node
  namespace: synology-csi
spec:
  selector:
    matchLabels:
      app: synology-csi-node
  template:
    metadata:
      labels:
        app: synology-csi-node
    spec:
      serviceAccount: csi-node-sa
      hostNetwork: true
      containers:
        - name: csi-driver-registrar
          securityContext:
            privileged: true
          imagePullPolicy: Always
          image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.3.0
          args:
            - --v=5
            - --csi-address=$(ADDRESS)                         # the csi socket path inside the pod
            - --kubelet-registration-path=$(REGISTRATION_PATH) # the csi socket path on the host node
          env:
            - name: ADDRESS
              value: /csi/csi.sock
            - name: REGISTRATION_PATH
              value: /var/snap/microk8s/common/var/lib/kubelet/plugins/csi.synology.com/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
        - name: csi-plugin
          securityContext:
            privileged: true
          imagePullPolicy: IfNotPresent
          image: synology/synology-csi:v1.1.0
          args:
            - --nodeid=$(KUBE_NODE_NAME)
            - --endpoint=$(CSI_ENDPOINT)
            - --client-info
            - /etc/synology/client-info.yml
            - --log-level=info
          env:
            - name: CSI_ENDPOINT
              value: unix://csi/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: kubelet-dir
              mountPath: /var/snap/microk8s/common/var/lib/kubelet
              mountPropagation: "Bidirectional"
            - name: plugin-dir
              mountPath: /csi
            - name: client-info
              mountPath: /etc/synology
              readOnly: true
            - name: host-root
              mountPath: /host
            - name: device-dir
              mountPath: /dev
      volumes:
        - name: kubelet-dir
          hostPath:
            path: /var/snap/microk8s/common/var/lib/kubelet
            type: Directory
        - name: plugin-dir
          hostPath:
            path: /var/snap/microk8s/common/var/lib/kubelet/plugins/csi.synology.com/
            type: DirectoryOrCreate
        - name: registration-dir
          hostPath:
            path: /var/snap/microk8s/common/var/lib/kubelet/plugins_registry
            type: Directory
        - name: client-info
          secret:
            secretName: client-info-secret
        - name: host-root
          hostPath:
            path: /
            type: Directory
        - name: device-dir
          hostPath:
            path: /dev
            type: Directory

jhughes2112 commented 2 months ago

I'm having a similar problem with k0s, because it also does not use the standard /var/lib/kubelet root folder; its kubelet root is /var/lib/k0s/kubelet. I have symlinked that from /var/lib/kubelet and have also tried changing the hostPath declarations so either path works. For some reason, neither solved my problem with mounting the volume.

2024-07-01T19:03:30Z [ERROR] [driver/utils.go:108] GRPC error: rpc error: code = Internal desc = stat /var/lib/k0s/kubelet/plugins/kubernetes.io/csi/csi.san.synology.com/5ff8af875f7f2e4d721e952230420933db1c42ee5f43c50484fc7119175cbff2/globalmount: no such file or directory

I have verified that the PVC created the iSCSI target, and I have manually formatted it, mounted it (to /mnt/test, not the proper location), and created a file on that volume on the same node that is running this pod. So I know the filesystem formatting and mounting should work. But this is the only error I get from the csi-plugin pod. For what it's worth, I have also run find across all the folders to check whether this volume is mounted somewhere else, and it isn't.

I've been at this all day. Anyone know what I should look at next?

jhughes2112 commented 2 months ago

Ahh... looking at my YAML and the one posted above, it's not just the hostPath but also the mountPath that needs the full host path updated. Working now.
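For k0s, the relevant part of node.yml would then look roughly like this (a sketch derived from the file posted above, with both the container mountPath and the host path switched to the k0s kubelet root):

```yaml
# csi-plugin container: mountPath must match the host's kubelet root.
volumeMounts:
  - name: kubelet-dir
    mountPath: /var/lib/k0s/kubelet      # not /var/lib/kubelet
    mountPropagation: "Bidirectional"
# pod volumes: hostPath must point at the same k0s location.
volumes:
  - name: kubelet-dir
    hostPath:
      path: /var/lib/k0s/kubelet
      type: Directory
# plugin-dir and registration-dir need the same prefix rewrite.
```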