seaweedfs / seaweedfs-csi-driver

SeaweedFS CSI Driver https://github.com/seaweedfs/seaweedfs
Apache License 2.0

Timeout waiting for mount Kubernetes pvc #35

Open IxDay opened 3 years ago

IxDay commented 3 years ago

Installed the seaweedfs operator with cert-manager: https://github.com/seaweedfs/seaweedfs-operator#installation
Installed the CSI driver: https://github.com/seaweedfs/seaweedfs-csi-driver

Running a simple cluster with the following setup for the operator:

apiVersion: seaweed.seaweedfs.com/v1
kind: Seaweed
metadata:
  name: seaweed
  namespace: seaweedfs
spec:
  # Add fields here
  image: chrislusf/seaweedfs:2.70
  volumeServerDiskCount: 1
  hostSuffix: seaweed.abcdefg.com
  master:
    replicas: 1
    volumeSizeLimitMB: 1024
  volume:
    replicas: 1
    requests:
      storage: 2Gi
  filer:
    replicas: 1
    config: |
      [leveldb2]
      enabled = true
      dir = "/data/filerldb2"

All pods for this setup are running fine. However, when I try to apply the sample config:

kubectl apply -f deploy/kubernetes/sample-seaweedfs-pvc.yaml
kubectl get pvc
kubectl apply -f deploy/kubernetes/sample-busybox-pod.yaml
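
For reference, the sample manifest boils down to a StorageClass plus a PVC along these lines (a sketch; the names and sizes match the kubectl output below, but the provisioner name is my assumption of the driver name — check deploy/kubernetes/sample-seaweedfs-pvc.yaml for the exact content):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: seaweedfs-storage
provisioner: seaweedfs-csi-driver   # assumed driver name; verify against the deployed CSIDriver object
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: seaweedfs-csi-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: seaweedfs-storage
  resources:
    requests:
      storage: 1Gi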

PVC is properly created:

NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
seaweedfs-csi-pvc   Bound    pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6   1Gi        RWO            seaweedfs-storage   26m

But mounting fails with the following error (logs from the csi-seaweedfs-plugin container of ds/csi-seaweedfs-node):

I0928 07:36:49     1 nodeserver.go:28] NodePublishVolume volume pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 to /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount
I0928 07:36:49     1 mounter_seaweedfs.go:29] mounting seaweed-filer.seaweedfs.svc.cluster.local:8888 pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 to /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount
I0928 07:36:49     1 mounter.go:29] Mounting fuse with command: weed and args: [mount -dirAutoCreate=true -umask=000 -dir=/var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount -collection=pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 -filer=seaweed-filer.seaweedfs.svc.cluster.local:8888 -filer.path=/buckets/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 -cacheCapacityMB=1000 -concurrentWriters=32 -cacheDir=/tmp]
E0928 07:37:00     1 mounter_seaweedfs.go:68] mount seaweed-filer.seaweedfs.svc.cluster.local:8888 pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 to /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount: Timeout waiting for mount
E0928 07:37:00     1 utils.go:56] GRPC error: rpc error: code = Internal desc = Timeout waiting for mount
chrislusf commented 3 years ago
  1. volumeSizeLimitMB=1024 seems too big for 2GB of disk space. Reduce it so that more volumes can be created (see the sketch after this list).
  2. Try running weed mount -dirAutoCreate=true -umask=000 -dir=/var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount -collection=pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 -filer=seaweed-filer.seaweedfs.svc.cluster.local:8888 -filer.path=/buckets/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 -cacheCapacityMB=1000 -concurrentWriters=32 -cacheDir=/tmp directly from the command line, and see what the problem is.
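
For the first point, a minimal sketch of the adjusted master spec (256 is an arbitrary example value, assuming the 2Gi volume request above; any value well below the disk size works):

  master:
    replicas: 1
    volumeSizeLimitMB: 256  # smaller logical volumes, so several fit in the 2Gi request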
IxDay commented 3 years ago

I got the following error:

weed mount -dirAutoCreate=true -umask=000 -dir="/var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount" -collection=pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 -filer=seaweed-filer.seaweedfs.svc.cluster.local:8888 -filer.path=/buckets/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 -cacheCapacityMB=1000 -concurrentWriters=32 -cacheDir=/tmp

E0928 16:02:07  2937 mount_std.go:104] failed to retrieve inode for parent directory of /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6/mount: stat /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6: no such file or directory

And the missing directory is actually pvc-43fb613e-cc87-4c79-b43e-0d7c8fd691a6 itself (everything up to /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/ exists)

If I ls the directory, there is nothing there:

/ # ls /var/lib/kubelet/pods/02b85831-522b-4f19-9d89-621e8f86e938/volumes/kubernetes.io~csi/
/ #
chrislusf commented 3 years ago

"dirAutoCreate" should create that folder. Maybe it's because of the "~" sign in the path?

IxDay commented 3 years ago

I am not well versed in CSI internals. I used a fairly default config and I don't know where this ~ comes from. Any idea which config I should tune?

chrislusf commented 3 years ago

How did you set up the K8s cluster? I am using Kind.

IxDay commented 3 years ago

I am using kubeadm with a fairly standard config (nothing fancy)

IxDay commented 3 years ago

I am getting back to you because I am still trying to troubleshoot what is going on. During the investigation I noticed a handful of errors in the logs, like this one:

I1028 04:36:14     1 masterclient.go:119] master masterClient failed to receive from seaweed-master-0.seaweed-master-peer.seaweedfs:9333: rpc error: code = Unavailable desc = closing transport due to: connection error: desc = "error reading from server: EOF", received prior goaway: code: NO_ERROR

I decided to check the DNS resolution of seaweed-master-0.seaweed-master-peer.seaweedfs, which fails. Looking at how Kubernetes assigns pod addresses, I see no mention of a custom template. Looking into the controller code, I found the template you are using and how you build addresses. My question may be dumb, but how is a pod supposed to resolve this custom template? I did not find any other mention of it within the code or config. Can you point me to this part of the logic?

chrislusf commented 3 years ago

I vaguely remember that in k8s, DNS resolution depends on ordering. A server started earlier can be visible to containers started later.

IxDay commented 3 years ago

Unless we are doing something with the Kubernetes DNS somewhere within the code (if that is the case, can you point me to it?), I don't think this kind of record can ever be resolved by Kubernetes.

chrislusf commented 3 years ago

No code to deal with DNS.

The master nodes are started as a StatefulSet: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
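
For context, StatefulSet pods only get per-pod DNS records when a headless governing Service exists. A minimal sketch of the pair (the names are inferred from the address in the masterclient log above; the labels and elided fields are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: seaweed-master-peer
  namespace: seaweedfs
spec:
  clusterIP: None          # headless: publishes one DNS record per pod
  selector:
    app: seaweed-master    # label is an assumption
  ports:
    - port: 9333
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: seaweed-master
  namespace: seaweedfs
spec:
  serviceName: seaweed-master-peer   # must match the headless Service name
  replicas: 1
  # ... pod template elided ...

With that pair in place, pod 0 becomes resolvable as seaweed-master-0.seaweed-master-peer.seaweedfs.svc.cluster.local, which matches the address the masterclient was trying to reach.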

IxDay commented 3 years ago

Sorry for the confusion; it seems that I was hitting a bug in alpine/busybox's nslookup: https://github.com/docker-library/busybox/issues/48 Continuing my investigation
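
For anyone retracing this: busybox's nslookup is known to mishandle search domains and multiple nameservers, so testing from an image with a full resolver gives a truer picture. A sketch, using the image from the Kubernetes DNS-debugging docs:

kubectl run -it --rm dns-test --restart=Never \
  --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 \
  -- nslookup seaweed-master-0.seaweed-master-peer.seaweedfs.svc.cluster.local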