digitalocean / csi-digitalocean

A Container Storage Interface (CSI) Driver for DigitalOcean Block Storage
Apache License 2.0
577 stars 107 forks

timeout expired waiting for volumes to attach or mount for pod #124

Closed ap1969 closed 5 years ago

ap1969 commented 5 years ago

What did you do? (required. The issue will be closed when not provided.)

Tried to replicate the example to attach an existing volume to a deployment.

doctl compute volume list

ID                                      Name              Size     Region    Filesystem Type    Filesystem Label    Droplet IDs
911983ce-240a-11e9-9975-XXXXXXXXXX    volume-lon1-01    5 GiB    lon1      ext4                                   <nil>

The volume is not attached to any droplet.

pv.yml:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: volume-lon1-01
  annotations:
    pv.kubernetes.io/provisioned-by: dobs.csi.digitalocean.com
spec:
  storageClassName: do-block-storage
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: dobs.csi.digitalocean.com
    fsType: ext4
    volumeHandle: 911983ce-240a-11e9-9975-XXXXXXXXXX
    volumeAttributes:
      com.digitalocean.csi/noformat: "true"

kubectl apply -f pv.yml

kubectl get pv

NAME             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS       REASON   AGE
volume-lon1-01   5Gi        RWO            Retain           Available           do-block-storage            5s

pvc.yml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-deployment-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: do-block-storage

kubectl apply -f pvc.yml

kubectl get pvc

NAME                 STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS       AGE
csi-deployment-pvc   Bound    volume-lon1-01   5Gi        RWO            do-block-storage   5s
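One note on the binding step: nothing in the claim above ties it to this particular PV, so it bound to volume-lon1-01 only because that was the only Available PV in the do-block-storage class. To pin the claim to the pre-created volume explicitly, `spec.volumeName` can be set. A sketch of the same pvc.yml with that single addition:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-deployment-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: do-block-storage
  # Bind to the pre-created PV by name instead of relying on class/size matching.
  volumeName: volume-lon1-01
```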

deployment.yml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-csi-app
spec:
  selector:
    matchLabels:
      app: my-csi-app
  replicas: 1
  template:
    metadata:
      labels:
        app: my-csi-app
    spec:
      containers:
        - name: my-frontend
          image: busybox
          volumeMounts:
          - mountPath: "/data"
            name: my-do-volume
          command: [ "sleep", "1000000" ]
      volumes:
        - name: my-do-volume
          persistentVolumeClaim:
            claimName: csi-deployment-pvc

kubectl apply -f deployment.yml

kubectl get deployment

NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
my-csi-app   1         1         1            0           2m

kubectl describe deployment my-csi-app

Name:                   my-csi-app
Namespace:              default
CreationTimestamp:      Tue, 29 Jan 2019 22:12:00 +0000
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"my-csi-app","namespace":"default"},"spec":{"replicas":1,"...
Selector:               app=my-csi-app
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=my-csi-app
  Containers:
   my-frontend:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      sleep
      1000000
    Environment:  <none>
    Mounts:
      /data from my-do-volume (rw)
  Volumes:
   my-do-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  csi-deployment-pvc
    ReadOnly:   false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    True    ReplicaSetUpdated
OldReplicaSets:  <none>
NewReplicaSet:   my-csi-app-74dc5d9cdc (1/1 replicas created)
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  5m43s  deployment-controller  Scaled up replica set my-csi-app-74dc5d9cdc to 1

kubectl get pods

NAME                          READY   STATUS              RESTARTS   AGE
my-csi-app-74dc5d9cdc-4sjfh   0/1     ContainerCreating   0          3m

kubectl describe pod my-csi-app-74dc5d9cdc-4sjfh

Name:               my-csi-app-74dc5d9cdc-4sjfh
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               notifium-rancheragent-0-all/206.189.118.63
Start Time:         Tue, 29 Jan 2019 22:12:00 +0000
Labels:             app=my-csi-app
                    pod-template-hash=3087185787
Annotations:        <none>
Status:             Pending
IP:
Controlled By:      ReplicaSet/my-csi-app-74dc5d9cdc
Containers:
  my-frontend:
    Container ID:
    Image:         busybox
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      1000000
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data from my-do-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-24vzn (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  my-do-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  csi-deployment-pvc
    ReadOnly:   false
  default-token-24vzn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-24vzn
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason              Age                    From                                  Message
  ----     ------              ----                   ----                                  -------
  Normal   Scheduled           7m6s                   default-scheduler                     Successfully assigned default/my-csi-app-74dc5d9cdc-4sjfh to notifium-rancheragent-0-all
  Warning  FailedAttachVolume  4m34s (x2 over 6m51s)  attachdetach-controller               AttachVolume.Attach failed for volume "volume-lon1-01" : attachment timeout for volume 911983ce-240a-11e9-9975-0a58ac14c02a
  Warning  FailedMount         2m48s (x2 over 5m3s)   kubelet, notifium-rancheragent-0-all  Unable to mount volumes for pod "my-csi-app-74dc5d9cdc-4sjfh_default(e682b141-2412-11e9-8874-1283109c8102)": timeout expired waiting for volumes to attach or mount for pod "default"/"my-csi-app-74dc5d9cdc-4sjfh". list of unmounted volumes=[my-do-volume]. list of unattached volumes=[my-do-volume default-token-24vzn]

The volume is still not attached to any droplet.
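For anyone debugging a stuck attach like this, two things usually narrow it down: the external-attacher sidecar logs and the VolumeAttachment objects. This is a hedged sketch; the pod and container names assume the stock csi-digitalocean manifests, and the listing below is a hypothetical sample, not output from this cluster.

```shell
# On a live cluster (not runnable here) one would check:
#   kubectl -n kube-system logs csi-do-controller-0 -c csi-attacher
#   kubectl get volumeattachment
# The VolumeAttachment listing shows whether the attach ever succeeded from
# the controller's point of view. Below, the same column check runs against
# a saved (hypothetical) sample of that listing.
cat <<'EOF' > /tmp/volumeattachments.txt
NAME       ATTACHER                    PV               NODE                          ATTACHED   AGE
csi-0d1c   dobs.csi.digitalocean.com   volume-lon1-01   notifium-rancheragent-0-all   false      7m
EOF
# Print the PV name and attachment state for each entry.
state=$(awk 'NR>1 {print $3, $5}' /tmp/volumeattachments.txt)
echo "$state"
```

If ATTACHED stays false while the events show `attachment timeout`, the controller plugin never completed the attach call against the DigitalOcean API, which points at the driver/sidecar rather than the kubelet.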

What did you expect to happen?

The deployment would launch successfully and the volume would be mounted onto the pod's filesystem

Configuration (MUST fill this out):

As above

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.6", GitCommit:"b1d75deca493a24a2f87eb1efde1a569e52fc8d9", GitTreeState:"clean", BuildDate:"2018-12-16T04:30:10Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

welcome[bot] commented 5 years ago

Thank you for creating the issue! One of our team members will get back to you shortly with additional information.

ap1969 commented 5 years ago

How do I share the gist with you, please?

ap1969 commented 5 years ago

I tried again with a clean cluster not managed by Rancher, which therefore runs v1.0.0 of the plugin, and the volume attached fine. So I think this is an issue with v0.2.0.

ChSch3000 commented 5 years ago

I'm having the same issue. I'm using DigitalOcean managed Kubernetes. After I recycled a Kubernetes node, the CSI driver was unable to attach the volume.

doctl compute volume list output:

ID                                      Name                                        Size       Region    Filesystem Type    Filesystem Label    Droplet IDs
0badc4dd-459f-11e9-985e-0a58ac14d087    pvc-08ff212a-459f-11e9-9a02-def1f971d450    8 GiB      fra1      ext4
355986e7-3522-11e9-862a-0a58ac14d0bb    pvc-2c6ceb84-3522-11e9-9a02-def1f971d450    250 GiB    fra1      ext4
6ab8e875-3510-11e9-985e-0a58ac14d087    pvc-41a4c31b-350e-11e9-9a02-def1f971d450    8 GiB      fra1      ext4                                   [133696412]
52abc650-350d-11e9-985e-0a58ac14d087    pvc-4959f67f-350d-11e9-9a02-def1f971d450    12 GiB     fra1      ext4
51206c7d-351c-11e9-985e-0a58ac14d087    pvc-50094c5b-351c-11e9-9a02-def1f971d450    12 GiB     fra1      ext4                                   [133696412]
61dfbf21-463b-11e9-985e-0a58ac14d087    pvc-ca0c5b40-463a-11e9-9a02-def1f971d450    2 GiB      fra1      ext4                                   [133696411]
df5e953c-463a-11e9-985e-0a58ac14d087    pvc-ca0e515f-463a-11e9-9a02-def1f971d450    8 GiB      fra1      ext4                                   [133696411]
e0611c39-459f-11e9-985e-0a58ac14d087    pvc-df9decfd-459f-11e9-9a02-def1f971d450    10 GiB     fra1      ext4                                   [133696412]

And here the events from the pod:

Events:
  Type     Reason              Age   From                            Message
  ----     ------              ----  ----                            -------
  Normal   Scheduled           2m    default-scheduler               Successfully assigned ioneaccess/pros-79694f95db-d9zsk to charming-pasteur-uof4
  Warning  FailedAttachVolume  1m    attachdetach-controller         AttachVolume.Attach failed for volume "pvc-2c6ceb84-3522-11e9-9a02-def1f971d450" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  FailedMount         13s   kubelet, charming-pasteur-uof4  Unable to mount volumes for pod "pros-79694f95db-d9zsk_ioneaccess(61e79900-4830-11e9-a261-229fa05727f2)": timeout expired waiting for volumes to attach or mount for pod "ioneaccess"/"pros-79694f95db-d9zsk". list of unmounted volumes=[pros]. list of unattached volumes=[pros default-token-vd2cf]

Any ideas?

timoreimann commented 5 years ago

@ChSch3000 it sounds like your issue is a different one since you're running into problems after recycling nodes, which is not what the OP reported AFAIU.

All but the latest version of the external-attacher sidecar that we provide had a bug which caused volumes to not reattach successfully under certain conditions, most notably node recycles. If this is still a problem, please try a more recent DOKS patch version for your release.
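If it helps anyone hitting this later: whether a cluster already runs a fixed sidecar can be checked by listing the controller pod's container images. This is a sketch; the pod name and image tags are assumptions based on the stock csi-digitalocean manifests, not output from an actual cluster.

```shell
# On a live cluster one would run (not runnable here):
#   kubectl -n kube-system get pod csi-do-controller-0 \
#     -o jsonpath='{range .spec.containers[*]}{.image}{"\n"}{end}'
# Against a hypothetical sample of that output, pull out the attacher tag:
images='quay.io/k8scsi/csi-attacher:v1.1.0
digitalocean/do-csi-plugin:v1.1.0'
attacher_tag=$(printf '%s\n' "$images" | grep csi-attacher | cut -d: -f2)
echo "csi-attacher version: $attacher_tag"
```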

I'll be closing the issue to avoid conflating presumably unrelated problems. Feel free to file a new bug report if the matter persists.

Thanks!