Azure / ACS

Azure Container Service - Bug Tracker + Announcements

Busy azure-disks regularly fail to mount, causing K8S Pod deployments to halt. #12

Open · dbalaouras opened this issue 7 years ago

dbalaouras commented 7 years ago

I've set up Azure Container Service with Kubernetes and I use dynamic provisioning of volumes (see details below) when deploying new Pods. Quite frequently (about 10% of the time) I get the following error, which halts the deployment:

14h 1m 439 {controller-manager } Warning FailedMount Failed to attach volume "pvc-95aa8dbf-082e-11e7-af1a-000d3a2735d9" on node "k8s-agent-1da8a8df-2" with: Attach volume "clst-west-eu-dev-dynamic-pvc-95aa8dbf-082e-11e7-af1a-000d3a2735d9.vhd" to instance "k8s-agent-1DA8A8DF-2" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'clst-west-eu-dev-dynamic-pvc-f843f8fa-0663-11e7-af1a-000d3a2735d9.vhd' to VM 'k8s-agent-1DA8A8DF-2' because the disk is currently being detached. Please wait until the disk is completely detached and then try again."

The Pod deployment then halts forever, or until I delete the Pod and let the ReplicationController create a new one.

Any idea what is causing this?

Workflow

I have created the following StorageClass:

Name:       azure-disk
IsDefaultClass: No
Annotations:    <none>
Provisioner:    kubernetes.io/azure-disk
Parameters: location=westeu,skuName=Standard_LRS,storageAccount=<<storageaccount>>

The storage account does contain a blob container named vhds.
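
For context, a StorageClass with those parameters could be created with something like the following. This is a minimal sketch: the storage account name is a placeholder and the apiVersion assumes a 1.5/1.6-era cluster.

cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: azure-disk
provisioner: kubernetes.io/azure-disk
parameters:
  location: westeu
  skuName: Standard_LRS
  storageAccount: mystorageaccount    # placeholder; use the real storage account name
EOF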

When deploying a new Pod, I create a PVC that looks like this:

{
  "apiVersion": "v1",
  "kind": "PersistentVolumeClaim",
  "metadata": {
    "name": "test-deployment-pvc",
    "annotations": {
      "volume.beta.kubernetes.io/storage-class": "azure-disk"
    },
    "labels": {
      "org": "somelabel"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "1Gi"
      }
    }
  }
}

and finally use the PVC in the pods:

{
  "volumes": [
    {
      "persistentVolumeClaim": {
        "claimName": "test-deployment-pvc"
      },
      "name": "storage"
    }
  ]
}
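
For anyone debugging a stalled deployment like this, the claim and the attach/mount events can be inspected with something like the following (the pod name is a placeholder):

# confirm the claim bound to a dynamically provisioned volume
kubectl get pvc test-deployment-pvc
kubectl describe pvc test-deployment-pvc

# look for FailedAttachVolume / FailedMount events on the pod and in the cluster
kubectl describe pod <pod-name>
kubectl get events | grep -i -E 'attach|mount'
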
discordianfish commented 7 years ago

@rossedman As I said, I'm already running 1.6.6. Or is there something different between vanilla k8s 1.6.6 and that same version in ACS?

rcconsult commented 7 years ago

Hi, since yesterday all my StatefulSet deployments have failed, with pods stuck forever in the ContainerCreating step.

Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.4", GitCommit:"7243c69eb523aa4377bce883e7c0dd76b84709a1", GitTreeState:"clean", BuildDate:"2017-03-07T23:53:09Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.4", GitCommit:"7243c69eb523aa4377bce883e7c0dd76b84709a1", GitTreeState:"clean", BuildDate:"2017-03-07T23:34:32Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
# Minion OS version
$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1298.5.0
VERSION_ID=1298.5.0
BUILD_ID=2017-02-28-0013
PRETTY_NAME="Container Linux by CoreOS 1298.5.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
NAME      READY     STATUS        RESTARTS   AGE       IP        NODE
wtt-0     0/1       Terminating   0          14h       <none>    worker1
wtt-0     0/1       Terminating   0         14h       <none>    worker1
wtt-0     0/1       Pending   0         0s        <none>
wtt-0     0/1       Pending   0         0s        <none>    worker4
wtt-0     0/1       ContainerCreating   0         0s        <none>    worker4

Looking into the details, I can see:

$ kl describe po/wtt-0
Name:           wtt-0
Namespace:      default
Node:           worker4/10.240.0.7
Start Time:     Wed, 09 Aug 2017 08:07:48 +0000
Labels:         app=wtt
                version=0.1
Status:         Pending
IP:
Controllers:    StatefulSet/wtt
Containers:
  wtt:
    Container ID:
    Image:              REMOVED_FOR_SECURITY
    Image ID:
    Port:               8080/TCP
    Command:
      /opt/jboss-eap-6.3/bin/entrypoint.sh
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Liveness:           exec [uptime] delay=150s timeout=1s period=20s #success=1 #failure=3
    Readiness:          exec [cat /etc/os-release] delay=150s timeout=1s period=20s #success=1 #failure=3
    Volume Mounts:
      /opt/jboss-eap-6.3/standalone/data from hornetqd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ksz81 (ro)
    Environment Variables:
      JBOSS_HOME:       /opt/jboss-eap-6.3
      WTT_ENVIRONMENT:  dev
      LANG:             en_US.UTF-8
      PROXY_HOST:       proxy.endpoints.svc.cluster.local
      PROXY_PORT:       3128
      MY_POD_NAME:      wtt-0 (v1:metadata.name)
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  hornetqd:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hornetqd-wtt-0
    ReadOnly:   false
  default-token-ksz81:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-ksz81
QoS Class:      BestEffort
Tolerations:    <none>
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason          Message
  ---------     --------        -----   ----                    -------------   --------        ------          -------
  22m           22m             1       {default-scheduler }                    Normal          Scheduled       Successfully assigned wtt-0 to worker4
  22m           22m             1       {controller-manager }                   Warning         FailedMount     Failed to attach volume "pvc-xxx-xxx-xxx-xxx-00224801aff0" on node "worker4" with: Attach volume "epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" to instance "worker4" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=200 -- Original Error: Long running operation terminated with status 'Failed': Code="AcquireDiskLeaseFailed" Message="Failed to acquire lease while creating disk 'epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd' using blob with URI https://storageaccount.blob.core.windows.net/vhds/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd. Blob is already in use."
  20m           26s             10      {kubelet worker4}                       Warning         FailedMount     Unable to mount volumes for pod "wtt-0_default(d5497686-7cd9-11e7-823e-00224801aff0)": timeout expired waiting for volumes to attach/mount for pod "default"/"wtt-0". list of unattached/unmounted volumes=[hornetqd]
  20m           26s             10      {kubelet worker4}                       Warning         FailedSync      Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"wtt-0". list of unattached/unmounted volumes=[hornetqd]
  20m           6s              18      {kubelet worker4}                       Warning         FailedMount     MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") with: mount failed: exit status 1
Mounting command: mount
Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab

On the master I can see:

I0809 08:07:48.628108       1 event.go:217] Event(api.ObjectReference{Kind:"StatefulSet", Namespace:"default", Name:"wtt", UID:"bc72d54a-7cce-11e7-823e-00224801aff0", APIVersion:"apps", ResourceVersion:"5171705", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' pet: wtt-0
I0809 08:07:48.655914       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
I0809 08:07:48.658237       1 reconciler.go:213] Started AttachVolume for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" to node "worker4"
I0809 08:07:48.699646       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
E0809 08:07:54.405634       1 azure_storage.go:65] azure attach failed, err: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=200 -- Original Error: Long running operation terminated with status 'Failed': Code="AcquireDiskLeaseFailed" Message="Failed to acquire lease while creating disk 'epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd' using blob with URI https://storageaccount.blob.core.windows.net/vhds/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd. Blob is already in use."
I0809 08:07:54.405723       1 azure_storage.go:69] failed to acquire disk lease, try detach
I0809 08:07:57.260760       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
E0809 08:08:15.053940       1 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd\"" failed. No retries permitted until 2017-08-09 08:08:15.553920954 +0000 UTC (durationBeforeRetry 500ms). Error: Failed to attach volume "pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0" on node "worker4" with: Attach volume "epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" to instance "worker4" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=200 -- Original Error: Long running operation terminated with status 'Failed': Code="AcquireDiskLeaseFailed" Message="Failed to acquire lease while creating disk 'epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd' using blob with URI https://storageaccount.blob.core.windows.net/vhds/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd. Blob is already in use."
I0809 08:08:15.054043       1 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"wtt-0", UID:"d5497686-7cd9-11e7-823e-00224801aff0", APIVersion:"v1", ResourceVersion:"5171708", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0" on node "worker4" with: Attach volume "epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" to instance "worker4" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=200 -- Original Error: Long running operation terminated with status 'Failed': Code="AcquireDiskLeaseFailed" Message="Failed to acquire lease while creating disk 'epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd' using blob with URI https://storageaccount.blob.core.windows.net/vhds/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd. Blob is already in use."
I0809 08:08:15.618429       1 reconciler.go:178] Started DetachVolume for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" from node "worker1"
I0809 08:08:15.620681       1 operation_executor.go:754] Verified volume is safe to detach for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" (spec.Name: "pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0") from node "worker1".
I0809 08:08:27.717207       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
I0809 08:08:46.623523       1 operation_executor.go:700] DetachVolume.Detach succeeded for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" (spec.Name: "pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0") from node "worker1".
I0809 08:08:46.631090       1 reconciler.go:213] Started AttachVolume for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" to node "worker4"
W0809 08:08:51.312581       1 reflector.go:319] pkg/controller/garbagecollector/garbagecollector.go:768: watch of <nil> ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [5170833/5170334]) [5171832]
I0809 08:08:58.173855       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
I0809 08:09:07.644400       1 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0.vhd" (spec.Name: "pvc-xxxx-xxxx-xxxx-xxxx-00224801aff0") from node "worker4".
I0809 08:09:28.631241       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
I0809 08:09:59.088123       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
I0809 08:10:29.545324       1 pet_set.go:332] StatefulSet wtt blocked from scaling on pod wtt-0
I0809 08:10:30.209176       1 replication_controller.go:322] Observed updated replication controller kube-dns-v19. Desired pod count change: 2->2

On the minion where the pod was scheduled to run, I can see:

Aug 09 08:09:48 worker4 docker[1219]: I0809 08:09:48.769238    1260 reconciler.go:230] VerifyControllerAttachedVolume operation started for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0")
Aug 09 08:09:48 worker4 docker[1219]: I0809 08:09:48.772609    1260 operation_executor.go:1219] Controller successfully attached volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") devicePath: "0"
Aug 09 08:09:48 worker4 docker[1219]: I0809 08:09:48.869744    1260 reconciler.go:306] MountVolume operation started for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") to pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:48 worker4 docker[1219]: I0809 08:09:48.869784    1260 operation_executor.go:812] Entering MountVolume.WaitForAttach for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") DevicePath: "0"
Aug 09 08:09:48 worker4 kernel: ata1: soft resetting link
Aug 09 08:09:49 worker4 kernel: ata1.01: host indicates ignore ATA devices, ignored
Aug 09 08:09:49 worker4 kernel: ata1.00: host indicates ignore ATA devices, ignored
Aug 09 08:09:49 worker4 kernel: ata1: EH complete
Aug 09 08:09:49 worker4 kernel: ata2: soft resetting link
Aug 09 08:09:49 worker4 kernel: ata2: EH complete
Aug 09 08:09:50 worker4 docker[1219]: E0809 08:09:50.115762    1260 kubelet.go:1522] Unable to mount volumes for pod "wtt-0_default(d5497686-7cd9-11e7-823e-00224801aff0)": timeout expired waiting for volumes to attach/mount for pod "default"/"wtt-0". list of unattached/unmounted volumes=[hornetqd]; skipping pod
Aug 09 08:09:50 worker4 docker[1219]: E0809 08:09:50.115796    1260 pod_workers.go:184] Error syncing pod d5497686-7cd9-11e7-823e-00224801aff0, skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"wtt-0". list of unattached/unmounted volumes=[hornetqd]
Aug 09 08:09:50 worker4 docker[1219]: I0809 08:09:50.309366    1260 operation_executor.go:832] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:50 worker4 docker[1219]: E0809 08:09:50.315779    1260 mount_linux.go:119] Mount failed: exit status 1
Aug 09 08:09:50 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:50 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:50 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:50 worker4 docker[1219]: E0809 08:09:50.318993    1260 mount_linux.go:391] Could not determine if disk "" is formatted (exit status 1)
Aug 09 08:09:50 worker4 docker[1219]: E0809 08:09:50.319815    1260 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd\"" failed. No retries permitted until 2017-08-09 08:09:50.819791624 +0000 UTC (durationBeforeRetry 500ms). Error: MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") with: mount failed: exit status 1
Aug 09 08:09:50 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:50 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:50 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:50 worker4 docker[1219]: I0809 08:09:50.834858    1260 reconciler.go:306] MountVolume operation started for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") to pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:50 worker4 docker[1219]: I0809 08:09:50.834891    1260 operation_executor.go:812] Entering MountVolume.WaitForAttach for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") DevicePath: "0"
Aug 09 08:09:50 worker4 kernel: ata1: soft resetting link
Aug 09 08:09:51 worker4 kernel: ata1.01: host indicates ignore ATA devices, ignored
Aug 09 08:09:51 worker4 kernel: ata1.00: host indicates ignore ATA devices, ignored
Aug 09 08:09:51 worker4 kernel: ata1: EH complete
Aug 09 08:09:51 worker4 kernel: ata2: soft resetting link
Aug 09 08:09:51 worker4 kernel: ata2: EH complete
Aug 09 08:09:52 worker4 docker[1219]: I0809 08:09:52.201017    1260 operation_executor.go:832] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:52 worker4 docker[1219]: E0809 08:09:52.206960    1260 mount_linux.go:119] Mount failed: exit status 1
Aug 09 08:09:52 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:52 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:52 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:52 worker4 docker[1219]: E0809 08:09:52.209545    1260 mount_linux.go:391] Could not determine if disk "" is formatted (exit status 1)
Aug 09 08:09:52 worker4 docker[1219]: E0809 08:09:52.210134    1260 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd\"" failed. No retries permitted until 2017-08-09 08:09:53.209672543 +0000 UTC (durationBeforeRetry 1s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") with: mount failed: exit status 1
Aug 09 08:09:52 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:52 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:52 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:53 worker4 docker[1219]: I0809 08:09:53.238544    1260 reconciler.go:306] MountVolume operation started for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") to pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:53 worker4 kernel: ata1: soft resetting link
Aug 09 08:09:53 worker4 docker[1219]: I0809 08:09:53.238615    1260 operation_executor.go:812] Entering MountVolume.WaitForAttach for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") DevicePath: "0"
Aug 09 08:09:53 worker4 kernel: ata1.01: host indicates ignore ATA devices, ignored
Aug 09 08:09:53 worker4 kernel: ata1.00: host indicates ignore ATA devices, ignored
Aug 09 08:09:53 worker4 kernel: ata1: EH complete
Aug 09 08:09:53 worker4 kernel: ata2: soft resetting link
Aug 09 08:09:53 worker4 kernel: ata2: EH complete
Aug 09 08:09:53 worker4 dockerd[967]: time="2017-08-09T08:09:53.787549532Z" level=error msg="Handler for GET /containers/5c018d19b4f5ab19e88e6180da8b74cdc4f333e2d686ff0552a504f83917c916/json returned error: No such container: 5c018d19b4f5ab19e88e6180da8b74cdc4f333e2d686ff0552a504f83917c916"
Aug 09 08:09:53 worker4 dockerd[967]: time="2017-08-09T08:09:53.788148553Z" level=error msg="Handler for GET /containers/8d121591480323f454b53277563f68501b34ce4c214116daa3d38d63f626986a/json returned error: No such container: 8d121591480323f454b53277563f68501b34ce4c214116daa3d38d63f626986a"
Aug 09 08:09:53 worker4 dockerd[967]: time="2017-08-09T08:09:53.788373359Z" level=error msg="Handler for GET /containers/2893fcc16aa3525b45c4dc5462032e651429cb23ebefcf2bee9435fb3fc1f6fe/json returned error: No such container: 2893fcc16aa3525b45c4dc5462032e651429cb23ebefcf2bee9435fb3fc1f6fe"
Aug 09 08:09:54 worker4 docker[1219]: I0809 08:09:54.605542    1260 operation_executor.go:917] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/0e09fa24-657b-11e7-823c-00224801aff0-default-token-6zzz0" (spec.Name: "default-token-6zzz0") pod "0e09fa24-657b-11e7-823c-00224801aff0" (UID: "0e09fa24-657b-11e7-823c-00224801aff0").
Aug 09 08:09:54 worker4 docker[1219]: I0809 08:09:54.680591    1260 operation_executor.go:832] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:54 worker4 docker[1219]: E0809 08:09:54.687657    1260 mount_linux.go:119] Mount failed: exit status 1
Aug 09 08:09:54 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:54 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:54 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:54 worker4 docker[1219]: E0809 08:09:54.690631    1260 mount_linux.go:391] Could not determine if disk "" is formatted (exit status 1)
Aug 09 08:09:54 worker4 docker[1219]: E0809 08:09:54.691119    1260 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd\"" failed. No retries permitted until 2017-08-09 08:09:56.69109643 +0000 UTC (durationBeforeRetry 2s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") with: mount failed: exit status 1
Aug 09 08:09:54 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:54 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:54 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:56 worker4 kernel: ata1: soft resetting link
Aug 09 08:09:56 worker4 docker[1219]: I0809 08:09:56.706113    1260 reconciler.go:306] MountVolume operation started for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") to pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:56 worker4 docker[1219]: I0809 08:09:56.706150    1260 operation_executor.go:812] Entering MountVolume.WaitForAttach for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") DevicePath: "0"
Aug 09 08:09:56 worker4 kernel: ata1.01: host indicates ignore ATA devices, ignored
Aug 09 08:09:56 worker4 kernel: ata1.00: host indicates ignore ATA devices, ignored
Aug 09 08:09:56 worker4 kernel: ata1: EH complete
Aug 09 08:09:56 worker4 kernel: ata2: soft resetting link
Aug 09 08:09:57 worker4 kernel: ata2: EH complete
Aug 09 08:09:58 worker4 docker[1219]: I0809 08:09:58.081775    1260 operation_executor.go:832] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0").
Aug 09 08:09:58 worker4 docker[1219]: E0809 08:09:58.088616    1260 mount_linux.go:119] Mount failed: exit status 1
Aug 09 08:09:58 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:58 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:58 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
Aug 09 08:09:58 worker4 docker[1219]: E0809 08:09:58.091352    1260 mount_linux.go:391] Could not determine if disk "" is formatted (exit status 1)
Aug 09 08:09:58 worker4 docker[1219]: E0809 08:09:58.091576    1260 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd\"" failed. No retries permitted until 2017-08-09 08:10:02.09155114 +0000 UTC (durationBeforeRetry 4s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd" (spec.Name: "pvc-xxx-xxx-xxx-xxx-00224801aff0") pod "d5497686-7cd9-11e7-823e-00224801aff0" (UID: "d5497686-7cd9-11e7-823e-00224801aff0") with: mount failed: exit status 1
Aug 09 08:09:58 worker4 docker[1219]: Mounting command: mount
Aug 09 08:09:58 worker4 docker[1219]: Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd ext4 [defaults]
Aug 09 08:09:58 worker4 docker[1219]: Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/epc02-dynamic-pvc-xxx-xxx-xxx-xxx-00224801aff0.vhd in /etc/fstab
rossedman commented 7 years ago

@rcconsult I don't understand how this is ACS if you are running on CoreOS. If it's plain Kubernetes, then you are just configuring the cloud controller to use Azure? Also, I think the main reason this is failing is that you are using 1.5.4. I believe the PVC support didn't work until 1.6.6.

JackQuincy commented 7 years ago

The Azure cloud provider in upstream Kubernetes has been getting lots of improvements. We had lots of issues when it first started trying to support Azure disks. I'm not up to date on the current state, though I know it has improved since 1.5.4. 1.7.2 has support for managed disks, though I do think there is a P0 bug on that, last I heard.

rocketraman commented 7 years ago

1.7.2 has support for managed disks, though I do think there is a P0 bug on that, last I heard.

This? https://github.com/kubernetes/kubernetes/issues/50150

rcconsult commented 7 years ago

Hi,

Indeed, we run a Terraform-generated cluster with pure K8S.

Last week I had the K8S cluster working OK with dynamically attached PVCs in the StatefulSet, and I was even able to patch the PVCs to the Retain policy, so each time a pod restarted or was updated with a new Docker image it re-mounted the same VHDs and kept data intact between (re)deployments.

I noticed issues yesterday when only the 1st pod started OK and the second got stuck; later I could not even start the first one.

On 1.7.2, referenced above, I had another issue that turned out to be linked to the very recent CoreOS stable version.

I solved the problem for the moment using local worker node mounts on each minion as a workaround.

rtyler commented 7 years ago

For what it's worth, I appear to have run into this issue as well with a fairly recent ACS Kubernetes deployment.

Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}

I ran az acs scale to add a new agent, which started to cause disk attach errors in the controller logs.
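
For reference, the scale operation was along these lines (resource group, cluster name, and count are placeholders):

az acs scale --resource-group my-acs-rg --name my-k8s-cluster --new-agent-count 4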

Scaling back down (in essence deleting the VM) caused the cluster to rebalance and the problem to go away. :(

tlyng commented 7 years ago

I'm having similar issues with K8S 1.7.5 and 1.8.0 on Azure. I've tried deploying with the Tectonic installer and with acs-engine, and I still encounter timeouts for mounting and stale detached disks. How can I fix this? By the looks of it, Azure does not currently support Kubernetes properly.

IainColledge commented 7 years ago

Just had this start randomly last week on a cluster that had been running fine; I wonder if anything has changed with blob storage.

IainColledge commented 7 years ago

To get things working again I did the following:

  1. Looked at the pods that are causing the issue and the nodes they were on.
  2. Stopped the node and, once it was stopped, detached any dangling disks.
  3. Started the node.

Rinse and repeat steps 2 and 3 until all nodes are done and the cluster seems OK again; roughly the same steps can be run from the CLI, as sketched below.
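
A rough CLI sketch of the same procedure (resource group, VM, and disk names are placeholders; for blob-backed VHDs, az vm unmanaged-disk detach is the equivalent of the detach step):

# stop the affected agent VM
az vm stop --resource-group my-k8s-rg --name k8s-agent-0

# detach any dangling data disks still attached to the stopped node
az vm disk detach --resource-group my-k8s-rg --vm-name k8s-agent-0 --name my-dangling-data-disk

# start the node again and let it rejoin the cluster
az vm start --resource-group my-k8s-rg --name k8s-agent-0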

jcharlytown commented 7 years ago

We ran into this issue on a 1.6.6 cluster as well. @IainColledge, I tried to recover the node in question like you described. However, it didn't work for me. Detaching the disk still failed with said error message, even after stopping the node. How exactly did you stop the node?

IainColledge commented 7 years ago

@jcharlytown In Azure portal, drop down into the VM control pane for the node by clicking on it.

Click on Stop.

Then click on Disks and remove the non-OS ones.

Click on Start and it should connect back into the cluster in about 10 mins.

Remove the "stuck" pods so they can be recreated by the StatefulSet, which I'm assuming you're using here.

jcharlytown commented 7 years ago

@IainColledge, thanks for your reply. This is exactly what I did - both via az vm and the portal. Maybe I didn't wait long enough for the VM to stop completely. I will try to be more patient next time (hopefully it doesn't happen again).

teeterc commented 6 years ago

I'm experiencing a similar error, I think:

AttachVolume.Attach failed for volume "pvc-0db10311-f17c-11e7-9eaa-0a58ac1f0210" : Attach volume "kubernetes-dynamic-pvc-0db10311-f17c-11e7-9eaa-0a58ac1f0210" to instance "aks-nodepool1-24119515-1" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'kubernetes-dynamic-pvc-73d50496-d92b-11e7-acd2-0a58ac1f0345' to VM 'aks-nodepool1-24119515-1' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."

Interestingly, this message is showing up on several kube pods. And what I find really odd is that the requested volume ID isn't the same as the volume ID in the error.

.. pvc-73.. doesn't exist in the Azure portal or in Kubernetes.

Any ideas on how to resolve this without rebuilding the stateful set?

theobolo commented 6 years ago

@teeterc It's exactly the same for me since yesterday....

https://github.com/Azure/acs-engine/issues/2002

The only difference is that my disks show up in the Azure portal and are marked as "Unattached", but they still can't be mounted on the workers.

andyzhangx commented 6 years ago

@teeterc could you run kubectl get pvc and kubectl describe pvc PVC-NAME to get the status of the PVC first?

andyzhangx commented 6 years ago

@teeterc for your case, restarting the controller-manager on the master may resolve your issue

ajhewett commented 6 years ago

@teeterc I'm also seeing the same errors with PVC/PV and Statefulsets on multiple Kubernetes clusters since the maintenance reboots. My clusters were created using acs-engine with Kubernetes v1.7.5.

AttachVolume.Attach failed for volume "pvc-f9ae8b26-f21a-11e7-8424-000d3a2aee28" : Attach volume "stage-eu-west-es5-kube-dynamic-pvc-f9ae8b26-f21a-11e7-8424-000d3a2aee28" to instance "k8s-agent-82840771-0" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'stage-eu-west-es5-kube-dynamic-pvc-928b8f62-ba15-11e7-b2a4-000d3a2aee28' to VM 'k8s-agent-82840771-0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."

also with the mismatch between the requested volume ID and the one in the error.

The suggestion from @andyzhangx to restart the controller-manager has not helped. Nor has redeploying the k8s-master VMs or the affected k8s-agent VMs.

rcconsult commented 6 years ago

Hi, I have tested two options:

  1. List the PVCs and PVs stuck after the reboots and re-create them, which means losing existing data, because you may also need to destroy the VHDs in the storage account as well.
  2. In the Azure portal, stop your affected VM, edit its disks and remove all stuck PVC mounts, start the VM, and delete/refresh the affected pods stuck in the ContainerCreating stage; your data should stay preserved if your PV policy was set to "Retain".

Good luck

Radovan

theobolo commented 6 years ago

@ajhewett @teeterc You can follow the debugging on the other issue https://github.com/Azure/acs-engine/issues/2002 and compare to your own errors.

But it seems that something went wrong yesterday.

@rcconsult That's an option if your disks are already mounted; mine are brand new, so basically I can't mount new disks on the workers even though they weren't mounted anywhere previously.

There is definitely a problem mounting managed and non-managed disks since yesterday.

It's really critical for production clusters guys... Yesterday was a crazy day ...

ajhewett commented 6 years ago

@rcconsult I have performed the same as your option 1 which has worked in some clusters (fortunately, the data in my volumes is replicated). However, I have a cluster where the problem is very persistent and k8s keeps trying to attach a deleted volume.

andyzhangx commented 6 years ago

@ajhewett could you check the VM status of k8s-agent-82840771-0 and the other VMs in the Azure portal? Is it in a Failed state? Update: if it is in a Failed state, use this solution to fix it first.

ajhewett commented 6 years ago

@andyzhangx the k8s agents and master are not currently in a failed state. Some were in a failed state right after the maintenance reboots, but I performed a "Redeploy" from the portal on the failed VMs several hours ago. The errors attaching volumes have been occurring before and after my redeploy.

andyzhangx commented 6 years ago

@ajhewett could you check the status of the disk stage-eu-west-es5-kube-dynamic-pvc-928b8f62-ba15-11e7-b2a4-000d3a2aee28? Is it attached to a VM? You may detach that disk manually and wait to check. You could also get more debugging info from https://github.com/Azure/acs-engine/issues/2002. Update: could you also provide the k8s-agent-82840771-0 VM status from https://resources.azure.com ?

ajhewett commented 6 years ago

@andyzhangx the disk stage-eu-west-es5-kube-dynamic-pvc-928b8f62-ba15-11e7-b2a4-000d3a2aee28 does not exist. I had deleted it (plus the PVC & PV) after redeploying the k8s-agent VM did not help, similar to option 1 from @rcconsult, which fixed problems in other clusters after the maintenance. Disk stage-eu-west-es5-kube-dynamic-pvc-f9ae8b26-f21a-11e7-8424-000d3a2aee28 does exist and is not attached to any VM. This is the disk that should be mounted.
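
For anyone checking the same thing from the CLI, the attachment state of a managed disk can be read from its managedBy property, roughly like this (assuming the disk lives in the cluster's resource group):

# an empty managedBy value means the managed disk is not attached to any VM
az disk show --resource-group stage-eu-west-es5-kube \
  --name stage-eu-west-es5-kube-dynamic-pvc-f9ae8b26-f21a-11e7-8424-000d3a2aee28 \
  --query managedBy --output tsv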

ajhewett commented 6 years ago

@andyzhangx the VM status is:

{
  "properties": {
    "vmId": "ca92b311-2219-4bb9-8d2a-1911ec7888d6",
    "availabilitySet": {
      "id": "/subscriptions/REDACTED/resourceGroups/stage-eu-west-es5-kube/providers/Microsoft.Compute/availabilitySets/AGENT-AVAILABILITYSET-82840771"
    },
    "hardwareProfile": {
      "vmSize": "Standard_DS12_v2"
    },
    "storageProfile": {
      "imageReference": {
        "publisher": "Canonical",
        "offer": "UbuntuServer",
        "sku": "16.04-LTS",
        "version": "16.04.201706191"
      },
      "osDisk": {
        "osType": "Linux",
        "name": "k8s-agent-82840771-0_OsDisk_1_955d458bd14341a29c286e26ad08e8fd",
        "createOption": "FromImage",
        "caching": "ReadWrite",
        "managedDisk": {
          "storageAccountType": "Premium_LRS",
          "id": "/subscriptions/REDACTED/resourceGroups/STAGE-EU-WEST-ES5-KUBE/providers/Microsoft.Compute/disks/k8s-agent-82840771-0_OsDisk_1_955d458bd14341a29c286e26ad08e8fd"
        },
        "diskSizeGB": 128
      },
      "dataDisks": []
    },
    "osProfile": {
      "computerName": "k8s-agent-82840771-0",
      "adminUsername": "azureuser",
      "linuxConfiguration": {
        "disablePasswordAuthentication": true,
        "ssh": {
          "publicKeys": [
            {
              "path": "/home/azureuser/.ssh/authorized_keys",
              "keyData": "REDACTED"
            }
          ]
        }
      },
      "secrets": []
    },
    "networkProfile": {
      "networkInterfaces": [
        {
          "id": "/subscriptions/REDACTED/resourceGroups/stage-eu-west-es5-kube/providers/Microsoft.Network/networkInterfaces/k8s-agent-82840771-nic-0"
        }
      ]
    },
    "provisioningState": "Succeeded"
  },
  "resources": [
    {
      "properties": {
        "publisher": "Microsoft.Azure.Extensions",
        "type": "CustomScript",
        "typeHandlerVersion": "2.0",
        "autoUpgradeMinorVersion": true,
        "settings": {},
        "provisioningState": "Succeeded"
      },
      "type": "Microsoft.Compute/virtualMachines/extensions",
      "location": "westeurope",
      "id": "/subscriptions/REDACTED/resourceGroups/stage-eu-west-es5-kube/providers/Microsoft.Compute/virtualMachines/k8s-agent-82840771-0/extensions/cse0",
      "name": "cse0"
    },
    {
      "properties": {
        "publisher": "Microsoft.EnterpriseCloud.Monitoring",
        "type": "OmsAgentForLinux",
        "typeHandlerVersion": "1.0",
        "autoUpgradeMinorVersion": true,
        "settings": {
          "workspaceId": "c6e477f5-722a-4cd9-8953-635f21709e94",
          "azureResourceId": "/subscriptions/REDACTED/resourcegroups/stage-eu-west-es5-kube/providers/microsoft.compute/virtualmachines/k8s-agent-82840771-0",
          "stopOnMultipleConnections": true
        },
        "provisioningState": "Succeeded"
      },
      "type": "Microsoft.Compute/virtualMachines/extensions",
      "location": "westeurope",
      "id": "/subscriptions/REDACTED/resourceGroups/stage-eu-west-es5-kube/providers/Microsoft.Compute/virtualMachines/k8s-agent-82840771-0/extensions/OmsAgentForLinux",
      "name": "OmsAgentForLinux"
    }
  ],
  "type": "Microsoft.Compute/virtualMachines",
  "location": "westeurope",
  "tags": {
    "creationSource": "acsengine-k8s-agent-82840771-0",
    "orchestrator": "Kubernetes:1.7.5",
    "poolName": "agent",
    "resourceNameSuffix": "82840771"
  },
  "id": "/subscriptions/REDACTED/resourceGroups/stage-eu-west-es5-kube/providers/Microsoft.Compute/virtualMachines/k8s-agent-82840771-0",
  "name": "k8s-agent-82840771-0"
}
rocketraman commented 6 years ago

Same here... a 1.7.5 cluster with all of the stateful sets borked after the security updates. 3 of 6 VMs are stuck in the "Failed" state (though they are active nodes in k8s), and the other 3 are stuck in the "VM Stopping" state. Attempting the instructions at https://blogs.technet.microsoft.com/mckittrick/azure-vm-stuck-in-failed-state-arm/ for the Failed VMs just results in a 409 conflict error saying that the disk cannot be detached because it is in use. Attempting it on the "VM Stopping" VMs just freezes the PowerShell script indefinitely.

It's kind of crazy and scary how poorly stateful sets on Azure work. It's doubly annoying because when everything is fine they work well -- they just can't survive any sort of unusual / exceptional situation.

IainColledge commented 6 years ago

I've now got multiple clusters with multiple containers having problems due to nodes in failed states and disk attach errors. I have to say this really needs looking at to make it robust, as it is currently very fragile.

andyzhangx commented 6 years ago

@ajhewett thanks for providing the info. Could you use https://blogs.technet.microsoft.com/mckittrick/azure-vm-stuck-in-failed-state-arm/ to update VM k8s-agent-82840771-0 even if it's in a healthy state? And let me know the PowerShell script result. This could be a VM-level issue, thanks.

ajhewett commented 6 years ago

@andyzhangx many, many thanks!

I executed

PS C:\> Get-AzureRmVM -Name "k8s-agent-82840771-0" -ResourceGroupName 'stage-eu-west-es5-kube' | Update-AzureRmVM

RequestId IsSuccessStatusCode StatusCode ReasonPhrase
--------- ------------------- ---------- ------------
                         True         OK OK     

and without any further manual intervention the pod that was stuck in ContainerCreating successfully mounted the correct volume ("pvc-f9ae8b26-f21a-11e7-8424-000d3a2aee28") and is now Running.

andyzhangx commented 6 years ago

@ajhewett good to know. One thing to confirm: some of your VMs were in a Failed state, and then you

  1. used Redeploy in the Azure portal, which could bring the VM state back to Running, and
  2. then used https://blogs.technet.microsoft.com/mckittrick/azure-vm-stuck-in-failed-state-arm/ to update the agent VM that had the issue. Are these two steps the workaround for this issue?
ajhewett commented 6 years ago

@andyzhangx

Several VMs rebooted without any problems. Step 1 helped repair some failed VMs but not all. Step 2 was additionally needed for some VMs even though they seemed healthy.

However, I cannot be sure that only these 2 steps are sufficient, because I tried several other things before step 2, e.g. reducing the number of replicas in the StatefulSets, deleting PVCs, and deleting the stuck pods so that the StatefulSet recreated them.

BTW: I still have 7 k8s clusters with VMs that have not (yet) been rebooted. If a similar problem occurs I will first try step 1 and 2 before anything else.

andyzhangx commented 6 years ago

@ajhewett thanks for the info. Another side question: are all your StatefulSets using Azure disks? Azure disk only supports RWO, which means that if replicas >= 2 and the two pods land on two different agent nodes, they cannot use the same Azure disk (PVC).

ajhewett commented 6 years ago

@andyzhangx the statefulsets use dynamically provisioned volumes (PVCs) with managed Azure disks (storage class managed-premium or managed-standard). Each pod gets its own disk; there is no disk sharing. The workload using the statefulsets and PVCs is Elasticsearch.

rocketraman commented 6 years ago

@andyzhangx I've tried your workaround @ https://github.com/Azure/ACS/issues/12#issuecomment-355742312 for my situation. However, it was not sufficient. The fundamental problem seems to be a disconnect between the Kubernetes view of each node (node fully up and in "Running" state) and the Azure view of each node VM (VM in "VM Stopping" or in "Failed" state), which never resolves because of the persistent disk leases not being cleanly released and re-acquired.

The final solution that worked for me was to simply do the following for every agent node in my cluster one at a time:

  1. kubectl cordon <node>
  2. delete any pods on the node that belong to stateful sets
  3. kubectl drain <node>
  4. restart the Azure VM for the node via the API or portal, and wait until the VM is "Running"
  5. kubectl uncordon <node>

While doing this, the newly restarted nodes may still fail to mount persistent disks as one of the subsequent un-restarted VMs may still be holding a lock on a disk. This resolves itself once all the VMs are restarted and are in "Running" state.
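
Per node, that amounts to roughly the following sketch (node and resource group names are placeholders; the stateful-set pods listed in the middle step are deleted by hand):

NODE=k8s-agent-0    # placeholder: the Kubernetes node / Azure VM name
RG=my-k8s-rg        # placeholder: the cluster's resource group

kubectl cordon "$NODE"

# list the pods still scheduled on this node, then delete the stateful-set ones
kubectl get pods --all-namespaces -o wide | grep "$NODE"

kubectl drain "$NODE" --ignore-daemonsets --delete-local-data

# restart the underlying Azure VM and wait for it to come back to "Running"
az vm restart --resource-group "$RG" --name "$NODE"

kubectl uncordon "$NODE"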

teeterc commented 6 years ago

I did find a workaround (sorry I didn't post earlier). I noticed that the failing pods were being deployed to the failed Azure nodes. So, I:

  1. Scaled the kube cluster up by the number of failed nodes.
  2. Deleted the failed nodes in the Azure portal (this removed any disk mounting issues when the pods came up on other cluster nodes).

Everything came back online in about 10m. It seems that the volumes that are failing to mount are isolated to the failing pods... it is still strange, however, that pods were trying to mount unrelated stateful volumes.

andyzhangx commented 6 years ago

Update for this thread: I recently fixed a race condition that could cause disk attach errors. The fix has been merged into v1.10, and I am trying to cherry-pick it to other k8s versions. You can find details about this issue here: https://github.com/andyzhangx/demo/blob/master/issues/azuredisk-issues.md#1-disk-attach-error

1. disk attach error

Issue details:

In some corner cases (detaching multiple disks on a node simultaneously), when a pod with an Azure disk mount is rescheduled from one node to another, there can be many disk attach errors (with no recovery) because the disk is not released in time by the previous node. This issue is due to the lack of a lock around the DetachDisk operation; there should be a central lock for both the AttachDisk and DetachDisk operations so that only one AttachDisk or DetachDisk operation is allowed at a time.

The disk attach error could look like the following:

Cannot attach data disk 'cdb-dynamic-pvc-92972088-11b9-11e8-888f-000d3a018174' to VM 'kn-edge-0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again.

Mitigation:

In Azure Cloud Shell, run:
$vm = Get-AzureRMVM -ResourceGroupName $rg -Name $vmname  
Update-AzureRmVM -ResourceGroupName $rg -VM $vm -verbose -debug
In the Azure CLI, run:
az vm update -g <group> -n <name>
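
If several agent VMs are affected, the az vm update mitigation can be looped over every VM in the resource group, for example (the resource group name is a placeholder):

RG=my-k8s-rg    # placeholder: the cluster's resource group
for vm in $(az vm list --resource-group "$RG" --query "[].name" --output tsv); do
  az vm update --resource-group "$RG" --name "$vm"
done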

Fix

k8s version | fixed version
v1.6        | no fix (v1.6 does not accept any cherry-picks)
v1.7        | 1.7.14
v1.8        | 1.8.9
v1.9        | 1.9.5
v1.10       | 1.10.0
rocketraman commented 6 years ago

@andyzhangx I also note that most of my nodes, even though there are no visible issues with stateful sets right now, are getting errors like this every second or so in my logs:

Mar 03 08:56:50 k8s-agentpool1-18117938-0 docker[460]: E0303 08:56:50.123499     493 kubelet_volumes.go:128] Orphaned pod "0f2fbe5f-0004-11e8-8438-000d3af4357e" found, but volume paths are still present on disk. : There were a total of 3 errors similar to this.  Turn up verbosity to see them.                                                                                                                                                                 

Investigating that pod:

root@k8s-agentpool1-18117938-0:~# ls -lR /var/lib/kubelet/pods/0f2fbe5f-0004-11e8-8438-000d3af4357e/volumes/
/var/lib/kubelet/pods/0f2fbe5f-0004-11e8-8438-000d3af4357e/volumes/:
total 8
drwxr-x--- 3 root root 4096 Jan 23 06:14 kubernetes.io~azure-disk
drwxr-xr-x 2 root root 4096 Jan 23 06:33 kubernetes.io~secret

/var/lib/kubelet/pods/0f2fbe5f-0004-11e8-8438-000d3af4357e/volumes/kubernetes.io~azure-disk:
total 4
drwxr-x--- 2 root root 4096 Jan 23 06:14 pvc-768de031-9e81-11e7-a717-000d3af4357e

/var/lib/kubelet/pods/0f2fbe5f-0004-11e8-8438-000d3af4357e/volumes/kubernetes.io~azure-disk/pvc-768de031-9e81-11e7-a717-000d3af4357e:
total 0

/var/lib/kubelet/pods/0f2fbe5f-0004-11e8-8438-000d3af4357e/volumes/kubernetes.io~secret:
total 0
root@k8s-agentpool1-18117938-0:~# mount | grep 0f2fbe5f-0004-11e8-8438-000d3af4357e
root@k8s-agentpool1-18117938-0:~# 

Note that the pod is not running on this machine any more, so it seems that somewhere along the way these volume directories did not get cleaned up properly.

andyzhangx commented 6 years ago

@rocketraman this could be related to a pod deletion error caused by a race condition; if the error only appears in the kubelet logs, that should be OK, in my experience.

rocketraman commented 6 years ago

@rocketraman this could be related to a pod deletion error caused by a race condition; if the error only appears in the kubelet logs, that should be OK, in my experience.

So after your fix is deployed, this should not occur again, right? The log messages themselves seem easy to "solve" -- it appears that removing the orphaned pod directory with rm -fr /var/lib/kubelet/pods/0f2fbe5f-0004-11e8-8438-000d3af4357e does work. I'm not quite sure if there are any negatives to this, but everything in that directory is zero-length files or empty directories anyway, so I suspect none.
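
For anyone else doing that cleanup, a slightly safer variant is to confirm nothing is still mounted under the orphaned pod directory before removing it; a minimal sketch:

POD_UID=0f2fbe5f-0004-11e8-8438-000d3af4357e    # the orphaned pod UID from the kubelet log

# only remove the leftover directory if nothing is still mounted underneath it
if ! mount | grep -q "/var/lib/kubelet/pods/$POD_UID/"; then
  sudo rm -rf "/var/lib/kubelet/pods/$POD_UID"
fi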

andyzhangx commented 6 years ago

The original issue is fixed by https://github.com/kubernetes/kubernetes/pull/60183; you can find details here: https://github.com/andyzhangx/demo/blob/master/issues/azuredisk-issues.md#1-disk-attach-error

Let me know if you have any questions, thanks.