Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[BUG]The Azure Disk with SKU type Premium SSD v2 LRS remains in the 'attached' state even after the pod has been scaled down. #3880

Open Saurabh12p opened 1 year ago

Saurabh12p commented 1 year ago

Issue Summary: We have encountered an issue with the attachment and detachment behavior of Premium SSD v2 (PremiumV2_LRS) disks in Azure Kubernetes Service (AKS) that deviates from our expectations and Microsoft's documentation.

Scenario A: Premium SSD LRS Disks (Expected Behavior):

Deployed disks with SKU Premium SSD LRS and attached them to the Nginx pod. Scaling up the Nginx pod correctly resulted in the disk attachment. Scaling down the Nginx deployment correctly detached the disk.

Scenario B: Premium SSD v2 LRS/PremiumV2_LRS Disks (Issue):

We transitioned to disks with the SKU Premium SSD v2 LRS (PremiumV2_LRS). Initially, the Nginx pod correctly attached the disk. However, after scaling the Nginx deployment down to zero replicas, the disk remained attached, contrary to our expectations and Microsoft's documentation.

Additional Information: We followed Microsoft's documentation (https://learn.microsoft.com/en-us/azure/aks/use-premium-v2-disks), which recommends using a "premium2-disk-sc" storage class for Premium SSD v2 (PremiumV2_LRS) disks; previously we were using the managed-csi storage class. Despite switching to "premium2-disk-sc", we encountered the same attachment/detachment issue. To mitigate this problem temporarily, we have resorted to manually restarting the node pool each time so that the Premium SSD v2 disks attach correctly, which is not the intended behavior.

Action Taken: We have raised a support ticket with Microsoft (#2308290030002327) to conduct a thorough investigation into this issue. We have yet to determine whether this problem is solely due to the driver or whether other factors contribute to it.

Configuration Details: For both Premium SSD v1 and v2 we are using the Azure Disk CSI driver. In Scenario A (Premium SSD LRS) we used the storage class "managed-csi", representing Premium LRS. In Scenario B (Premium SSD v2 LRS) we followed Microsoft's recommendation and used the storage class "premium2-disk-sc" for PremiumV2_LRS disks; see the sketch below the attachment.

Expected Behavior: When the Nginx replica count is scaled down to zero, Premium SSD v2 (PremiumV2_LRS) disks should automatically detach, in line with Microsoft's documentation.

Current Behavior: Premium SSD v2 (PremiumV2_LRS) disks remain attached after scaling down the Nginx pod, necessitating manual node pool restarts. This issue has substantial implications for the reliability and manageability of our AKS clusters and introduces unwarranted operational overhead. We request that this matter receive attention so that Premium SSD v2 disks behave as expected in AKS clusters.

Environment Details:

- Azure Kubernetes Service (AKS)
- Azure managed CSI driver
- Premium SSD v2 (PremiumV2_LRS) disks
- Kubernetes versions 1.26.3 and 1.26.6

To Reproduce the Issue:

1. Deploy an AKS cluster with Managed Identity (MI) enabled.
2. Deploy disks with Premium SSD v1 and v2.
3. Create two Nginx deployment files, one attached to the SSD v2 disk and the other to the SSD v1 disk.
4. Apply the files to the cluster.
5. Scale the pods down and then back up; the pod attached to the SSD v2 disk will hit the issue.

Please find the configuration files attached.

github.zip
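
For context, a minimal sketch of the storage class and claim described above, in line with the linked documentation; the IOPS/throughput values, storage size, and PVC name are illustrative rather than taken from the attached files:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium2-disk-sc
provisioner: disk.csi.azure.com
parameters:
  skuName: PremiumV2_LRS        # Premium SSD v2
  cachingMode: None             # host caching is not supported on Premium SSD v2
  DiskIOPSReadWrite: "3000"     # illustrative baseline IOPS
  DiskMBpsReadWrite: "125"      # illustrative baseline throughput
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: premium2-disk-pvc       # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: premium2-disk-sc
  resources:
    requests:
      storage: 100Gi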

Saurabh12p commented 1 year ago

@andyzhangx @miwithro @dpaardenkooper

Saurabh12p commented 1 year ago

@andyzhangx We attempted to update the AKS cluster using the following command: az resource update --name <cluster-name> --namespace Microsoft.ContainerService --resource-group <resource-group> --resource-type ManagedClusters --subscription <subscription-id>. However, the issue remains the same.

kaarthis commented 1 year ago

Need to follow up on the above.

andyzhangx commented 1 year ago

I cannot find your cluster info now; it seems the cluster was deleted. From the history logs, it appears you attached a Premium v2 disk to a non-zonal node, and the Azure Disk driver had trouble detaching that disk. Can you create a zonal node pool and try the Premium v2 disk again?

I0831 12:37:25.782271       1 azure_controller_vmss.go:239] azureDisk - update(mc_in-aks-sourabh-rg_in-aks-in-aks-test-cluster-cluster_eastus): vm(aks-dbpool-32364004-vmss000002) - detach disk list(map[/subscriptions/xxx/resourcegroups/in-aks-sourabh-rg/providers/microsoft.compute/disks/in-orientdb-disk-new:in-orientdb-disk-new])

I0831 12:37:25.846893       1 azure_armclient.go:301] Received error in sendAsync.send: resourceID: http://localhost:7788/subscriptions/xxx/resourceGroups/mc_in-aks-sourabh-rg_in-aks-in-aks-test-cluster-cluster_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-dbpool-32364004-vmss/virtualMachines/2?api-version=2022-03-01, error: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 400, RawError: {

  "error": {

    "code": "InvalidParameter",

    "message": "Managed disks with 'PremiumV2_LRS' storage account type can be used only with Virtual Machines in an Availability Zone.",

    "target": "managedDisk.storageAccountType"

  }
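
For anyone hitting the same InvalidParameter error, a hedged sketch of a storage class that keeps PremiumV2_LRS volumes on zonal nodes; the class name and zone value are illustrative, not taken from this thread:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium2-disk-sc-zonal            # illustrative name
provisioner: disk.csi.azure.com
parameters:
  skuName: PremiumV2_LRS
  cachingMode: None                       # Premium SSD v2 requires caching disabled
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # provision the disk in the zone of the node the pod lands on
allowedTopologies:                        # optionally restrict provisioning to specific zones
  - matchLabelExpressions:
      - key: topology.disk.csi.azure.com/zone
        values:
          - eastus-1                      # illustrative zone
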
Saurabh12p commented 1 year ago

@andyzhangx Thanks for the update. When I created a disk of type Premium SSD v2 LRS in Zone 1 and a node pool in Zone 1, it worked as expected. Is there any ongoing work by Microsoft to make it work when both are not in any zone?

Aaron-ML commented 11 months ago

@andyzhangx We are noticing this on our cluster as well, and realized that when we delete a Premium SSD v2 PVC in Kubernetes, it doesn't clean up the disk in Azure.

So if we go and recreate the PVC it will attempt to reference the same disk and error out.

Edit: the disk does eventually get cleaned up; however, even with a new disk I see the same error, but only for 1 of 3 pods created on the same node pool.

  Warning  FailedAttachVolume  6s (x6 over 24s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-7abd0d53-b06f-43c6-8f5f-83f1a2ef938b" : rpc error: code = Internal desc = Attach volume /subscriptions/<redacted>/resourceGroups/<redacted>/providers/Microsoft.Compute/disks/pvc-7abd0d53-b06f-43c6-8f5f-83f1a2ef938b to instance aks-node<redacted>-vmss00001f failed with Retriable: false, RetryAfter: 0s, HTTPStatusCode: 400, RawError: {
  "error": {
    "code": "InvalidParameter",
    "message": "Managed disks with 'PremiumV2_LRS' storage account type can be used only with Virtual Machines in an Availability Zone.",
    "target": "managedDisk.storageAccountType"
  }
}

I'm confused about the limitations of this disk. If it's truly only supported on AZ-aware node pools, why is it allowed only sometimes?

Looks like the disks created by AKS are somehow created without an availability zone selected:

[screenshot]

Whereas I cannot do that when creating a disk manually:

[screenshot]

Not sure if it's related, but we also see events about these disks failing to be deleted:

default         2m34s       Warning   VolumeFailedDelete        persistentvolume/pvc-7abd0d53-b06f-43c6-8f5f-83f1a2ef938b                     persistentvolume pvc-7abd0d53-b06f-43c6-8f5f-83f1a2ef938b is still attached to node aks-spotd4asv5-22164825-vmss00001f
default         20s         Warning   VolumeFailedDelete        persistentvolume/pvc-d0df0ba8-a348-48e2-88e4-e1f582963a4f                     persistentvolume pvc-d0df0ba8-a348-48e2-88e4-e1f582963a4f is still attached to node aks-spotd4sv4-13804330-vmss000003
default         19s         Warning   VolumeFailedDelete        persistentvolume/pvc-404dc8fa-658b-4c46-a552-4e1bf52f0e9a                     persistentvolume pvc-404dc8fa-658b-4c46-a552-4e1bf52f0e9a is still attached to node aks-spotd4sv4-13804330-vmss000017
sjuls commented 11 months ago

Hi @andyzhangx, we're also hitting this issue. One more troubling effect of this bug is that the VM which has the PremiumSSDv2 disk attached is unable to attach other disks, e.g. Standard SSDs.

Activity logs for the VM show continuous failures: [screenshot]

The error message in the activity log entries suggests PremiumSSDv2 cannot be used, but note that the PremiumSSDv2 disk is already attached to the VM and was not detached when the PVC was deleted.

{
  "error": {
    "code": "InvalidParameter",
    "target": "managedDisk.storageAccountType",
    "message": "Managed disks with 'PremiumV2_LRS' storage account type can be used only with Virtual Machines in an Availability Zone."
  }
}

Tags on the disk show it was created by kubernetes-azure-dd: [screenshot]

HummingMind commented 9 months ago

I am also running into this. My AKS cluster is not set up to use Availability Zones, but it let me dynamically provision a Premium SSD v2 disk anyway using a custom storage class. It would be nice if it failed the first time, so we could find the correct documentation right away.

laukik85 commented 8 months ago

I am using the latest available AKS version (1.28.3) and am facing the same issue. Is there any workaround for it?

HannoSolo commented 8 months ago

I am using the latest available AKS version (1.28.3) and am facing the same issue. Is there any workaround for it?

I found a workaround, which is to manually detach the disk from the node it was last successfully attached to. But if you want to solve it permanently, what worked for me was to deploy system and user node pools that use availability zones, then delete and recreate the PVC.

I also changed my storage class to use "volumeBindingMode: WaitForFirstConsumer", as this is needed when you have nodes in different availability zones (the default is "Immediate"); ref: https://learn.microsoft.com/en-us/azure/aks/availability-zones#azure-disk-availability-zone-support
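
For reference, a minimal sketch of a deployment consuming such a claim; with WaitForFirstConsumer the volume is only bound and provisioned once the pod is scheduled, so the disk lands in the pod's zone (names and image tag are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-premium2              # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-premium2
  template:
    metadata:
      labels:
        app: nginx-premium2
    spec:
      containers:
        - name: nginx
          image: nginx:1.25         # illustrative image tag
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: premium2-disk-pvc   # PVC bound via the WaitForFirstConsumer storage class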

lucasmaj commented 7 months ago

@andyzhangx Thanks for the update. When I created a disk of type Premium SSD v2 LRS in Zone 1 and a node pool in Zone 1, it worked as expected. Is there any ongoing work by Microsoft to make it work when both are not in any zone?

Keeping both the VM and the SSD v2 disk in the same zone solved my problems.