Hi @lingtham - Thank you for your feedback! We will review and update as appropriate.
@lingtham Thanks for reporting. I am able to reproduce your issue. I will update you as soon as I have some information to share.
Initially I thought it was because of the nginx image, so I tried mcr.microsoft.com/windows/servercore:ltsc2019 as well. The same issue occurs with that Windows image too.
I am using image mcr.microsoft.com/dotnet/framework/aspnet:4.8-20190515-windowsservercore-ltsc2019
And thanks, I am looking forward to your update.
Hi @jakaruna-MSFT, any update? I am still having this problem.
Please use dynamic disk provisioning as a workaround until this issue is resolved. I checked with dynamic disk provisioning and that works fine: https://docs.microsoft.com/en-us/azure/aks/azure-disks-dynamic-pv
Once the PV is created, its reclaim policy is set to Delete by default, so if you delete the PVC, the PV will also be deleted.
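For reference, a dynamically provisioned claim can be as simple as the sketch below (the claim name is just an example; managed-premium is one of the storage classes AKS creates by default, as described in the linked doc):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Example claim name; use whatever fits your deployment
  name: azure-managed-disk
spec:
  accessModes:
    - ReadWriteOnce
  # Built-in AKS storage class backed by Premium managed disks
  storageClassName: managed-premium
  resources:
    requests:
      storage: 5Gi
```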
Yes, the dynamic disks did work, I was able to attach them. Thank you!
I'm running into another issue with pod termination that I would like some help on; please let me know where to make a new issue if needed.
I deallocated one of my nodes to see what would happen in case of VM failure. Kubernetes automatically recreates the pods on a working node, but my pods with the disks attached have been stuck in a Terminating state for 18 hours. I read that pods with volumes attached take longer to terminate, but doesn't 18 hours seem weirdly long? I found some posts that suggested using `kubectl delete pod <pod-id> --grace-period=0 --force` to force delete them, but that doesn't actually delete the running resource; I still see the containers running in the Azure portal. It also doesn't explain why this is happening. Using `kubectl describe pod` doesn't give me any insights either. I have also tried `kubectl logs <pod id>` on a terminating pod, but since the node is deallocated, it gives a `connect: no route to host` error. Any ideas why these pods are stuck in a terminating state? Would this happen during actual VM failure too?
Edit: I restarted the node, and the old pods disappeared and the new pods started running (still on the other node).
You can create a thread in the Azure Container Services MSDN forum and describe your issue there.
After the deallocation, what happened to the disk that was attached to that instance? If the VM is unresponsive, AKS should detach the disk from that VM, attach it to the new VM, and deploy the pod on the new VM.
In this case, please check what happened to the disk that was attached to the deallocated VM.
To help reproduce the issue, let me know whether you used a static disk or a dynamic disk.
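A couple of commands that can help check that (a rough sketch; the resource group, disk, and node names below are placeholders, not values from your cluster):

```shell
# Show which VM the managed disk is currently attached to (empty output means it is detached)
az disk show \
  --resource-group MC_myAKSCluster_myAKSCluster_eastus \
  --name myAKSDisk \
  --query managedBy

# On the Kubernetes side, the node status lists attached and in-use volumes
# under VolumesAttached / VolumesInUse
kubectl describe node <node-name>
```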
@lingtham For the static disk issue, can you try setting fsType to NTFS in the Windows pod YAML, as below?
```yaml
volumes:
  - name: azure
    azureDisk:
      kind: Managed
      diskName: myAKSDisk
      diskURI: /subscriptions/<subscriptionID>/resourceGroups/MC_myAKSCluster_myAKSCluster_eastus/providers/Microsoft.Compute/disks/myAKSDisk
      fsType: NTFS
```
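For context, a complete pod spec using that volume could look roughly like the sketch below. The pod name, node selector label, and mount path are illustrative assumptions rather than values from the doc:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod-win                      # illustrative name
spec:
  nodeSelector:
    "beta.kubernetes.io/os": windows   # label commonly used for Windows nodes on 1.14-era clusters
  containers:
    - name: mypod-win
      image: mcr.microsoft.com/dotnet/framework/aspnet:4.8-20190515-windowsservercore-ltsc2019
      volumeMounts:
        - name: azure
          mountPath: 'C:\mnt\azure'    # illustrative mount path inside the Windows container
  volumes:
    - name: azure
      azureDisk:
        kind: Managed
        diskName: myAKSDisk
        diskURI: /subscriptions/<subscriptionID>/resourceGroups/MC_myAKSCluster_myAKSCluster_eastus/providers/Microsoft.Compute/disks/myAKSDisk
        fsType: NTFS
```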
I posted a question here: https://social.msdn.microsoft.com/Forums/en-US/3cec387f-b345-412c-877b-aa33fe6508d2/pods-stuck-in-terminating-status-after-deallocating-node-prevents-disks-from-attaching-to-new-pods?forum=AzureContainerServices
I used dynamic disks, and the disks could not attach to the new VM because they were still attached to the old VM, since the pods were stuck in a terminating state.
For the static disk issue, adding fsType: NTFS did solve the issue, thank you! Maybe you can add this to the doc?
@lingtham Thanks. I will reply to the MSDN thread. I will also fix the doc.
@zr-msft @mlearned Can we add this fsType: NTFS setting to the YAML in the doc, or call it out in a note?
Static disks only work on Windows nodes with this setting.
On the same page, under creating a new disk, setting the --os-type option to Windows for a Windows disk is already mentioned, but that alone does not help for Windows node pools.
@lingtham I have replied to your MSDN thread. Please provide the required details (specified in the MSDN thread) so that I can enable a one-time free support ticket for you.
@lingtham We will continue the discussion offline. Please follow up with @jakaruna-MSFT in the MSDN thread.
Hi,
I am using Azure Kubernetes Service with Windows Server containers (in preview, version 1.14.0). I am following this doc to create an Azure disk to attach to a pod.
The problem I'm having is that when I try to deploy my app, I get a MountVolume failed error: MountVolume.MountDevice failed for volume "reducer1win" : azureDisk - mountDevice:FormatAndMount failed with diskMount: format disk failed, error: exit status 1, output: "Format-Volume : Cannot validate argument on parameter 'FileSystem'. The argument \"ext4\" does not belong to the set \r\n\"FAT,FAT32,exFAT,NTFS,ReFS\" specified by the ValidateSet attribute. Supply an argument that is in the set and then try \r\nthe command again.\r\nAt line:1 char:180\r\n+ ... nDriveLetter -UseMaximumSize | Format-Volume -FileSystem ext4 -Confir ...\r\n+ ~~~~\r\n + CategoryInfo : InvalidData: (:) [Format-Volume], ParameterBindingValidationException\r\n + FullyQualifiedErrorId : ParameterArgumentValidationError,Format-Volume\r\n \r\n" and clean up failed with :remove \var\lib\kubelet\plugins\kubernetes.io\azure-disk\mounts\m1971362558: The system cannot find the file specified. Warning FailedMount 103s (x2 over 4m) kubelet, aksnpwin000002 Unable to mount volumes for pod "reducer1-649687f9b7-wl2zq_default(2c6f81c8-9df3-11e9-a14d-b62be1e7749a)": timeout expired waiting for volumes to attach or mount for pod "default"/"reducer1-649687f9b7-wl2zq". list of unmounted volumes=[reducer1win]. list of unattached volumes=[reducer1win default-token-ffjdt]
The part of the error that I think is the issue is the ext4 file system. Ext4 is a filesystem for Linux and I'm running Windows containers, but I did use the --os-type windows parameter when creating my disks. Any ideas on how to fix this?
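For reference, I created the disk roughly like this, following the doc's example (the resource group, disk name, and size are the doc's placeholder values, not my actual ones):

```shell
# Create a managed disk for a Windows workload, following the doc's example
az disk create \
  --resource-group MC_myAKSCluster_myAKSCluster_eastus \
  --name myAKSDisk \
  --size-gb 20 \
  --os-type windows \
  --query id --output tsv
```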