Closed JasonKAls closed 5 years ago
Thanks for reporting this issue @JasonKAls!
From the logs you have shared, it looks like the flexvolume driver was not able to mount the volume because pod identity could not get a successful auth token for Key Vault: "failed to get service principal token: nmi response failed with status code: 403". The error "The request content has the following duplicate identity ids" suggests the MIC is not able to assign the same identity to the same VM. See https://github.com/Azure/aad-pod-identity/issues/167
Can you please share the entire pod/deployment yaml with the pod identity label aadpodidbinding: k8s-secrets and the flexvolume definition?
And please share outputs for the following commands so that we can see all the resources using pod identity:
kubectl get azureidentity
kubectl get azureidentitybinding
kubectl get azureassignedidentity
kubectl get pod -o wide
Also, please make sure you are using mcr.microsoft.com/k8s/aad-pod-identity/nmi:1.4, as the above image has no tag.
cc @kkmsft @aramase
Thanks for responding, @ritazh!
I'll provide the other details, but the azure get commands for kubectl have never produced anything for me, even for the environments and pods where FlexVols are working. Any suggestions?
Hi @JasonKAls - the issue https://github.com/Azure/aad-pod-identity/issues/167 was reported in a context where there were already previously assigned identities on the node. Is that the case here? Do you have some user assigned identities already assigned on the node for some other operation?
Hello @kkmsft!
The nodes involved are from an AKS agentpool and only have 1 Managed Identity assigned to them.
@JasonKAls Can you pls verify what is the mechanism that assigned that one managed identity on that agent node? was it pod identity? did you manually assign the identity to that node? Thanks!
by using az identity create
@JasonKAls Thank you for the response. I've a few more questions -
az vm identity assign ?
ResourceID in the azureidentity during multiple retry attempts?
az vm identity show --resource-group <rg> --name <vm name> ? You can remove the subid from the output and post it here.
Hello @aramase!
Before I answer your questions, you should know @ritazh's suggestion to create Managed IDs per deployment seems to be working except for one.
az identity create. I then need to ask another department to assign the correct permissions to it.
az vm identity assign is denied for me.
Thanks again!
@JasonKAls Can you send it to me over email - anramase@microsoft.com. Please redact the client id/principal id and sensitive information from the output. I just need to verify the resource ID format as it appears in the output.
Done! For future reference for this ticket, JSON output should look like:
{
  "principalId": null,
  "tenantId": null,
  "type": "UserAssigned",
  "userAssignedIdentities": {
    "/subscriptions/************************/resourceGroups/*****************/providers/Microsoft.ManagedIdentity/userAssignedIdentities/prod********-nginx": {
      "clientId": "*******************",
      "principalId": "****************************"
    },
    "/subscriptions/**************************/resourceGroups/********************/providers/Microsoft.ManagedIdentity/userAssignedIdentities/prod***********-api": {
      "clientId": "*************************",
      "principalId": "**********************************"
    },
    "/subscriptions/************************/resourceGroups/*************************/providers/Microsoft.ManagedIdentity/userAssignedIdentities/prod-*****************-web": {
      "clientId": "*********************************",
      "principalId": "***************************************"
    }
  }
}
to show each ID attached to a VM. I have 3 on this one.
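For anyone comparing the identities on a node against their AzureIdentity definitions, here is a minimal Python sketch that pulls the resource IDs out of JSON shaped like the output above. The function name user_assigned_ids and the values in SAMPLE_OUTPUT are hypothetical placeholders, not real identities or part of any Azure tooling.

```python
import json
from typing import List

# Hypothetical sample shaped like the `az vm identity show` output above.
# All IDs and GUID-like values are made-up placeholders.
SAMPLE_OUTPUT = """
{
  "principalId": null,
  "tenantId": null,
  "type": "UserAssigned",
  "userAssignedIdentities": {
    "/subscriptions/0000/resourceGroups/example-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/example-web": {
      "clientId": "1111",
      "principalId": "2222"
    },
    "/subscriptions/0000/resourceGroups/example-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/example-api": {
      "clientId": "3333",
      "principalId": "4444"
    }
  }
}
"""


def user_assigned_ids(identity_json: str) -> List[str]:
    """Return the sorted resource IDs of the user-assigned identities."""
    doc = json.loads(identity_json)
    # The key may be absent or null when no user-assigned identity is attached.
    identities = doc.get("userAssignedIdentities") or {}
    return sorted(identities)


if __name__ == "__main__":
    for resource_id in user_assigned_ids(SAMPLE_OUTPUT):
        print(resource_id)
```

The printed IDs keep whatever letter case the JSON contains, which is the form to compare against the ResourceID in the AzureIdentity definition.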
@JasonKAls Thank you for posting the output and also for sending it to me through email. I was able to recreate this issue on my cluster.
The issue was caused because the check for whether an identity already exists on the node was case sensitive. The ResourceID defined in the AzureIdentity you provided is /subscriptions/****************/resourcegroups/************/providers/Microsoft.ManagedIdentity/userAssignedIdentities/**************, while the identity that already existed on the node had the format /subscriptions/************************/resourceGroups/*****************/providers/Microsoft.ManagedIdentity/userAssignedIdentities/prod********. The difference is in the case of the string resourceGroups on the node versus resourcegroups in the AzureIdentity definition. The check should have been case insensitive.
The fix has been merged (https://github.com/Azure/aad-pod-identity/pull/271) and will be included in the next release.
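To illustrate the kind of comparison described above, here is a minimal Python sketch of a case-insensitive resource ID check. It is not the code from the linked pull request. The function name same_resource_id and both example IDs are hypothetical placeholders.

```python
def same_resource_id(a: str, b: str) -> bool:
    """Compare two Azure resource IDs while ignoring letter case."""
    # casefold() normalizes case, so "resourceGroups" and "resourcegroups"
    # compare as equal.
    return a.casefold() == b.casefold()


# Hypothetical IDs that differ only in the case of "resourceGroups".
NODE_ID = (
    "/subscriptions/0000/resourceGroups/example-rg/providers/"
    "Microsoft.ManagedIdentity/userAssignedIdentities/example-id"
)
SPEC_ID = (
    "/subscriptions/0000/resourcegroups/example-rg/providers/"
    "Microsoft.ManagedIdentity/userAssignedIdentities/example-id"
)

if __name__ == "__main__":
    print(NODE_ID == SPEC_ID)                  # False: exact match fails
    print(same_resource_id(NODE_ID, SPEC_ID))  # True: case is ignored
```

With an exact string comparison the two IDs look like different identities, which matches the duplicate-assignment behavior reported in this thread.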
Hello @aramase!
That's fantastic news! Thanks for your team's diligent attention to this issue and for providing a temporary solution for me and my team. I look forward to the next release!
@JasonKAls 1.5-rc2 is now available. Please try it out and share your feedback ahead of the 1.5 release - https://github.com/Azure/aad-pod-identity/releases/tag/1.5-rc2
Closing this issue now. Please reopen if you have any further issues.
cc @ritazh
Hello,
I've been working with MS support engineers for a while now trying to get FlexVolumes mounted to several pods to integrate KeyVault and AKS (my K8s cluster). I've run into several problems that have made it hard to understand the underlying causes and their solutions. One of the biggest issues is that, after several hours of waiting for FlexVols to mount, it seems to spontaneously work. However, it only works for some pods and not others, even when the configuration is basically the same for all of them. Below is a detailed description of the errors I've received and tried to solve, and how I'm currently trying to use this setup:
Steps To Reproduce
Flexvolume Setup:
Pod Identity Setup:
Pod Volumes Setup:
Errors Received:
From Virtual Machines in AKS agentpool:
From MC Resource Group:
From Pods after 30min:
From Pods after 1hour:
Expected behavior
Pods should successfully mount the FlexVolume and have the needed KeyVault Secrets mounted.
Key Vault FlexVolume version
image: "mcr.microsoft.com/k8s/flexvolume/keyvault-flexvolume:v0.0.10"
Access mode: service principal or pod identity
I'm using Pod Identity.
Kubernetes version
Additional context
Please let me know if I can provide more information.