Closed stevenicholl closed 5 years ago
Yet
kubectl exec -it nginx-flex-kv-podid cat /kv/password
successfully returns the correct value of the secret referenced.
@stevenicholl If you see this working, then definitely the volume has mounted and the AzureAssignedIdentity
has been created.
The entire AzureAssignedIdentity
creation process gets triggered when we create the volume at pod creation time. All these components (find identity, assign identity, retrieve token for key vault) will take time to complete. If volume mount fails, the kubelet will retry. Hence you are seeing errors in the log but looks like eventually all the dependent components got created successfully.
Thank you @ritazh
On one hand it is good to get some reassurance that what I was seeing can be explained.
On the other hand, based on the assumption everything has worked because I can retrieve a secret, grateful for any suggestion to prevent the failed mount log against the pod. When describing the pod, the failed mount log would lead anyone to believe the process had failed even though we now know it didn't. Perhaps some way to write to further message of success when it eventually mounts? I'm keen to avoid having error messages that incorrectly describe the state of our pods in production.
@stevenicholl Can you please share the output of kubectl describe pod <name of the pod with the mounted secret>
then kubectl get events
. Please redact any sensitive information.
The failed messages will be added to the events until the volume has been successfully mounted. At the end of your events list on the pod, you should see Created container
and Started container
.
Hi @ritazh
Below is the output for the kubectl describe pod <name of the pod with the mounted secret>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 23s default-scheduler Successfully assigned flexvoltest/nginx-flex-kv-podid to aks-agentpool-11854310-5
Warning FailedMount 22s kubelet, aks-agentpool-11854310-5 MountVolume.SetUp failed for volume "test" : mount command failed, status: Failure, reason: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault
Normal Pulling 16s kubelet, aks-agentpool-11854310-5 pulling image "nginx"
Normal Pulled 15s kubelet, aks-agentpool-11854310-5 Successfully pulled image "nginx"
Normal Created 14s kubelet, aks-agentpool-11854310-5 Created container
Normal Started 14s kubelet, aks-agentpool-11854310-5 Started container
On running kubectl get events
against the namespace I get: (note this was executed a number of minutes after above so the timings won't perfectly match):
4m52s Normal Scheduled Pod Successfully assigned flexvoltest/nginx-flex-kv-podid to aks-agentpool-11854310-5
4m51s Warning FailedMount Pod MountVolume.SetUp failed for volume "test" : mount command failed, status: Failure, reason: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault
4m45s Normal Pulling Pod pulling image "nginx"
4m44s Normal Pulled Pod Successfully pulled image "nginx"
4m43s Normal Created Pod Created container
4m43s Normal Started Pod Started container
In the cluster events I get a corresponding azure-identity-binding applying at exactly the time the container started:
4m43s Normal binding applied AzureIdentityBinding Binding azure-identity-binding applied on node aks-agentpool-11854310-5 for pod nginx-flex-kv-podid-flexvoltest-azure-identity
If I have understood correctly, this is what you were expecting to see and it confirms the behaviour described ie. it eventually successfully mounts? I do still think the K8's events are a bit misleading in that, whilst we know the secret mounted correctly, it still logs a failure.
@stevenicholl yes this is the expected behavior following the pod's creation lifecycle. Whenever a volume mount attempt has failed, the kubelet will retry mounting the failed volume and the failed event will be recorded. These events help us debug if any of the volume mounts failed. They are the history of pod, not current state of the pod. If any of the volume mounts failed, then the pod will not be created, its STATUS
will be ContainerCreating
. As you can see from your pod events, eventually as soon as the volume is mounted successfully, the container is created, then started. The pod's STATUS
is Running
.
@ritazh , thank you for confirmation. We will proceed on the basis that, if the pods start, the volumes have mounted. If they didn't mount correctly, the pod creation process will not have finished and will be stuck in a ContainerCreating
state.
@stevenicholl That's correct. Closing this issue for now, but feel free to reopen or file new issues if you have additional questions.
Thanks @ritazh , The only other thing I'd be interested to know if it is understood why the mounts initially fail? This only occurs for us when it is a KeyVault FlexVolume, no issues when mounting secrets to pods from Kubernetes secrets.
Currently implementing a test of Azure KV FlexVolumes with AAD PodId in an AKS cluster but experiencing some strange behaviour.
AAD PodId K8's components deployed using helm https://github.com/Azure/aad-pod-identity/tree/master/charts/aad-pod-identity & https://raw.githubusercontent.com/Azure/kubernetes-keyvault-flexvol/master/deployment/kv-flexvol-installer.yaml . Azure components (User Managed ID & access controls) deployed with Terraform. All of the Azure components seem to be fine.
A test pod with a volume mount and a test secret have been deployed using:
Things get interesting now.
mic log records the following:
nmi pod logging a 400, Identity not found:
Expecting to see logs to the effect of 10 & 12 detailed here regarding PodId https://github.com/Azure/aad-pod-identity/tree/master/charts/aad-pod-identity . Save this one for a bit later because it would appear from these logs that any lookup using PodId will fail. It doesn't.
The log related to the failed volume mount is:
Yet
kubectl exec -it nginx-flex-kv-podid cat /kv/password
successfully returns the correct value of the secret referenced. This suggests the AAD PodId is working fine and that all the permissions set to allow the AKS Service Principle to use the User Managed ID to authenticate to and read from KeyVault are also ok.Interesting conundrum. Grateful for any input that may help explain this behaviour.