Azure / kubernetes-keyvault-flexvol

Azure keyvault integration with Kubernetes via a Flex Volume
MIT License
253 stars 84 forks source link

kv flex-volume mounting after multiple failures #67

Closed hjarraya closed 5 years ago

hjarraya commented 5 years ago

Hi,

I am having an issue that I am not able to explain/isolate. When the pod starts it fails to mount and then after few minute it successfully mount the kv-flexvolume, this is not consistent sometimes it never mounts.

Events:
  Type     Reason       Age                   From                             Message
  ----     ------       ----                  ----                             -------
  Normal   Scheduled    12m                   default-scheduler                Successfully assigned externaldns/external-dns-64c4488cd9-7bzkl to aks-default-38594516-0
  Warning  FailedMount  8m45s (x10 over 12m)  kubelet, aks-default-38594516-0  MountVolume.SetUp failed for volume "azurekv" : mount command failed, status: Failure, reason: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault
  Warning  FailedMount  8m38s (x2 over 10m)   kubelet, aks-default-38594516-0  Unable to mount volumes for pod "external-dns-64c4488cd9-7bzkl_externaldns(e6f19211-2a35-11e9-8203-32eed183208c)": timeout expired waiting for volumes to attach or mount for pod "externaldns"/"external-dns-64c4488cd9-7bzkl". list of unmounted volumes=[azurekv]. list of unattached volumes=[azurekv external-dns-token-6bbqw]
  Normal   Pulling      6m33s                 kubelet, aks-default-38594516-0  pulling image "registry.opensource.zalan.do/teapot/external-dns:v0.5.9"
  Normal   Pulled       6m32s                 kubelet, aks-default-38594516-0  Successfully pulled image "registry.opensource.zalan.do/teapot/external-dns:v0.5.9"
  Normal   Created      6m29s                 kubelet, aks-default-38594516-0  Created container
  Normal   Started      6m29s                 kubelet, aks-default-38594516-0  Started container

Here is the logs from the aks-default-38594516-0

`XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:12:01.865336    5671 main.go:98] starting the azurekeyvault-flexvolume, 0.0.6
I0206 18:12:01.865388    5671 main.go:275] subscriptionID: XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:12:01.865392    5671 main.go:276] vaultName: dev-aurora-vault
I0206 18:12:01.865395    5671 main.go:277] resourceGroup: aurora
I0206 18:12:01.865419    5671 oauth.go:135] azure: using pod identity to retrieve token

[error] failed to get key vault, error: failed to get vault, error: keyvault.VaultsClient#Get: Failure responding to request: StatusCode=401 -- Original Error: autorest/azure: Service returned an error. Status=401 Code="AuthenticationFailed" Message="Authentication failed. The 'Authorization' header is missing."

 failed to get key vault
Wed Feb 6 18:12:01 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault "}
Wed Feb  6 18:12:18 UTC 2019 ismounted | /var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv does not exist
Wed Feb  6 18:12:18 UTC 2019 PODNAME: external-dns-64c4488cd9-jllvm
Wed Feb  6 18:12:18 UTC 2019 EXEC: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=dev-aurora-vault -vaultObjectNames=azure-config -resourceGroup=aurora -dir=/var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv -subscriptionId=XXXX-XXXX-XXXX-XXXX-XXXX -cloudName= -tenantId=XXXXXX-XXXX-XXXX-XXXX-XXXXXXX -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=ambassador -podName=external-dns-64c4488cd9-jllvm -vaultObjectVersions= -vaultObjectTypes=secret
dev-aurora-vault
azurekeyvault-flexvolume 0.0.6
XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:12:18.017379    6140 main.go:98] starting the azurekeyvault-flexvolume, 0.0.6
I0206 18:12:18.017441    6140 main.go:275] subscriptionID: XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:12:18.017446    6140 main.go:276] vaultName: dev-aurora-vault
I0206 18:12:18.017450    6140 main.go:277] resourceGroup: aurora
I0206 18:12:18.017486    6140 oauth.go:135] azure: using pod identity to retrieve token

[error] failed to get key vault, error: failed to get vault, error: keyvault.VaultsClient#Get: Failure responding to request: StatusCode=401 -- Original Error: autorest/azure: Service returned an error. Status=401 Code="AuthenticationFailed" Message="Authentication failed. The 'Authorization' header is missing."

 failed to get key vault
Wed Feb 6 18:12:18 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault "}
Wed Feb  6 18:12:50 UTC 2019 ismounted | /var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv does not exist
Wed Feb  6 18:12:50 UTC 2019 PODNAME: external-dns-64c4488cd9-jllvm
Wed Feb  6 18:12:50 UTC 2019 EXEC: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=dev-aurora-vault -vaultObjectNames=azure-config -resourceGroup=aurora -dir=/var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv -subscriptionId=XXXX-XXXX-XXXX-XXXX-XXXX -cloudName= -tenantId=XXXXXX-XXXX-XXXX-XXXX-XXXXXXX -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=ambassador -podName=external-dns-64c4488cd9-jllvm -vaultObjectVersions= -vaultObjectTypes=secret
dev-aurora-vault
azurekeyvault-flexvolume 0.0.6
XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:12:50.208958    6881 main.go:98] starting the azurekeyvault-flexvolume, 0.0.6
I0206 18:12:50.209012    6881 main.go:275] subscriptionID: XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:12:50.209016    6881 main.go:276] vaultName: dev-aurora-vault
I0206 18:12:50.209019    6881 main.go:277] resourceGroup: aurora
I0206 18:12:50.209043    6881 oauth.go:135] azure: using pod identity to retrieve token

[error] failed to get key vault, error: failed to get vault, error: keyvault.VaultsClient#Get: Failure responding to request: StatusCode=401 -- Original Error: autorest/azure: Service returned an error. Status=401 Code="AuthenticationFailed" Message="Authentication failed. The 'Authorization' header is missing."

 failed to get key vault
Wed Feb 6 18:12:50 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault "}
Wed Feb  6 18:13:54 UTC 2019 ismounted | /var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv does not exist
Wed Feb  6 18:13:54 UTC 2019 PODNAME: external-dns-64c4488cd9-jllvm
Wed Feb  6 18:13:54 UTC 2019 EXEC: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=dev-aurora-vault -vaultObjectNames=azure-config -resourceGroup=aurora -dir=/var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv -subscriptionId=XXXX-XXXX-XXXX-XXXX-XXXX -cloudName= -tenantId=XXXXXX-XXXX-XXXX-XXXX-XXXXXXX -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=ambassador -podName=external-dns-64c4488cd9-jllvm -vaultObjectVersions= -vaultObjectTypes=secret
dev-aurora-vault
azurekeyvault-flexvolume 0.0.6
XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:13:54.398159    8254 main.go:98] starting the azurekeyvault-flexvolume, 0.0.6
I0206 18:13:54.398225    8254 main.go:275] subscriptionID: XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:13:54.398229    8254 main.go:276] vaultName: dev-aurora-vault
I0206 18:13:54.398233    8254 main.go:277] resourceGroup: aurora
I0206 18:13:54.398261    8254 oauth.go:135] azure: using pod identity to retrieve token

[error] failed to get key vault, error: failed to get vault, error: keyvault.VaultsClient#Get: Failure responding to request: StatusCode=401 -- Original Error: autorest/azure: Service returned an error. Status=401 Code="AuthenticationFailed" Message="Authentication failed. The 'Authorization' header is missing."

 failed to get key vault
Wed Feb 6 18:13:54 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, failed to get key vault "}
Wed Feb  6 18:15:56 UTC 2019 ismounted | /var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv does not exist
Wed Feb  6 18:15:56 UTC 2019 PODNAME: external-dns-64c4488cd9-jllvm
Wed Feb  6 18:15:56 UTC 2019 EXEC: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=dev-aurora-vault -vaultObjectNames=azure-config -resourceGroup=aurora -dir=/var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv -subscriptionId=XXXX-XXXX-XXXX-XXXX-XXXX -cloudName= -tenantId=XXXXXX-XXXX-XXXX-XXXX-XXXXXXX -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=ambassador -podName=external-dns-64c4488cd9-jllvm -vaultObjectVersions= -vaultObjectTypes=secret
dev-aurora-vault
azurekeyvault-flexvolume 0.0.6
XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:15:56.656382   11666 main.go:98] starting the azurekeyvault-flexvolume, 0.0.6
I0206 18:15:56.656482   11666 main.go:275] subscriptionID: XXXX-XXXX-XXXX-XXXX-XXXX
I0206 18:15:56.656486   11666 main.go:276] vaultName: dev-aurora-vault
I0206 18:15:56.656489   11666 main.go:277] resourceGroup: aurora
I0206 18:15:56.656515   11666 oauth.go:135] azure: using pod identity to retrieve token

 accesstoken: eyJ0##### REDACTED #####i3IQ

 clientid: 59f7##### REDACTED #####73e5
I0206 18:15:57.232027   11666 oauth.go:135] azure: using pod identity to retrieve token

 accesstoken: eyJ0##### REDACTED #####QreQ

 clientid: 59f7##### REDACTED #####73e5
I0206 18:15:57.339939   11666 main.go:132] retrieving secret azure-config (version: )
I0206 18:15:57.512055   11666 main.go:176] azure KeyVault wrote secret azure-config at /var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv
Wed Feb  6 18:15:57 UTC 2019 EXEC: ls /var/lib/kubelet/pods/a95407df-2a3a-11e9-8203-32eed183208c/volumes/azure~kv/azurekv
azure-config
Wed Feb 6 18:15:57 UTC 2019 INFO: {"status": "Success"}
ritazh commented 5 years ago

@hjarraya The entire AzureAssignedIdentity creation process gets triggered when we create the volume at pod creation time. All these components (find identity, assign identity, retrieve token for key vault) will take time to complete. If volume mount fails, the kubelet will retry. Hence you are seeing errors in the log but looks like eventually all the dependent components got created successfully. So this is expected. Hope this helps.

juona commented 5 years ago

Thank you for the explanation @ritazh , now we know not to investigate this. Gotta second @hjarraya though in that the behaviour does not seem consistent nor reliable; doesn't feel like it's purely a documentation issue.

ritazh commented 5 years ago

closed via #94

ritazh commented 5 years ago

This experience can be improved by fixing issue https://github.com/Azure/aad-pod-identity/issues/181

kkmsft commented 5 years ago

Currently the key vault calls to NMI is not retried by the key vault. The retry is getting added to the NMI calls themselves in https://github.com/Azure/aad-pod-identity/issues/181. Re opening this issue to ensure that we retry in the key vault itself when it tries to attempt reaching the NMI fails. This will give key vault flex volume ability to tune its retry attempts to ensure that the kubelet does not time out the mount attempt and make it more resilient from underlying vm/vmss assignment attempts by aad pod identity.

ritazh commented 5 years ago

Closed via #102