Azure / kubernetes-keyvault-flexvol

Azure keyvault integration with Kubernetes via a Flex Volume
MIT License

Flex-volume with pod-identity fails to mount multiple times with 403 error then succeeds #76

Closed: samisq closed this issue 5 years ago

samisq commented 5 years ago

When deploying or restarting a pod that uses the keyvault flex-volume, volume mounting fails several times with a 403 error, then eventually succeeds after a few retries. Is this expected behavior? Is there a way to mitigate it and avoid this error?

Error details:

MountVolume.SetUp failed for volume "keyvault-secrets" : mount command failed, status: Failure, reason: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, F0213 07:28:59.049161 19766 main.go:80] [error] : failed to get keyvaultClient: failed to get key vault token: nmi response failed with status code: 403

NMI logs:

time="2019-02-13T07:28:35Z" level=info msg="Rules for table(nat) chain(aad-metadata) rules(-N aad-metadata, -A aad-metadata ! -s 127.0.0.1/32 -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.0.240.166:2579, -A aad-metadata -j RETURN)"
time="2019-02-13T07:28:59Z" level=error msg="no AzureAssignedIdentity found for pod:ecm-prod/ecm-datl-sec-manual-01-rzlkd, <nil>" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=info msg="Status (403) took 52610486 ns" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=error msg="no AzureAssignedIdentity found for pod:ecm-prod/ecm-datl-sec-manual-01-rzlkd, <nil>" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=info msg="Status (403) took 7896683 ns" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=error msg="no AzureAssignedIdentity found for pod:ecm-prod/ecm-datl-sec-manual-01-rzlkd, <nil>" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=info msg="Status (403) took 7797379 ns" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=error msg="no AzureAssignedIdentity found for pod:ecm-prod/ecm-datl-sec-manual-01-rzlkd, <nil>" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:28:59Z" level=info msg="Status (403) took 7302061 ns" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:29:00Z" level=info msg="matched identityType:0 clientid:545386ea-2886-4714-8960-ce274cc8563b resource:https://management.azure.com/" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:29:00Z" level=info msg="Status (200) took 16266383 ns" req.method=GET req.path=/host/token/ req.remote=127.0.0.1
time="2019-02-13T07:29:01Z" level=info msg="matched identityType:0 clientid:545386ea-2886-4714-8960-ce274cc8563b resource:https://vault.azure.net" req.method=GET req.path=/host/token/ req.remote=127.0.0.1

borqosky commented 5 years ago

I can confirm that I see exactly the same behavior. Here are some logs from the newest kv-flexvolume driver:

 Warning  FailedMount  22m   kubelet, aks-agentpool-13285615-1  MountVolume.SetUp failed for volume "secrets" : mount command failed, status: Failure, reason: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, F0213 12:28:30.938610 19232 main.go:80] [error] : failed to get keyvaultClient: failed to get key vault token: nmi response failed with status code: 403

NMI:

time="2019-02-13T12:41:28Z" level=error msg="failed to get service principal token for pod:m****y/vault-excited-liger-54769476f6-tztlk, adal: Refresh request failed. Status Code = '400'. Response body: {\"error\":\"invalid_request\",\"error_description\":\"Identity not found\"}" req.method=GET req.path=/host/token/ req.remote=127.0.0.1

Could this be a problem with ADAL/AD?

ritazh commented 5 years ago

Duplicate of #67

ritazh commented 5 years ago

Closed via #94