Azure / kubernetes-keyvault-flexvol

Azure keyvault integration with Kubernetes via a Flex Volume
MIT License

Always getting ERROR: failed to get keyvaultClient: failed to get key vault token: failed to get service principal token #180

Closed pwc-sc closed 4 years ago

pwc-sc commented 4 years ago

Describe the bug

The deployed pod, the same as in your provided sample (nginx-flex-kv-podid), always fails with:

ERROR: Unable to mount volumes for pod "nginx-flex-kv-podid_default(af5c5b2c-1a1c-4bc6-91da-f83479ec0c96)":timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-flex-kv-podid". list of unmounted volumes=[test]. list of unattached volumes=[test default-token-s9ccf]
Warning FailedMount 4s kubelet, aks-devnodepool-42371630-vmss000000 MountVolume.SetUp failed for volume "test" : mount command failed, status: Failure, reason: /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, F0307 13:46:13.048394 41097 main.go:82] [error] : failed to get keyvaultClient: failed to get key vault token: failed to get service principal token: nmi response failed with status code: 404

Steps To Reproduce

Providing the yaml definitions as per instruction on

cat aadpodidentitybinding.yaml
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: a-idname
spec:
  AzureIdentity: a-idname
  Selector: aadpodidbinding

cat aadpodidentity.yaml
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: a-idname
spec:
  type: 0
  ResourceID: /subscriptions/xxxxxxxxxxx/resourcegroups/example/providers/Microsoft.ManagedIdentity/userAssignedIdentities/silviuchiric
  ClientID: xxxxxxxxxx

Expected behavior

Mount the secrets from Azure Key Vault into the VolumeMount.

Key Vault FlexVolume version

Access mode: service principal or pod identity

PodIdentity

Kubernetes version

1.15

Additional context

pwc-sc commented 4 years ago

Dear team

I succeeded in making it work, but only using the Service Principal as explained at: https://github.com/Azure/kubernetes-keyvault-flexvol#option-1-service-principal

The only question I have, please: how can I use pods and deployments in namespaces other than default, which is the only one where I can mount the volume with the secrets from Key Vault?

A "how to", please...

Nice weekend!

ritazh commented 4 years ago

@pwc-sc for pod identity mode: Can you please provide the following information?

  1. Output from kubectl get azureassignedidentities -o yaml. [Please redact any sensitive information from the output before you post it here].
  2. Can you also post the complete logs from MIC and NMI pods?

Feel free to search existing issues in https://github.com/Azure/aad-pod-identity or open an issue there.

pwc-sc commented 4 years ago

Hello @ritazh, these are the running resources in the default namespace:

kubectl get all --all-namespaces |grep -i identity
default pod/aad-pod-identity-mic-5d48ccdf96-qq5gp 1/1 Running 0 3d22h
default pod/aad-pod-identity-mic-5d48ccdf96-vxrxg 1/1 Running 0 3d22h
default deployment.apps/aad-pod-identity-mic 2/2 2 2 3d22h
default replicaset.apps/aad-pod-identity-mic-5d48ccdf96 2 2 2 3d22h

aramase commented 4 years ago

@pwc-sc Can you provide the output and logs that @ritazh requested? The logs will provide details on any errors.

aramase commented 4 years ago

Closing this issue because of inactivity. Please feel free to reopen if you're still having issues.

pplavetzki commented 4 years ago

I've been able to reproduce the described bug in my environment. The MIC pod logs the following errors:

E0421 23:44:00.086777       1 vmss.go:75] compute.VirtualMachineScaleSetsClient#CreateOrUpdate: Failure sending request: StatusCode=403 -- Original Error: Code="AuthorizationFailed" Message="The client 'xxxxxxx-xxxx-xxxx-xxx-xxxxx' with object id 'xxxxx-xxxx-xxxxx-xxxxx-xxxxxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/write' over scope '/subscriptions/xxxxxxx-xxxxx-xxxxx-xxxx-xxxxxx/resourceGroups/MC_xx-xx_xx-xxx-dxxxev_w/providers/Microsoft.Compute/virtualMachineScaleSets/xxxxx-xxxxxx-xxxxxxxxx-vmss' or the scope is invalid. If access was recently granted, please refresh your credentials."
E0421 23:44:00.086829       1 mic.go:831] Updating msis on node xxx-xxxxxx-xxxxxx-vmss, add [1], del [0] failed with error compute.VirtualMachineScaleSetsClient#CreateOrUpdate: Failure sending request: StatusCode=403 -- Original Error: Code="AuthorizationFailed" Message="The client 'xxxxxxxx-xxxx-xxx-xxxx-xxxxxxx' with object id 'xxxxx-xxxx-xxxx-xxxxx-xxxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/write' over scope '/subscriptions/xxxxxx-xxxxx-xxxxx-xxxx-xxxxxx/resourceGroups/MC_xxx-xxx_xxx-xxx-dev_westus2/providers/Microsoft.Compute/virtualMachineScaleSets/xxx-xxx-xxxxxx-vmss' or the scope is invalid. If access was recently granted, please refresh your credentials."
E0421 23:44:00.164827       1 mic.go:848] Applying binding xxxx-xxx-xxx-xxxx node xxx-xxxx-xxxxxxx-vmss000001 for pod nginx-secrets-store-inline-podid-default-xxx-xxxx-azure-identity resulted in error compute.VirtualMachineScaleSetsClient#CreateOrUpdate: Failure sending request: StatusCode=403 -- Original Error: Code="AuthorizationFailed" Message="The client 'xxxxx-xxxxx-xxxx-xxxx-xxxxxx' with object id 'xxxxxxx-xxxxx-xxxx-xxxx-xxxxxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/write' over scope '/subscriptions/xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/MC_xxxxx_xxxxx_xxxx/providers/Microsoft.Compute/virtualMachineScaleSets/aks-xxxxx-xxxxx-vmss' or the scope is invalid. If access was recently granted, please refresh your credentials."

Also here is the log for nmi:

time="2020-04-22T14:24:21Z" level=error msg="no AzureAssignedIdentity found for pod:kube-system/omsagent-xscx7 in assigned state, context canceled" req.method=GET req.path=/metadata/identity/oauth2/token req.remote=50.50.0.11
time="2020-04-22T14:24:21Z" level=info msg="Status (404) took 60044516430 ns" req.method=GET req.path=/metadata/identity/oauth2/token req.remote=50.50.0.11

Access mode: service principal or pod identity

PodIdentity

Kubernetes version

1.16.7

aramase commented 4 years ago

@pplavetzki Looks like the SP/user assigned identity doesn't have the required Virtual Machine Contributor permissions on the vmss. If you're using a Managed identity enabled AKS cluster, the required role assignments for pod-identity are documented here - https://github.com/Azure/aad-pod-identity/blob/master/docs/readmes/README.msi.md. Once you provide the required role assignments, restart the MIC pods for the role assignments to take effect.
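When triaging MIC output like the errors quoted above, it can help to pull the denied action and scope out of the AuthorizationFailed message, since those two values indicate which role assignment is missing and where. The following is a minimal Python sketch, not part of aad-pod-identity; the helper name `denied_action_and_scope` and the regular expression are assumptions based only on the wording of the log lines in this thread.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: mirrors the wording of the Azure "AuthorizationFailed"
# message as it appears in the MIC logs quoted in this thread.
_AUTH_FAILED = re.compile(
    r"does not have authorization to perform action '(?P<action>[^']+)' "
    r"over scope '(?P<scope>[^']+)'"
)


def denied_action_and_scope(log_line: str) -> Optional[Tuple[str, str]]:
    """Return (action, scope) for an AuthorizationFailed line, else None."""
    match = _AUTH_FAILED.search(log_line)
    if match is None:
        return None
    return match.group("action"), match.group("scope")


# Redacted sample shaped like the MIC errors above.
sample = (
    "Code=\"AuthorizationFailed\" Message=\"The client 'xxxx' with object id 'xxxx' "
    "does not have authorization to perform action "
    "'Microsoft.Compute/virtualMachineScaleSets/write' over scope "
    "'/subscriptions/xxxx/resourceGroups/MC_xxxx/providers/"
    "Microsoft.Compute/virtualMachineScaleSets/aks-xxxx-vmss' or the scope is invalid.\""
)

print(denied_action_and_scope(sample))
```

In the logs above, the denied action is a write on the VMSS in the MC_ resource group, which is consistent with a missing Virtual Machine Contributor assignment on that scope.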

sabideep1 commented 4 years ago

Need help with this issue.

I am getting the error below while deploying the container.

Logs from Pod: failed to get keyvaultClient: failed to get key vault token: nmi response failed with status code: 404

MIC logs: vmss.go:117] azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to

NMI logs: 1 server.go:191] Status (404) took 100003621803 ns for req.method=GET reg.path=/host/token/ req.remote=127.0.0.1

aramase commented 4 years ago

@sabideep1 Can you provide the logs from MIC pods?

kubectl get pods -l component=mic
kubectl logs <mic pod>

The MIC pods should contain logs as to why identity assignment failed. Please ensure you've provided all the required role assignments as documented here - https://github.com/Azure/aad-pod-identity/blob/master/docs/readmes/README.role-assignment.md and the AzureIdentity and AzureIdentityBinding have the correct case - https://github.com/Azure/aad-pod-identity#v160-breaking-change

sabideep1 commented 4 years ago

@aramase MIC logs:

Failed to update VM with error azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/********/resourceGroups/****/providers/Microsoft.Compute/virtualMachineScaleSets/**: StatusCode=0 -- Original Error: adal: Failed to execute the refresh request. Error = 'Post "https://login.microsoftonline.com//oauth2/token?api-version=1.0": dial tcp: i/o timeout'
E0619 18:00:10.859505 1 mic.go:1080] Updating msis on node *****, add [1], del [0], update[0] failed with error azure.BearerAuthorizer#WithAuthorization:

sabideep commented 4 years ago

@aramase ClusterResourceGroup: We have separate resource groups for AKS and VMSS. Is this the AKS or the VMSS ClusterResourceGroup? I tested with the AKS resource group and it works.

az role assignment create --role "Managed Identity Operator" --assignee --scope /subscriptions//resourcegroups/ClusterResourceGroup
az role assignment create --role "Virtual Machine Contributor" --assignee --scope /subscriptions//resourcegroups/ClusterResourceGroup

aramase commented 4 years ago

@sabideep As documented here - https://github.com/Azure/aad-pod-identity/blob/master/docs/readmes/README.role-assignment.md#performing-role-assignments

For AKS cluster, the cluster resource group refers to the resource group with a MC_ prefix, which contains all of the infrastructure resources associated with the cluster like VM/VMSS.

In other words, the cluster resource group is the MC_ resource group, where the VMSS are.

If you have identities in a different resource group than the cluster resource group, the additional required role assignments are documented here - https://github.com/Azure/aad-pod-identity/blob/master/docs/readmes/README.role-assignment.md#user-assigned-identities-that-are-not-within-the-cluster-resource-group
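To make the scope question concrete, here is a minimal Python sketch that assembles the two role-assignment commands discussed in this thread against the MC_ resource group. It is an illustration under stated assumptions, not an official tool: the function names, the placeholder subscription ID, the assignee GUID, and the resource group name `MC_myrg_mycluster_westus2` are all hypothetical. The real node resource group name can be read with `az aks show --query nodeResourceGroup`.

```python
import shlex
from typing import List

# Roles named in this thread for pod identity to work.
ROLES = ["Managed Identity Operator", "Virtual Machine Contributor"]


def node_rg_scope(subscription_id: str, node_resource_group: str) -> str:
    """Build the role-assignment scope for the MC_ (node) resource group."""
    return f"/subscriptions/{subscription_id}/resourcegroups/{node_resource_group}"


def role_assignment_commands(assignee: str, scope: str) -> List[str]:
    """Return one shell-quoted `az role assignment create` command per role."""
    return [
        shlex.join(
            ["az", "role", "assignment", "create",
             "--role", role, "--assignee", assignee, "--scope", scope]
        )
        for role in ROLES
    ]


# Hypothetical values for illustration only.
scope = node_rg_scope("sub-id", "MC_myrg_mycluster_westus2")
commands = role_assignment_commands("00000000-0000-0000-0000-000000000000", scope)
for command in commands:
    print(command)
```

The sketch only prints the commands so they can be reviewed before being run. As noted earlier in the thread, the MIC pods need a restart after the assignments are created.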