Azure / kubernetes-keyvault-flexvol

Azure keyvault integration with Kubernetes via a Flex Volume
MIT License

Brand new install - getting error "invalid character" #81

Closed dbilleci-lightstream closed 5 years ago

dbilleci-lightstream commented 5 years ago

Hello, I'm trying this out for the first time. I followed the instructions on a brand-new cluster that I set up today, and I've written down the steps for the whole process below.

    # Install these

    kubectl create -f https://raw.githubusercontent.com/Azure/kubernetes-keyvault-flexvol/master/deployment/kv-flexvol-installer.yaml
    kubectl create -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment-rbac.yaml

    # Add the existing identity with access to the keyvaults
    apiVersion: "aadpodidentity.k8s.io/v1"
    kind: AzureIdentity
    metadata:
     name: 0afbc123-3-eus2-uai
    spec:
     type: 0
     ResourceID: /subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai
     ClientID: 22222222-2222-2222-2222-222222222222

    # adding 0afbc123 as a binding
    apiVersion: "aadpodidentity.k8s.io/v1"
    kind: AzureIdentityBinding
    metadata:
     name: 0afbc123-3-eus2-uai-binding
    spec:
     AzureIdentity: 0afbc123-3-eus2-uai
     Selector: 0afbc123-3-eus2-uai-selector

    # add a deployment that uses this stuff
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        app: nginx-flex-kv-podid
        aadpodidbinding: "0afbc123-3-eus2-uai-selector"
      name: nginx-flex-kv-podid
    spec:
      containers:
      - name: nginx-flex-kv-podid
        image: nginx
        volumeMounts:
        - name: test
          mountPath: /kvmnt
          readOnly: true
      volumes:
      - name: test
        flexVolume:
          driver: "azure/kv"
          options:
            usepodidentity: "true"
            keyvaultname: "thetestkeyvault"
            keyvaultobjectnames: "TEST"
            keyvaultobjecttypes: secret
            resourcegroup: "TestKeyVault-Rg"
            subscriptionid: "11111111-1111-1111-1111-111111111111"
            tenantid: "33333333-3333-3333-3333-333333333333"

After I run this, the nginx pod won't start up. Describing the pod shows this error:

    MountVolume.SetUp failed for volume "test" : invalid character 'I' looking for beginning of value
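For context, the wording of this message matches a JSON parse failure: the kubelet expects a FlexVolume driver call to print a JSON status object, and here the output evidently began with the letter 'I' instead. The sketch below is only a quick way to tell the two cases apart when inspecting driver output by hand. The function name is hypothetical, and the sample strings are illustrative, not captured driver output.

```shell
# Hypothetical helper: succeeds if the text starts with '{' after optional
# leading whitespace, i.e. it could be a JSON status object.
looks_like_json_object() {
  # Remove the leading run of whitespace characters, if any.
  trimmed="${1#"${1%%[![:space:]]*}"}"
  case "$trimmed" in
    '{'*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Illustrative inputs (assumed for the example).
looks_like_json_object '{"status": "Success"}' && echo "JSON-like"
looks_like_json_object 'I0215 some log line'   || echo "not JSON: starts with I"
```

A check like this says nothing about why the driver failed; it only confirms that whatever the kubelet received was not the status object it was looking for.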

My identity 0afbc123-3-eus2-uai does have the Reader role on thetestkeyvault, as well as get/list permissions on secrets, certs, and keys.

Here's my describe on the pod:

    root@facfdb90317b:/data# kubectl describe pods/nginx-flex-kv-podid
    Name:         nginx-flex-kv-podid
    Namespace:    default
    Node:         aks-default-15419034-0/10.197.54.6
    Start Time:   Fri, 15 Feb 2019 02:35:09 +0000
    Labels:       aadpodidbinding=0afbc123-3-eus2-uai-selector
                  app=nginx-flex-kv-podid
    Annotations:  <none>
    Status:       Pending
    IP:           
    Containers:
      nginx-flex-kv-podid:
        Container ID:   
        Image:          nginx
        Image ID:       
        Port:           <none>
        Host Port:      <none>
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Environment:
          KUBERNETES_PORT_443_TCP_ADDR:  mycontainer.hcp.eastus2.azmk8s.io
          KUBERNETES_PORT:               tcp://mycontainer.hcp.eastus2.azmk8s.io:443
          KUBERNETES_PORT_443_TCP:       tcp://mycontainer.hcp.eastus2.azmk8s.io:443
          KUBERNETES_SERVICE_HOST:       mycontainer.hcp.eastus2.azmk8s.io
        Mounts:
          /kvmnt from test (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-sxp96 (ro)
    Conditions:
      Type           Status
      Initialized    True 
      Ready          False 
      PodScheduled   True 
    Volumes:
      test:
        Type:       FlexVolume (a generic volume resource that is provisioned/attached using an exec based plugin)
        Driver:     azure/kv
        FSType:     
        SecretRef:  nil
        ReadOnly:   false
        Options:    map[keyvaultobjecttypes:secret resourcegroup:TestKeyVault-Rg subscriptionid:11111111-1111-1111-1111-111111111111 tenantid:22222222-2222-2222-2222-222222222222 usepodidentity:true keyvaultname:thetestkeyvault keyvaultobjectnames:TEST]
      default-token-sxp96:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-sxp96
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason                 Age                From                             Message
      ----     ------                 ----               ----                             -------
      Normal   Scheduled              53s                default-scheduler                Successfully assigned nginx-flex-kv-podid to aks-default-15419034-0
      Normal   SuccessfulMountVolume  52s                kubelet, aks-default-15419034-0  MountVolume.SetUp succeeded for volume "default-token-sxp96"
      Warning  FailedMount            20s (x7 over 52s)  kubelet, aks-default-15419034-0  MountVolume.SetUp failed for volume "test" : invalid character 'I' looking for beginning of value
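The describe output is long, and only the Warning rows of the Events table matter here. As a small sketch, the filter below keeps just those rows from text piped into it. The function name is made up for illustration, and the sample rows are shortened versions of the ones above.

```shell
# Hypothetical filter: print only event rows whose Type column is "Warning".
warning_events() {
  awk '$1 == "Warning"'
}

# Example with two rows shaped like the Events table above.
printf '%s\n' \
  '  Normal   Scheduled    53s  default-scheduler  Successfully assigned' \
  '  Warning  FailedMount  20s  kubelet            MountVolume.SetUp failed' |
  warning_events
```

With cluster access, the same filter could be fed from kubectl describe pods/nginx-flex-kv-podid.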

Thanks!

dbilleci-lightstream commented 5 years ago

Here's the log from the node:


    $ tail -f /var/log/kv-driver.log

    Fri Feb 15 03:28:12 UTC 2019 mount
    Fri Feb 15 03:28:12 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret                                                                               
    Fri Feb 15 03:28:16 UTC 2019 umount
    Fri Feb 15 03:28:16 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, Fri Feb 15 03:28:12 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret "}
    Fri Feb 15 03:30:18 UTC 2019 ismounted | not mounted
    Fri Feb 15 03:30:18 UTC 2019 PODNAME: nginx-flex-kv-podid
    Fri Feb 15 03:30:18 UTC 2019 mount
    Fri Feb 15 03:30:18 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret                                                                               
    Fri Feb 15 03:30:38 UTC 2019 umount
    Fri Feb 15 03:30:38 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, Fri Feb 15 03:30:18 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret "}
    Fri Feb 15 03:32:40 UTC 2019 ismounted | not mounted
    Fri Feb 15 03:32:40 UTC 2019 PODNAME: nginx-flex-kv-podid
    Fri Feb 15 03:32:40 UTC 2019 mount
    Fri Feb 15 03:32:40 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret
    Fri Feb 15 03:32:41 UTC 2019 umount
    Fri Feb 15 03:32:41 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, Fri Feb 15 03:32:40 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret "}
    Fri Feb 15 03:34:43 UTC 2019 ismounted | not mounted
    Fri Feb 15 03:34:43 UTC 2019 PODNAME: nginx-flex-kv-podid
    Fri Feb 15 03:34:43 UTC 2019 mount
    Fri Feb 15 03:34:43 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret
    Fri Feb 15 03:34:43 UTC 2019 umount
    Fri Feb 15 03:34:43 UTC 2019 ERROR: {"status": "Failure", "message": "/etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume failed, Fri Feb 15 03:34:43 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=thetestkeyvault -vaultObjectNames=TEST -resourceGroup=TestKeyVault-Rg -dir=/var/lib/kubelet/pods/5042e81c-30ca-11e9-a56b-3af4beca0834/volumes/azure~kv/test -subscriptionId=11111111-1111-1111-1111-111111111111 -cloudName= -tenantId=22222222-2222-2222-2222-222222222222 -aADClientSecret= -aADClientID= -usePodIdentity=true -podNamespace=default -podName=nginx-flex-kv-podid -vaultObjectVersions= -vaultObjectTypes=secret "}
    Fri Feb 15 03:36:45 UTC 2019 ismounted | not mounted
    Fri Feb 15 03:36:45 UTC 2019 PODNAME: nginx-flex-kv-podid

ritazh commented 5 years ago

@dbilleci-lightstream Can you please provide the logs from the mic pod and the nmi pod running on the same host as your pod? aks-default-15419034-0
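A minimal sketch of how one might find those pods, assuming kubectl is pointed at the cluster: list every pod scheduled on that node, pick out the mic and nmi pods by name, then fetch their logs. The function name is hypothetical, and the pod name and namespace in the usage comment are placeholders.

```shell
# Hypothetical helper: list all pods, in any namespace, scheduled on a node.
pods_on_node() {
  kubectl get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}

# Usage (requires cluster access, so left commented out):
#   pods_on_node aks-default-15419034-0
#   kubectl logs <mic-or-nmi-pod-name> --namespace <namespace>
```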

Another thing to check: if your identity was created in a different resource group from that of the AKS nodes (the one prefixed with 'MC_'), make sure you run the following so that your AKS service principal has the Managed Identity Operator role, which allows it to assign your Azure identity 0afbc123-3-eus2-uai:

    az role assignment create --role "Managed Identity Operator" --assignee <sp id> --scope <full id of the managed identity>

For more details, refer to: https://github.com/Azure/aad-pod-identity#providing-required-permissions-for-mic
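To decide whether this step applies, it is enough to compare the resource group embedded in the identity's resource ID with the node resource group. The sketch below does that with plain text processing. Both function names are hypothetical, and the IDs in the example are the anonymized ones that appear in this thread.

```shell
# Hypothetical helper: print the resource group segment of an Azure resource ID.
# The segment name is matched case-insensitively ("resourcegroups" or "resourceGroups").
resource_group_of() {
  printf '%s\n' "$1" | awk -F/ '{
    for (i = 1; i < NF; i++)
      if (tolower($i) == "resourcegroups") { print $(i + 1); exit }
  }'
}

# Hypothetical check: succeeds when the identity lives outside the node
# resource group, which is the case where the role assignment is needed.
identity_outside_node_group() {
  identity_rg=$(resource_group_of "$1" | tr '[:upper:]' '[:lower:]')
  node_rg=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [ "$identity_rg" != "$node_rg" ]
}

identity_id='/subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai'
node_group='MC_Backend-X-Stage-Eus2-Rg_Backend-X-Stage-Eus2-Aks_eastus2'

if identity_outside_node_group "$identity_id" "$node_group"; then
  echo "identity is outside the node resource group: role assignment needed"
fi
```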

dbilleci-lightstream commented 5 years ago

Hi @ritazh, thank you for your reply. Yes, the SPN attached to the AKS cluster is not part of the MC_ group. I checked the logs on the mic pod and, as you predicted, saw an error saying that the SPN is not authorized to perform the assign action on the identity:

    I0215 16:27:42.096947       1 event.go:218] Event(v1.ObjectReference{Kind:"AzureIdentityBinding", Namespace:"default", Name:"0afbc123-3-eus2-uai-binding", UID:"317e14d2-30bc-11e9-a56b-3af4beca0834", APIVersion:"aadpodidentity.k8s.io/v1", ResourceVersion:"1160", FieldPath:""}): type: 'Warning' reason: 'binding apply error' Applying binding 0afbc123-3-eus2-uai-binding node aks-default-15419034-0 for pod nginx-flex-kv-podid-default-0afbc123-3-eus2-uai resulted in error compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=403 -- Original Error: Code="LinkedAuthorizationFailed" Message="The client 'abcdabcd-aaaa-bbbb-cccc-ddddeeeeffff' with object id 'abcdabcd-aaaa-bbbb-cccc-ddddeeeeffff' has permission to perform action 'Microsoft.Compute/virtualMachines/write' on scope '/subscriptions/11111111-1111-1111-1111-111111111111/resourceGroups/MC_Backend-X-Stage-Eus2-Rg_Backend-X-Stage-Eus2-Aks_eastus2/providers/Microsoft.Compute/virtualMachines/aks-default-15419034-0'; however, it does not have permission to perform action 'Microsoft.ManagedIdentity/userAssignedIdentities/assign/action' on the linked scope(s) '/subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai'."
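The two facts needed for the fix are buried in that log line: the action the service principal lacks, and the scope it lacks it on. As a rough sketch, these hypothetical sed filters pull both out of a LinkedAuthorizationFailed message read from standard input. The patterns are based only on the wording of the message above and are not guaranteed to match other Azure error formats.

```shell
# Hypothetical filter: print the action the client is missing.
missing_action() {
  sed -n "s/.*does not have permission to perform action '\([^']*\)'.*/\1/p"
}

# Hypothetical filter: print the linked scope the action was denied on.
linked_scope() {
  sed -n "s/.*on the linked scope(s) '\([^']*\)'.*/\1/p"
}

# Shortened sample with the same wording as the log line above.
msg="it does not have permission to perform action 'Microsoft.ManagedIdentity/userAssignedIdentities/assign/action' on the linked scope(s) '/subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai'."

printf '%s\n' "$msg" | missing_action
printf '%s\n' "$msg" | linked_scope
```

The extracted scope is the value that belongs after --scope in the role assignment command.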

The reason I missed this step is that the README says this about installing aad-pod-identity:

"Deploy pod identity components to your cluster Follow *these steps* to install pod identity." 

However, not all of the steps are needed, and when you are new to the project it is difficult to tell where to stop. I would suggest we update the documentation there. I can help with a PR for this!

Next, I added the "Managed Identity Operator" permission to the AKS SPN, but the error message did not change at all.

    $ az role assignment create --role "Managed Identity Operator" --assignee abcdabcd-aaaa-bbbb-cccc-ddddeeeeffff --scope /subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai
    {
      "canDelegate": null,
      "id": "/subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai/providers/Microsoft.Authorization/roleAssignments/7eca8214-bca1-4dea-99e3-f750fc04d21f",
      "name": "7eca8214-bca1-4dea-99e3-f750fc04d21f",
      "principalId": "abcdabcd-aaaa-bbbb-cccc-ddddeeeeffff",
      "resourceGroup": "ManagedServiceIdentity-Stage-Eus2-Rg",
      "roleDefinitionId": "/subscriptions/11111111-1111-1111-1111-111111111111/providers/Microsoft.Authorization/roleDefinitions/f1a07417-d97a-45cb-824c-7a7467783830",
      "scope": "/subscriptions/11111111-1111-1111-1111-111111111111/resourcegroups/ManagedServiceIdentity-Stage-Eus2-Rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/0afbc123-3-eus2-uai",
      "type": "Microsoft.Authorization/roleAssignments"
    }

I'm going to destroy everything and start from scratch again. I want to make sure this process works from the ground up each time, as we will be deploying it many times.

Do you know why I would still be getting the error message even after adding the permission? I would have thought the change would be immediate. I'll post the results of my next attempt as well.

Thank you!

ritazh commented 5 years ago

@dbilleci-lightstream

"I would suggest we update the documentation there. I can help with a PR for this!"

Yes, this step should definitely be mentioned in the README. PRs are definitely welcome! 👍 If you run into more issues with pod identity or are looking for more detailed steps, you can also check out the pod identity repo: https://github.com/Azure/aad-pod-identity.

"why I would still be getting the error message even after adding the permission"

I have seen the permission assignment take a few minutes to kick in. You might also want to redeploy all the pod identity components just to be safe. One way to verify that pod identity is working is to check that the identity has been assigned to your node after you deploy the pod. Here is what it looks like in the Azure portal:

[Screenshot, 2019-02-15: Azure portal showing the identity assigned to the node]
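Because the role assignment can take a few minutes to propagate, a bounded polling loop is one way to avoid guessing how long to wait. This is a generic sketch rather than anything from the project. wait_until and pod_is_running are hypothetical names, and the attempt count and delay in the usage comment are arbitrary.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between tries; succeed as soon as the command succeeds.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  try=1
  while [ "$try" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if [ "$try" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    try=$((try + 1))
  done
  return 1
}

# Hypothetical check: succeeds once the pod reports the Running phase.
pod_is_running() {
  [ "$(kubectl get pod "$1" -o jsonpath='{.status.phase}')" = "Running" ]
}

# Usage (requires cluster access, so left commented out):
#   wait_until 60 30 pod_is_running nginx-flex-kv-podid
```

Sixty attempts at 30-second intervals gives the pod up to about half an hour to mount the volume before the loop gives up.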

dbilleci-lightstream commented 5 years ago

It worked on the new cluster I spun up! On the old cluster I waited 5+ minutes after applying the permission, but it still hadn't kicked in. I should have waited 30 minutes to make sure, sorry about that. A new cluster did work, though.

I see the identity mapped to the VM node in the agentpool, as shown in your image.

I might be able to run that same test again in a little while to get the answer to those questions for you.

Thank you so much for all of your help, this is going to work great for us!

ritazh commented 5 years ago

@dbilleci-lightstream Glad it's working for you. Closing this issue. But feel free to reopen or create new ones if you have other questions.

dbilleci-lightstream commented 5 years ago

Ok, thank you! Here's the PR for the docs mini-refresh. I did it from my personal account rather than my work account. https://github.com/Azure/kubernetes-keyvault-flexvol/pull/82

Thanks!