Azure / kubernetes-keyvault-flexvol

Azure keyvault integration with Kubernetes via a Flex Volume
MIT License

Container stuck in "Container Creating" trying to mount flex-volume #141

Closed andloh closed 4 years ago

andloh commented 4 years ago

Describe the bug: Container stuck in "Container Creating" while trying to mount a flex-volume. kv-driver.log shows ERROR: {"status": "Failure", "message": "validation failed, tenantid is empty"} and sometimes {"status": "Success", "capabilities": {"attach": false}}

testpod: (using this as a template https://github.com/Azure/kubernetes-keyvault-flexvol/blob/master/deployment/nginx-flex-kv-podidentity.yaml)

apiVersion: v1
kind: Pod
metadata:
  name: nginx-flex-kv
  namespace: ns1
  labels:
    aadpodidbinding: secrets
spec:
  containers:
  - name: nginx-flex-kv
    image: nginx
    volumeMounts:
    - name: test
      mountPath: "/kvmnt"
      readOnly: true
  volumes:
  - name: test
    flexVolume:
      driver: "azure/kv"
      options:
        usepodidentity: "true"
        keyvaultname: "azure-key-vault-1"
        keyvaultobjectnames: "test"
        keyvaultobjecttypes: secret
        resourcegroup: "azure-key-vault-rg"
        subscriptionid: "subscriptionid"
        tenantid: "tenantid"

Steps To Reproduce Install flexvol and AAD Pod Identity according to the documentation

Expected behavior: the flex-volume gets mounted

Key Vault FlexVolume version mcr.microsoft.com/k8s/flexvolume/keyvault-flexvolume:v0.0.15

Access mode (service principal or pod identity): pod identity

Kubernetes version v1.14.6

Additional context Using Docker Enterprise in azure using: https://github.com/Azure/aad-pod-identity/blob/master/docs/readmes/README.namespaced.md

The user-assigned identity does get assigned to the VM that the pod is trying to run on. Nothing interesting in the MIC or NMI logs. Same with NMI debug enabled.

The pod trying to mount the secret, keyvault-flexvolume, and NMI & MIC are each running in their own namespaces

aramase commented 4 years ago

@andloh From the logs you posted, looks like the tenantid is not set in the test pod you are trying? Can you confirm if all the required parameters are set as expected?

Once all the required parameters are set, the error message will no longer appear in the kv-driver.log.

For pod-identity, can you check if an azureassignedidentity has been created for the pod? kubectl get azureassignedidentity -n <namespace>. If you don't see an assigned identity yet, you can look at the logs for MIC - kubectl logs <mic pod>. MIC runs as a deployment with 2 replicas and at any given time one MIC pod is elected leader and performing operations. So to view the logs from both MIC pods, you can run kubectl logs -l component=mic (this should give you logs from both pods).

andloh commented 4 years ago

Hi @aramase Thanks for the quick reply. I can confirm that the tenantid parameter is set and has the correct value. The pod requesting the secret also gets an azureassignedidentity

k get azureassignedidentity

NAME                                              AGE
nginx-flex-kv-customname-test-aad-pod-identity   31m
Updating user assigned MSIs on $node
Updating assigned identity ns1/nginx-flex-kv-customname-test-aad-pod-identity status to Assigned
Work done: true. Found 1 pods, 1 ids, 1 bindings

According to the code at kubernetes-keyvault-flexvol/deployment/flexvol-installer/kv, it looks like tenantid is the first parameter checked for being empty.

    if [ -z "${TENANT_ID}" ]; then
        err "{\"status\": \"Failure\", \"message\": \"validation failed, tenantid is empty\"}"
        exit 1

So it's possible that the flexvol can't read the values from the pod at all, because it triggers on the first parameter. I feel I have tried everything... so to me this seems like some kind of bug :/
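
The reasoning above can be illustrated with a minimal sketch. It assumes a driver that validates its required options one after another and stops at the first empty one. The function name `first_empty_option` and every variable name other than TENANT_ID are hypothetical and not taken from the kv script. If none of the options reach the script, the first check is the one that reports a failure, whatever the pod spec contains.

```shell
# Hypothetical sketch: report the first required option that is empty.
# Only TENANT_ID appears in the kv script excerpt; other names are assumptions.
first_empty_option() {
    # Arguments are name=value pairs, checked in the order given.
    for pair in "$@"; do
        name=${pair%%=*}
        value=${pair#*=}
        if [ -z "$value" ]; then
            echo "$name"
            return 1
        fi
    done
    echo "none"
    return 0
}

# All values set: no option is reported as empty.
first_empty_option "TENANT_ID=tenantid" "KEYVAULT_NAME=azure-key-vault-1"

# Nothing parsed at all: the first check (tenantid) is the one that fails.
first_empty_option "TENANT_ID=" "KEYVAULT_NAME=" || true
```

Under this assumption, a "tenantid is empty" message would not by itself show that tenantid is the option at fault.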

Here is my AzureIdentity config, if it helps :) :

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: test-aad-pod-identity
  namespace: ns1
  annotations:
    aadpodidentity.k8s.io/Behavior: namespaced
spec:
  type: 0
  ResourceID: /subscriptions/subid/resourcegroups/azure-key-vault-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/test-aad-pod-identity
  ClientID: test-aad-pod-identity's clientid

aramase commented 4 years ago

Updating assigned identity ns1/nginx-flex-kv-customname-test-aad-pod-identity status to Assigned

Looks like aad-pod-identity successfully assigned the identity to the underlying node. So there are no errors with respect to pod-identity.

So it's possible that the flexvol can't read the values from the pod at all, because it triggers on the first parameter.

I can't seem to recreate this. Was the tenantid param missing during the initial deploy? The exact command that's invoked to mount is printed in the kv-driver.log. This should show which params are missing.

andloh commented 4 years ago

Thanks for the quick response again @aramase

No, it was not missing at the initial deploy, and I have redeployed the whole stack twice. That includes deleting all the associated namespaces for aad-pod-identity and keyvault-flexvol.

Am I using the correct volume settings here? I had to change the hostPath value from /etc/kubernetes/volumeplugins to /usr/libexec/kubernetes/kubelet-plugins/volume/exec

With /etc/kubernetes/volumeplugins the log on the host wasn't even there.

env:
          # if you have used flex before on your cluster, use same directory
          # set TARGET_DIR env var and mount the same directory to the container
        - name: TARGET_DIR
          value: "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"
        volumeMounts:
        - mountPath: "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"
          name: volplugins
      volumes:
      - hostPath:
          # Modify this directory if your nodes are using a different one
          # default is "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"
          # below is Azure default
          path: "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"
        name: volplugins

I have tried with /etc/kubernetes/volumeplugins as the value for mountPath and TARGET_DIR as well. Still the same error:


ERROR: {"status": "Failure", "message": "validation failed, tenantid is empty"}

aramase commented 4 years ago

@andloh can you ensure there are no extra spaces after each value? I was just able to recreate this by adding a whitespace after subscription id value.

        subscriptionid: "subid "              
        tenantid: "tenantid"                  

the whitespace after subid ^^ causes the issue.
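
One way to look for this kind of problem is to scan the manifest for quoted values that begin or end with whitespace. The following is a minimal sketch and not part of flexvol. The function name `find_padded_values` is hypothetical, and it only inspects double-quoted values that sit on a single line.

```shell
# Hypothetical helper: print manifest lines whose double-quoted value
# starts or ends with a space or tab, prefixed with the line number.
find_padded_values() {
    grep -nE ':[[:space:]]*"([[:space:]][^"]*|[^"]*[[:space:]])"' "$1"
}

# Example manifest fragment with a trailing space after the subscription id.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
subscriptionid: "subid "
tenantid: "tenantid"
EOF

# Reports only the subscriptionid line.
find_padded_values "$manifest"
rm -f "$manifest"
```

An empty value written as "" is not flagged, while a value written as " " is.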

andloh commented 4 years ago

Hmmmm @aramase I have actually checked all the parameters for whitespace. I will check again tomorrow, just to make sure.

Can you confirm that I am using the correct volume mounts in the comment above?

Thanks again for the quick replies :)

aramase commented 4 years ago

@andloh If you are using an azure (AKS or aks-engine) cluster, then the path needs to be /etc/kubernetes/volumeplugins/ as kubelet is configured to use that as the volume-plugin-dir.
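
On a cluster that is not AKS or aks-engine, the directory could be read from the kubelet's own command line rather than guessed. This is a minimal sketch, assuming the kubelet was started with `--volume-plugin-dir=<path>` on its command line. If the flag is absent or written with a space instead of `=`, the helper prints nothing. The function name `plugin_dir_from_cmdline` and the sample command line are hypothetical.

```shell
# Hypothetical helper: extract the value of --volume-plugin-dir from a
# kubelet command line passed as a single string.
plugin_dir_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^--volume-plugin-dir=//p'
}

# Example with a made-up command line. On a Linux node the real one could
# be taken from something like: ps -o args= -C kubelet
plugin_dir_from_cmdline "kubelet --volume-plugin-dir=/etc/kubernetes/volumeplugins --v=2"
```

Whatever directory this reports is the one that TARGET_DIR and the hostPath in the installer manifest would need to match.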

andloh commented 4 years ago

Hi @aramase Still no success. I tried to redeploy the whole stack multiple times. I have also recreated the test file from https://github.com/Azure/kubernetes-keyvault-flexvol/blob/master/deployment/nginx-flex-kv-podidentity.yaml

Did you test this on k8s v1.14.6?

Any other suggestions? Is it possible to simulate the mount in any way? Run the command or code manually ?

aramase commented 4 years ago

@andloh yes, I've tried on a 1.14.6 cluster and am unable to repro this. This is the yaml I'm using -

apiVersion: v1
kind: Pod
metadata:
  labels:
    app: nginx-flex-kv-podid
    aadpodidbinding: demo
  name: nginx-flex-kv-podid
  namespace: default
spec:
  containers:
    - name: nginx-flex-kv-podid
      image: nginx
      volumeMounts:
        - name: test
          mountPath: /kvmnt
          readOnly: true
  volumes:
    - name: test
      flexVolume:
        driver: "azure/kv"
        options:
          usepodidentity: "true" # [OPTIONAL] if not provided, will default to "false"
          keyvaultname: "cluster02" # the name of the KeyVault
          keyvaultobjectnames: "secret1" # list of KeyVault object names (semi-colon separated)
          keyvaultobjecttypes: secret # list of KeyVault object types: secret, key or cert (semi-colon separated)
          keyvaultobjectversions: "" # [OPTIONAL] list of KeyVault object versions (semi-colon separated), will get latest if empty
          resourcegroup: "cluster02" # the resource group of the KeyVault
          subscriptionid: "00000000-0000-0000-0000-000000000000" # the subscription ID of the KeyVault
          tenantid: "00000000-0000-0000-0000-000000000000"

andloh commented 4 years ago

Thanks for the response @aramase

I get this message when I describe the pod that is stuck in "Container Creating": MountVolume.SetUp failed for volume "test" : invalid character '/' looking for beginning of value. Any ideas?

Using the same test pod:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-flex-kv
  namespace: ns1
  labels:
    aadpodidbinding: secrets
spec:
  containers:
  - name: nginx-flex-kv
    image: nginx
    volumeMounts:
    - name: test
      mountPath: "/kvmnt"
      readOnly: true
  volumes:
  - name: test
    flexVolume:
      driver: "azure/kv"
      options:
        usepodidentity: "true"
        keyvaultname: "azure-key-vault-1"
        keyvaultobjectnames: "test"
        keyvaultobjecttypes: secret
        keyvaultobjectversions: ""
        resourcegroup: "azure-key-vault-rg"
        subscriptionid: "subscriptionid"
        tenantid: "tenantid"

Same error from kv.log: ERROR: {"status": "Failure", "message": "validation failed, tenantid is empty"}
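
The kubelet message quoted above, invalid character '/' looking for beginning of value, has the form of a JSON decoding error. That would be consistent with the driver printing something other than a JSON object, for example a line that starts with a file path. This reading is an assumption and is not confirmed anywhere in this thread. A minimal sketch of how captured driver output could be checked follows; `starts_like_json` is a hypothetical helper and the sample strings are made up.

```shell
# Hypothetical helper: succeed only if the first non-blank character of the
# given text is '{', as in the driver's {"status": ...} responses.
starts_like_json() {
    first=$(printf '%s' "$1" | tr -d ' \t\n' | cut -c1)
    [ "$first" = "{" ]
}

# A well-formed response passes the check.
starts_like_json '{"status": "Success"}' && echo "looks like JSON"

# Output that begins with a path does not.
starts_like_json '/some/path: something went wrong' || echo "not JSON"
```

The check only looks at the first character, so it can rule output out but cannot confirm that it is valid JSON.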

aramase commented 4 years ago

@andloh can you please remove the quotes around the mount path?

    volumeMounts:
    - name: test
      mountPath: "/kvmnt" <--- try just using /kvmnt
      readOnly: true

andloh commented 4 years ago

@aramase Same error when removing the quotes :/ MountVolume.SetUp failed for volume "test" : invalid character '/' looking for beginning of value

    - name: test
      mountPath: /kvmnt
      readOnly: true

aramase commented 4 years ago

@andloh I tried the exact same yaml you posted with my values and can't seem to reproduce it -

➜ kubectl exec -ti nginx-flex-kv -- cat /kvmnt/secret1
test-value%

The only ways I can reproduce this are:

  1. If a required value is not provided
  2. If there is additional whitespace in any of the values
  3. If any of the keys in options are misspelled
  4. If keyvaultobjectversions is meant to be empty but is set to " " (a single space) instead of "".

Also, if you are using 0.0.15 version, then rg and subscription id are OPTIONAL - https://github.com/Azure/kubernetes-keyvault-flexvol/blob/master/deployment/nginx-flex-kv-podidentity.yaml#L26-L27
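
Case 3 in the list above (a misspelled key) can be checked mechanically. This is a minimal sketch that compares the keys of an options block against the key names that appear in this thread. The driver may accept other keys as well, so an unknown key is only a hint. The helper name `unknown_option_keys` is hypothetical, and it expects only the "key: value" lines under options on stdin.

```shell
# Option keys that appear in this thread; the driver may accept others.
KNOWN_KEYS="usepodidentity keyvaultname keyvaultobjectnames keyvaultobjecttypes keyvaultobjectversions resourcegroup subscriptionid tenantid"

# Hypothetical helper: read "key: value" lines on stdin and print every key
# that is not listed in KNOWN_KEYS.
unknown_option_keys() {
    sed -n 's/^[[:space:]]*\([A-Za-z][A-Za-z]*\):.*/\1/p' | while read -r key; do
        case " $KNOWN_KEYS " in
            *" $key "*) ;;
            *) echo "$key" ;;
        esac
    done
}

# Example: "tenentid" is a misspelling of "tenantid" and gets reported.
printf '%s\n' 'keyvaultname: "azure-key-vault-1"' 'tenentid: "tenantid"' | unknown_option_keys
```

The comparison is exact and case-sensitive, which is an assumption about how the driver reads its keys.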

aramase commented 4 years ago

@andloh Has the issue been resolved?

andloh commented 4 years ago

Hi, I don't know. I haven't tried since we last talked. I can try again some time soon and will let you know :)

andloh commented 4 years ago

Hi, still no luck with this... Maybe I will try it again some other time, when I am on AKS. In this case it's Docker Enterprise on Azure.

Thanks for the help anyway :)