hashicorp / vault-k8s

First-class support for Vault and Kubernetes.
Mozilla Public License 2.0

Issue with Vault HA setup #31

Open VinothChinnadurai opened 4 years ago

VinothChinnadurai commented 4 years ago

I did a setup of Vault + Consul in HA mode on AWS using the following repos:

https://github.com/hashicorp/vault-helm
https://github.com/hashicorp/consul-helm

Steps followed:

  1. Installed Consul: helm install consul ./consul-helm/
  2. Installed Vault: helm install vault -f values-eks.yaml ./vault-helm/, where values-eks.yaml contains:

    cat >~/vault-eks/values-eks.yaml <<EOL
    server:
      ha:
        enabled: true
        config: |
          ui = true

          listener "tcp" {
            tls_disable = 1
            address = "[::]:8200"
            cluster_address = "[::]:8201"
          }

          seal "awskms" {
            region     = "us-east-1"
            kms_key_id = "xxx"
          }

          storage "consul" {
            path = "vault"
            address = "HOST_IP:8500"
          }
    EOL

The above steps booted 3 Vault pods and 3 Consul client pods, along with 3 Consul servers, spread evenly across 3 nodes.

  1. Initialised Vault: kubectl exec -it vault-0 -- vault operator init

  2. Unsealed it: kubectl exec -it vault-0 -- vault operator unseal <unseal key from the previous step>
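For context, `vault operator init` by default generates 5 unseal key shares with a threshold of 3, and every Vault pod has to be unsealed individually. A sketch of the full sequence (pod names assume the default Helm release naming):

```shell
# Initialise once on any pod; this prints 5 unseal keys and the initial root token
kubectl exec -it vault-0 -- vault operator init

# Each pod must be unsealed separately, with 3 different key shares each
for pod in vault-0 vault-1 vault-2; do
  for i in 1 2 3; do
    # prompts for one key share per invocation
    kubectl exec -it "$pod" -- vault operator unseal
  done
done
```

Note that with the awskms seal stanza in the values file above, Vault should auto-unseal instead, and `operator init` prints recovery keys rather than unseal keys.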

Then I modified the Vault service type to LoadBalancer and got an ELB URL.

  1. Installed the vault binary on another machine and set the following:

    export VAULT_ADDR='http://elbdomain:8200'
    export VAULT_SA_NAME=$(kubectl get sa vault -o jsonpath="{.secrets[*]['name']}")
    export VAULT_TOKEN="my initial token"
    export SA_JWT_TOKEN=$(kubectl get secret $VAULT_SA_NAME -o jsonpath="{.data.token}" | base64 --decode; echo)
    export SA_CA_CRT=$(kubectl get secret $VAULT_SA_NAME -o jsonpath="{.data['ca\.crt']}" | base64 --decode; echo)
  2. Enabled the Kubernetes auth method:

    vault auth enable kubernetes
    vault write auth/kubernetes/config token_reviewer_jwt="$SA_JWT_TOKEN" kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443" kubernetes_ca_cert="$SA_CA_CRT"
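As a sanity check (a sketch, assuming the same VAULT_ADDR and VAULT_TOKEN exported above), the auth method and its configuration can be read back:

```shell
# List enabled auth methods; kubernetes/ should appear
vault auth list

# Show the stored Kubernetes host / CA / reviewer configuration
vault read auth/kubernetes/config
```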
  3. Created a policy and role:

    
    cat <<EOF > ./app-policy.hcl
    path "secret*" {
      capabilities = ["read"]
    }
    EOF

    vault policy write app ./app-policy.hcl

    vault write auth/kubernetes/role/myapp \
      bound_service_account_names=app \
      bound_service_account_namespaces=demo \
      policies=app \
      ttl=24h

    vault secrets enable -path=secret/ kv

    vault kv put secret/helloworld username=foobaruser password=foobarbazpass
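A quick way to verify that step (a sketch, under the same environment as above):

```shell
# Confirm the policy, role and secret were written
vault policy read app
vault read auth/kubernetes/role/myapp
vault kv get secret/helloworld
```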


In my app deployment I added these annotations:

    spec:
      template:
        metadata:
          annotations:
            vault.hashicorp.com/agent-inject: "true"
            vault.hashicorp.com/agent-inject-status: "update"
            vault.hashicorp.com/agent-inject-secret-helloworld: "secret/helloworld"
            vault.hashicorp.com/role: "myapp"

I can see that the secrets are mounted properly inside the pod.

In order to check HA, I drained a particular node:

    kubectl drain ip-xxx.ec2.internal --ignore-daemonsets --delete-local-data

After that, my application pod went to Init status:

    NAME                   READY   STATUS     RESTARTS   AGE
    app-864b96c9b6-b7lxw   0/2     Init:0/1   0          41h



I tried to bring the node back using the uncordon command, but my application pod still shows the same status.
Please let me know what the problem might be here, and how I can achieve maximum HA, e.g. if an entire zone goes down and causes this type of problem.
jasonodonnell commented 4 years ago

Hi @VinothChinnadurai, the Vault Agent blocks until it successfully authenticates and currently has no way of timing out. We are looking at adding a configurable timeout to Vault Agent (in the Vault project) to make this possible.

VinothChinnadurai commented 4 years ago

@jasonodonnell You mean the Vault agent in the init container is trying to authenticate, right? We already initialized the Vault server once with the initial token. Do we need to redo that again?

jasonodonnell commented 4 years ago

@VinothChinnadurai Did you unseal Vault when you brought it back up?

The init container that was injected into your pod will block until it can successfully connect with Vault.

VinothChinnadurai commented 4 years ago

@jasonodonnell No, I felt it was not needed since we already did that and the cluster was fully functional. I was just trying to simulate a server going down and coming back up, expecting it to work automatically without needing initialization again. Please clarify.

jasonodonnell commented 4 years ago

@VinothChinnadurai Vault must be unsealed if it goes down (this is different from initialization, which only happens once). If you aren't using Vault auto-unseal, then Vault is not going to be unsealed.
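A sealed pod can be spotted and fixed from the CLI (a sketch, assuming the default pod names):

```shell
# "Sealed  true" in the output means this pod must be unsealed again
kubectl exec -it vault-0 -- vault status

# Repeat with enough different key shares to meet the threshold (3 by default)
kubectl exec -it vault-0 -- vault operator unseal
```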

AlyRagab commented 4 years ago

What is the output of the vault status command? Also check the logs of the Vault pods.

itisevgeny commented 4 years ago

@VinothChinnadurai Without auto-unseal, you can leverage k8s secrets to store the unseal keys and use the postStart handler of the Vault container's lifecycle to execute the unseal operation. But if your k8s secrets are not properly encrypted, that would be a security hole.

webmutation commented 4 years ago

@itisevgeny Do you have more information on using k8s secrets to auto-unseal? I am looking for a simple way to unseal when the pod gets rescheduled, but have not found one that is properly contained.

itisevgeny commented 4 years ago

@webmutation You can store the unseal key(s) in k8s secret(s), then create a patch:

spec:
  template:
    spec:
      volumes:
      - name: auto-unseal
        secret:
          secretName: unseal-key
      containers:
      - name: vault
        volumeMounts:
        - name: auto-unseal
          readOnly: true
          mountPath: "/vault/config/unseal"
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "KEY=$(cat /vault/config/unseal/key); vault operator unseal $KEY;"]

And apply that patch to the vault statefulset.
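For reference, a strategic-merge patch like that can be applied with kubectl (a sketch; unseal-patch.yaml is an assumed filename for the snippet above):

```shell
# Patch the statefulset in place; pods restart with the postStart hook attached
kubectl patch statefulset vault --patch "$(cat unseal-patch.yaml)"
```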

webmutation commented 4 years ago

Hi @itisevgeny, thanks for the tip. However, do you think this is a safe approach? Access to the pod could probably be limited... I will definitely consider it.

I am currently trying to create a day-2 Vault service, where I would use the transit secrets engine to unseal the other Vaults. However, the main keeper of vaults (Vault1) would also need an unseal mechanism, and I would like to not rely on cloud provider KMS systems.
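For reference, the transit-based auto-unseal described here is configured on the downstream Vaults with a seal "transit" stanza (a sketch; the address, token, and key name are placeholders):

```hcl
seal "transit" {
  address    = "https://vault1.example.com:8200"
  token      = "s.xxxxxxxx"
  key_name   = "autounseal"
  mount_path = "transit/"
}
```

Vault1 itself still needs its own unseal mechanism, which is exactly the bootstrap problem mentioned above.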

[image: vault-autounseal diagram]

itisevgeny commented 4 years ago

@webmutation I would say the approach is not completely safe, because k8s secrets are plain text (only base64-encoded) by default. But there are options to achieve a reasonable level of safety, e.g. moving Vault1 to a separate cluster, or configuring encryption at rest for k8s secrets, ...
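For the encryption-at-rest option, the kube-apiserver accepts an EncryptionConfiguration file (a sketch; the key material is a placeholder you would generate, e.g. with head -c 32 /dev/urandom | base64):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>
      - identity: {}   # fallback so pre-existing plain-text secrets stay readable
```

It is passed to the API server via --encryption-provider-config; on managed clusters such as EKS, this is usually handled through the provider's KMS integration instead.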