hashicorp / vault-k8s

First-class support for Vault and Kubernetes.
Mozilla Public License 2.0

Multi Cluster K8S environment: App and Vault are not on same cluster; Demo app is not fetching secrets. Code 500 #62

Open vispatster opened 4 years ago

vispatster commented 4 years ago

Cluster A = Consul + Vault + Vault Injector
Cluster B = Vault Injector communicating with the Vault installed in Cluster A

I have Consul + Vault installed on one Kubernetes cluster. On the other cluster, the vault-k8s injector (https://github.com/hashicorp/vault-k8s.git) has been installed successfully.

The init pod returns the following error (Error making API request. Code 500). The Vault address has been changed to http://vlt.consulvault.172.31.101.63.xip.io. Both clusters are on the same network, and a curl command returns a response. I think I may have to pass the root token (or register a secret with Vault) in order to authenticate.

To authenticate, I have applied the following configuration, but I don't know how to point the injector (Cluster B) from http://vlt.consulvault.172.31.101.63.xip.io/v1/auth/kubernetes/login to http://vlt.consulvault.172.31.101.63.xip.io/v1/auth/kube-cluster-A:

vault write auth/kube-cluster-A/config \
    token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
    kubernetes_host=https://${KUBERNETES_PORT_443_TCP_ADDR}:443 \
    kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Is there any way to assign this auth path to the Vault injector on Cluster B?

Thank you
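
For later readers: newer vault-k8s releases (after the 0.1.0 used below) let each pod override the auth path with an annotation instead of the default /v1/auth/kubernetes/login. A minimal sketch; the role name myapp is an assumption:

spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "myapp"                     # hypothetical Vault role
        vault.hashicorp.com/auth-path: "auth/kube-cluster-A"  # non-default auth mount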

curl \
    -H "X-Vault-Token: token" \
    -X GET \
    http://vlt.consulvault.172.31.101.63.xip.io/v1/secret/helloworld
{"request_id":"***","lease_id":"","renewable":false,"lease_duration":2764800,"data":{"password":"foobarbazpass","username":"foobaruser"},"wrap_info":null,"warnings":null,"auth":null}

==> Vault server started! Log data will stream in below:
2020-01-28T18:27:10.042Z [INFO] sink.file: creating file sink
2020-01-28T18:27:10.042Z [INFO] sink.file: file sink configured: path=/home/vault/.token mode=-rw-r-----
==> Vault agent configuration:
Cgo: disabled
Log Level: info
Version: Vault v1.3.1
2020-01-28T18:27:10.042Z [INFO] auth.handler: starting auth handler
2020-01-28T18:27:10.042Z [INFO] auth.handler: authenticating
2020-01-28T18:27:10.042Z [INFO] template.server: starting template server
2020/01/28 18:27:10.042917 [INFO] (runner) creating new runner (dry: false, once: false)
2020/01/28 18:27:10.043280 [INFO] (runner) creating watcher
2020-01-28T18:27:10.043Z [INFO] sink.server: starting sink server
2020-01-28T18:27:10.275Z [ERROR] auth.handler: error authenticating: error="Error making API request.
URL: PUT http://vlt.consulvault.172.31.101.63.xip.io/v1/auth/kubernetes/login
Code: 500. Errors:

* lookup failed: [invalid bearer token, square/go-jose: error in cryptographic primitive]" backoff=2.382848811
2020-01-28T20:21:34.792Z [INFO] auth.handler: authenticating
2020-01-28T20:21:34.806Z [ERROR] auth.handler: error authenticating: error="Error making API request.
URL: PUT http://vlt.consulvault.172.31.101.63.xip.io/v1/auth/kubernetes/login
Code: 500. Errors:
* lookup failed: [invalid bearer token, square/go-jose: error in cryptographic primitive]" backoff=1.805414534
2020-01-28T20:21:36.612Z [INFO] auth.handler: authenticating
2020-01-28T20:21:36.758Z [ERROR] auth.handler: error authenticating: error="Put http://vlt.consulvault.172.31.101.63.xip.io/v1/auth/kubernetes/login: dial tcp: lookup vlt.consulvault.172.31.101.63.xip.io on 10.43.0.10:53: no such host" backoff=1.998397118
2020-01-28T20:21:38.757Z [INFO] auth.handler: authenticating
2020-01-28T20:21:38.761Z [ERROR] auth.handler: error authenticating: error="Put http://vlt.consulvault.172.31.101.63.xip.io/v1/auth/kubernetes/login: dial tcp: lookup vlt.consulvault.172.31.101.63.xip.io on 10.43.0.10:53: no such host" backoff=2.780268301
2020-01-28T20:21:41.541Z [INFO] auth.handler: authenticating

Vault Injector YAML

---
# Source: vault/templates/injector-deployment.yaml
# Deployment for the injector
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vault-agent-injector
  namespace: consulvault
  labels:
    app.kubernetes.io/name: vault-agent-injector
    app.kubernetes.io/instance: vault
    app.kubernetes.io/managed-by: Helm
    component: webhook
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: vault-agent-injector
      app.kubernetes.io/instance: vault
      component: webhook
  template:
    metadata:
      labels:
        app.kubernetes.io/name: vault-agent-injector
        app.kubernetes.io/instance: vault
        component: webhook
    spec:
      serviceAccountName: "vault-agent-injector"
      securityContext:
        runAsNonRoot: true
        runAsGroup: 1000
        runAsUser: 100
      containers:
        - name: sidecar-injector
          image: "hashicorp/vault-k8s:0.1.0"
          imagePullPolicy: "IfNotPresent"
          env:
            - name: AGENT_INJECT_LISTEN
              value: ":8080"
            - name: AGENT_INJECT_LOG_LEVEL
              value: info
            - name: AGENT_INJECT_VAULT_ADDR
              value: http://vlt.consulvault.172.31.101.63.xip.io
            - name: AGENT_INJECT_VAULT_IMAGE
              value: "vault:1.3.1"
            - name: AGENT_INJECT_TLS_AUTO
              value: vault-agent-injector-cfg
            - name: AGENT_INJECT_TLS_AUTO_HOSTS
              value: vault-agent-injector-svc,vault-agent-injector-svc.consulvault,vault-agent-injector-svc.consulvault.svc
          args:
            - agent-inject
            - 2>&1
          livenessProbe:
            httpGet:
              path: /health/ready
              port: 8080
              scheme: HTTPS
            failureThreshold: 2
            initialDelaySeconds: 1
            periodSeconds: 2
            successThreshold: 1
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
              scheme: HTTPS
            failureThreshold: 2
            initialDelaySeconds: 2
            periodSeconds: 2
            successThreshold: 1
            timeoutSeconds: 5
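
Worth noting: later vault-k8s releases (not the 0.1.0 image pinned above) also accept an injector-wide default auth path via an environment variable, which would avoid per-pod annotations; a sketch:

          env:
            - name: AGENT_INJECT_VAULT_AUTH_PATH
              value: auth/kube-cluster-A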
---
stevegore commented 4 years ago

I'm trying to do pretty much the same thing as you. Your issue may be DNS-related?

lookup vlt.consulvault.172.31.101.63.xip.io on 10.43.0.10:53: no such host

vispatster commented 4 years ago

I'm trying to do pretty much the same thing as you. Your issue may be DNS-related?

lookup vlt.consulvault.172.31.101.63.xip.io on 10.43.0.10:53: no such host

I saw that error in the log and am trying to find a way to troubleshoot it. From the node, I am able to communicate with both clusters. If Cluster B cannot send the request out, that means either the CNI plugin or some kind of firewall is restricting it. I will update here if I find something useful.
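
A quick way to check DNS from inside Cluster B is a throwaway pod (the namespace and busybox tag here are arbitrary):

kubectl -n default run dns-test --rm -it --image=busybox:1.28 --restart=Never -- \
    nslookup vlt.consulvault.172.31.101.63.xip.io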

Thank you for your observation.

stevegore commented 4 years ago

I have this working. A few things:

  • I have two service accounts on the application cluster (Cluster B):

    • vault-auth required for kubernetes-auth. It's in the application namespace (e.g. test)
    • vault-injector to manage the webhooks, admissions controllers etc. It's in the vault namespace (e.g. vault)
  • I have kubernetes auth configured in the Vault cluster (Cluster A) at the default path (/auth/kubernetes). As per this comment using a different auth path requires setting a ConfigMap - looking at doing that next
  • Kubernetes auth is configured to use the vault-auth service account from Cluster B:

VAULT_SA_NAME=vault-auth

# Set VAULT_SA_SECRET to the name of the secret for the service account you created earlier
export VAULT_SA_SECRET=$(kubectl -n test get sa $VAULT_SA_NAME -o jsonpath="{.secrets[*]['name']}")
echo "Account secret name is $VAULT_SA_SECRET"

# Set VAULT_SA_JWT_TOKEN to the service account JWT used to access the TokenReview API
export VAULT_SA_JWT_TOKEN=$(kubectl -n test get secret $VAULT_SA_SECRET -o jsonpath="{.data.token}" | base64 --decode; echo)
echo "JWT is $VAULT_SA_JWT_TOKEN"

# Set VAULT_SA_CA_CRT to the PEM-encoded CA cert used to talk to the Kubernetes API
export VAULT_SA_CA_CRT=$(kubectl -n test get secret $VAULT_SA_SECRET -o jsonpath="{.data['ca\.crt']}" | base64 --decode; echo)
echo "Cert is $VAULT_SA_CA_CRT"

# Set K8S_CONTEXT to name of current context
export K8S_CONTEXT=$(kubectl config current-context)
echo "Context is $K8S_CONTEXT"

# Set K8S_HOST to server of current context
export K8S_HOST=$(kubectl config view -o jsonpath="{.clusters[?(@.name == \"$K8S_CONTEXT\")].cluster.server}"; echo)
echo "Host is $K8S_HOST"

# Enable Kubernetes Auth
vault auth enable --path kubernetes kubernetes

# Tell Vault how to communicate with the cluster
vault write auth/kubernetes/config \
    token_reviewer_jwt="$VAULT_SA_JWT_TOKEN" \
    kubernetes_host="$K8S_HOST" \
    kubernetes_ca_cert="$VAULT_SA_CA_CRT"

There's no mention in the docs of still requiring a vault-auth service account for Kubernetes auth, but I'm not sure how it's meant to work otherwise. Perhaps someone else can confirm/deny that it's required?

Edit: Turns out the vault-auth service account can be in any namespace (as you'd expect with a ClusterRole); not sure why that wasn't working for me before.
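
For completeness: per the Vault Kubernetes auth docs, the token-reviewer service account needs the system:auth-delegator ClusterRole so Vault can call the TokenReview API. A minimal sketch, assuming the test namespace used above:

kubectl -n test create serviceaccount vault-auth
kubectl create clusterrolebinding vault-auth-tokenreview \
    --clusterrole=system:auth-delegator \
    --serviceaccount=test:vault-auth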

vispatster commented 4 years ago

I have this working. A few things:

  • I have two service accounts on the application cluster (Cluster B):

    • vault-auth required for kubernetes-auth. It's in the application namespace (e.g. test)
    • vault-injector to manage the webhooks, admissions controllers etc. It's in the vault namespace (e.g. vault)
  • I have kubernetes auth configured in the Vault cluster (Cluster A) at the default path (/auth/kubernetes). As per this comment using a different auth path requires setting a ConfigMap - looking at doing that next
  • Kubernetes auth is configured to use the vault-auth service account from Cluster B:

There's no mention in the docs of still requiring a vault-auth service account for Kubernetes auth, but I'm not sure how it's meant to work otherwise. Perhaps someone else can confirm/deny that it's required?

It's just an assumption. At some point, this information needs to be verified by the Vault cluster; that is what lets Vault determine the correct K8s cluster.

vault write auth/kubernetes/config \
    token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
    kubernetes_host=https://${KUBERNETES_PORT_443_TCP_ADDR}:443 \
    kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Technically, we would not be able to use the annotation approach (comment) unless the Vault cluster and the application are on the same K8s cluster.

As you have mentioned, we still have to mount custom config files (SideCar sh). I am looking for a way to define the path in the config file. The official documentation shows how to define the path with the "vault auth" and "vault login" commands.
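
For reference, a standalone Vault Agent config can carry a non-default mount path in its auto_auth block; a minimal sketch in Vault's HCL, with the role name demo assumed:

auto_auth {
  method "kubernetes" {
    # point the agent at the Cluster-A-specific auth mount
    mount_path = "auth/kube-cluster-A"
    config = {
      role = "demo"
    }
  }
  sink "file" {
    config = {
      path = "/home/vault/.token"
    }
  }
}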

sbeaulie commented 4 years ago

I followed @stevegore's approach and deployed the Vault injector only in the cluster with the app pods. Then I used the SA in that cluster to configure the Vault auth in the other cluster.

I'm getting a different error in GKE:


URL: PUT https://<my_server>/v1/auth/kubernetes/login
Code: 500. Errors:

* Post https://kubernetes.default.svc/apis/authentication.k8s.io/v1/tokenreviews: x509: certificate signed by unknown authority" backoff=1.734589217

Do I need something else on the injector-only cluster to use the TLS-enabled Vault server in the other cluster?

I noticed I could add this annotation to my app pods, but I'm not sure where this Kubernetes resource needs to exist: vault.hashicorp.com/tls-secret: ""

Looking at the deploy files for the injector-only configuration, this secret is not created: https://github.com/hashicorp/vault-k8s/tree/master/deploy
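
A sketch of how that annotation is commonly wired up: the secret lives in the app pod's namespace and, per the annotation docs, is mounted into the agent at /vault/tls. The secret name, namespace, and CA file here are assumptions:

kubectl -n test create secret generic vault-tls-ca \
    --from-file=ca.crt=./vault-ca.pem

# then, on the app pod:
#   vault.hashicorp.com/tls-secret: "vault-tls-ca"
#   vault.hashicorp.com/ca-cert: "/vault/tls/ca.crt"   # CA the agent should trust for Vault's TLS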

stevegore commented 4 years ago

It looks to me like your Vault cluster isn't able to talk to your app cluster, which would indicate that your Kubernetes auth isn't fully configured. This command is meant to store the certificate in the Vault config:

vault write auth/kubernetes/config \
    token_reviewer_jwt="$VAULT_SA_JWT_TOKEN" \
    kubernetes_host="$K8S_HOST" \
    kubernetes_ca_cert="$VAULT_SA_CA_CRT"

I had this little script to check K8s auth from my laptop. Note that I've configured the K8s auth endpoint at auth/kubernetes/xxx:

#!/bin/bash
set -e

VAULT_SA_NAME=temp

if ! kubectl -n default get sa | grep -q $VAULT_SA_NAME; then
    kubectl -n default create sa $VAULT_SA_NAME
fi

# Set VAULT_SA_SECRET to the secret containing service account credentials
VAULT_SA_SECRET=$(kubectl -n default get sa $VAULT_SA_NAME -o jsonpath="{.secrets[*]['name']}")
echo "Account secret name is $VAULT_SA_SECRET"

# Set VAULT_SA_JWT_TOKEN to the JWT we will validate
VAULT_SA_JWT_TOKEN=$(kubectl -n default get secret $VAULT_SA_SECRET -o jsonpath="{.data.token}" | base64 --decode; echo)
echo "JWT is $VAULT_SA_JWT_TOKEN"

# Set K8S_CONTEXT to name of current context
K8S_CONTEXT=$(kubectl config current-context)
echo "Context is $K8S_CONTEXT"

curl \
    --request POST \
    --data "{\"jwt\": \"$VAULT_SA_JWT_TOKEN\", \"role\": \"test\"}" \
    -s https://vault.q-ctrl.com/v1/auth/kubernetes/$K8S_CONTEXT/login | jq

sbeaulie commented 4 years ago

Thanks @stevegore for the script. I confirmed the Kubernetes auth config is right by reading it from the Vault CLI:

vault read auth/kubernetes/config

I can see that the kubernetes_ca_cert is there (and refers to the secret in the app cluster) and the kubernetes_host is set properly.

I modified your script a little to account for my auth endpoint, namespace, and role, but the curl also returns the same error:

  "errors": [
    "Post https://<my host>/apis/authentication.k8s.io/v1/tokenreviews: x509: certificate signed by unknown authority"
  ]
}

If I configure the Kubernetes auth from the cluster that has Vault installed, I'm able to use it, but as soon as I switch to the app cluster's service account it starts failing. I'm still waiting on official documentation from HashiCorp on how to do the dual-cluster integration, because maybe something was missed there...

stevegore commented 4 years ago

Have you tried calling that endpoint directly to see what certificate you're getting? FYI the equivalent call to the /tokenreviews endpoint would look something like this:

curl -iv --location --request POST 'https://<my host>/apis/authentication.k8s.io/v1/tokenreviews' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer insertjwtfromserviceaccounthere' \
--data-raw '{
     "kind": "TokenReview",
     "apiVersion": "authentication.k8s.io/v1",
     "metadata":{
         "name": "sample"
     },
     "spec": {
        "token": "inserttokentovalidatehere"
     }
 }'

But I agree that more documentation would be great.

sbeaulie commented 4 years ago

I think I know what was not working: I was setting up the Kubernetes auth mechanism in Vault with kubernetes_host="$K8S_HOST" pointing to the K8s master servers of the cluster that has Vault, not the app cluster. I cannot get my K8S_HOST from kubectl config view like you did because I'm running in GKE and the control plane is managed by Google. I can connect to pods, and they have an env var KUBERNETES_SERVICE_HOST that points to their control plane.
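
On GKE, the app cluster's endpoint and CA can also be pulled with gcloud (a sketch; the cluster name and zone are placeholders):

gcloud container clusters describe app-cluster --zone us-central1-a \
    --format='value(endpoint)'

gcloud container clusters describe app-cluster --zone us-central1-a \
    --format='value(masterAuth.clusterCaCertificate)' | base64 --decode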

I had read here that a firewall rule needed to be opened for the webhook (https://github.com/hashicorp/vault-k8s/issues/32#issuecomment-578121917; I have already done this step), so maybe something similar needs to be opened from the Vault pods to the K8s masters in the other cluster? I haven't seen anyone mention that port 443 needs to be opened between the two clusters, but that will be my next attempt. I checked every doc referencing the Kubernetes auth setup, and it is very light on multi-cluster setups; most just say kubernetes_host="$K8S_HOST" without specifying what that means.

In summary:

stevegore commented 4 years ago

Interesting. FWIW I'm also running this on GKE. This is still in early stages so we haven't yet hardened our cluster with private IPs for the Kubernetes API, which could be where things differ. Our API has a public IP with no firewall restrictions, just authentication.

Not sure if this is helpful to you, but if you're using GKE, you can also go to the console to get the Endpoint and cluster CA certificate.

[screenshot: GKE console cluster details showing the Endpoint and cluster CA certificate]

Kampe commented 4 years ago

I'm seeing this same issue with my remote cluster. How'd you get around it?
x509: certificate signed by unknown authority

msenmurugan commented 3 years ago

@stevegore @sbeaulie @Kampe

I'm also using a GKE private cluster. Can you please elaborate on the steps you took to set up the other cluster (the app cluster) to talk to Vault?

I have set up the app cluster's K8s auth in Vault using the following commands:

export KUBE_CA_CERT=$(kubectl config view --raw --minify --flatten -o jsonpath='{.clusters[].cluster.certificate-authority-data}' | base64 --decode)
export KUBE_HOST=$(kubectl config view --raw --minify --flatten -o jsonpath='{.clusters[].cluster.server}')

vault write auth/gke2/config \
    token_reviewer_jwt="$SA_JWT_TOKEN" \
    kubernetes_host="$KUBE_HOST" \
    kubernetes_ca_cert="$KUBE_CA_CERT"

Used the Vault annotation on the app pod: vault.hashicorp.com/auth-path: "auth/gke2"

Now I'm getting the issue below:

2021-06-16T05:26:01.712Z [INFO] auth.handler: authenticating
2021-06-16T05:27:01.713Z [ERROR] auth.handler: error authenticating: error="context deadline exceeded" backoff=2m34.65s

Can you please provide some pointers on how to fix it?
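
"context deadline exceeded" is usually a plain network timeout rather than an auth failure, so it's worth checking both hops from inside the pods. A sketch; the pod, container, and host names are assumptions:

# From the app pod's agent init container: can the agent reach Vault at all?
kubectl -n <app-namespace> exec <app-pod> -c vault-agent-init -- \
    wget -qO- https://<vault-addr>/v1/sys/health

# From a Vault server pod: can Vault reach the app cluster's API server?
kubectl -n vault exec vault-0 -- \
    wget -qO- --no-check-certificate https://<kube-host>/version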

jghal commented 3 years ago

I am in AWS with an EKS cluster for devwebapp-with-annotations and a K3s single-node cluster on an EC2 instance running the Vault server. I was seeing the same error as @msenmurugan, but then noticed that the devwebapp-with-annotations pod has an Istio sidecar, and found in #41 that annotating it with vault.hashicorp.com/agent-init-first: "true" resolved the auth.handler error for me.
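
For reference, a minimal sketch of that annotation alongside the usual inject annotations (the role name is an assumption):

spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "devweb-app"        # hypothetical Vault role
        # run the Vault agent init container before other init containers
        # (e.g. Istio's), so auth traffic isn't intercepted by the mesh
        vault.hashicorp.com/agent-init-first: "true"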