hashicorp / vault-helm

Helm chart to install Vault and other associated components.
Mozilla Public License 2.0

Agent Injector on EKS is not working. #989

Closed alifiroozi80 closed 5 months ago

alifiroozi80 commented 5 months ago

Hello folks, I've installed Vault 1.15.2 in my EKS cluster. The K8s cluster version is 1.28.X.

I've enabled the Vault Agent Injector, but it's not working in the EKS cluster! The exact same configuration works on a bare-metal cluster, but not on an EKS one!

apiVersion: apps/v1
kind: Deployment
metadata:
  name: basic-secret
  namespace: staging
  labels:
    app: basic-secret
spec:
  selector:
    matchLabels:
      app: basic-secret
  replicas: 1
  template:
    metadata:
      annotations:
        vault.hashicorp.com/log-level: "debug"
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/tls-skip-verify: "true"
        vault.hashicorp.com/agent-inject-status: "update"
        vault.hashicorp.com/role: "hh-staging"
        vault.hashicorp.com/agent-inject-secret-test: "secret/company/test"
        vault.hashicorp.com/agent-inject-template-test: |
          {{- with secret "secret/company/test" -}}
          {
            "username" : "{{ .Data.username }}",
            "password" : "{{ .Data.password }}"
          }
          {{- end }}
      labels:
        app: basic-secret
    spec:
      serviceAccountName: hh-staging
      containers:
      - name: app
        image: jweissig/app:0.0.1

But when the pod is created, there is no sidecar!

$ k -n staging get po -l app=basic-secret -w
NAME                           READY   STATUS    RESTARTS   AGE
basic-secret-9d4df55d4-smljt   0/1     Pending   0          0s
basic-secret-9d4df55d4-smljt   0/1     Pending   0          0s
basic-secret-9d4df55d4-smljt   0/1     ContainerCreating   0          0s
basic-secret-9d4df55d4-smljt   1/1     Running             0          1s

Here are my Vault installation values:

global:
  enabled: true
  tlsDisable: true

injector:
  enabled: true

server:

  dataStorage:
    enabled: true
    size: 5Gi
    mountPath: "/vault/data"
    storageClass: efs-sc 
    accessMode: ReadWriteOnce

  dev:
    enabled: false
  standalone:
    enabled: false
  affinity: ""
  ha:
    enabled: true
    replicas: 3
    # Enables Raft integrated storage
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        listener "tcp" {
          tls_disable = 1

          address = "[::]:8200"
          cluster_address = "[::]:8201"
          # Enable unauthenticated metrics access (necessary for Prometheus Operator)
          telemetry {
            unauthenticated_metrics_access = "true"
          }
        }

        storage "raft" {
          path = "/vault/data"
        }

        service_registration "kubernetes" {}

ui:
  enabled: true

csi:
  enabled: false

Here is the vault-agent-injector log:

$ k -n vault logs vault-agent-injector-55748c487f-q2c6s
2024-01-08T13:37:54.372Z [INFO]  handler.auto-tls: Generated CA
2024-01-08T13:37:54.377Z [INFO]  handler: Starting handler..
Listening on ":8080"...
2024-01-08T13:37:54.472Z [INFO]  handler.certwatcher: Updated certificate bundle received. Updating certs...
2024-01-08T13:37:54.481Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:37:54.487Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:37:54.487Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:37:54.487Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:37:54.487Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:37:54.487Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:37:54.488Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...
2024-01-08T13:40:36.487Z [INFO]  handler.certwatcher: Webhooks changed. Updating certs...

Note 1: Again, this exact configuration and manifest work perfectly on another, self-hosted K8s cluster.

Note 2: I've already searched and found a similar problem reported on GKE, where you have to open up a couple of ports. On EKS, everything should work as expected without any further steps, but it doesn't.

I appreciate any help.

tvoran commented 5 months ago

Hi @alifiroozi80, we've run into this on EKS before too, and it turned out to be a connectivity issue (similar to what you mentioned about GKE). Finding the correct security group to modify can be tricky, but if you're using the EKS module for Terraform, adding something like this might help open port 8080 from the K8s API server to the nodes:

  node_security_group_additional_rules = {
    ingress_vault_injector_webhook = {
      description                   = "Access to Vault Agent Injector webhook endpoint from API server"
      protocol                      = "tcp"
      from_port                     = 8080
      to_port                       = 8080
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }
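
If the cluster is not managed through the EKS Terraform module, the same rule could probably be written as a standalone `aws_security_group_rule` resource. This is only a minimal sketch: the two variables and the resource name `vault_injector_webhook` are hypothetical and not from this thread, and you would need to supply the IDs of your own node and control-plane security groups. Port 8080 matches the `Listening on ":8080"` line in the injector log above.

```hcl
# Hypothetical inputs (assumptions): replace with your own security group IDs.
variable "node_security_group_id" {
  type        = string
  description = "Security group attached to the worker nodes"
}

variable "cluster_security_group_id" {
  type        = string
  description = "Security group used by the EKS control plane (API server)"
}

# Allow the API server to reach the injector's webhook port on the nodes.
resource "aws_security_group_rule" "vault_injector_webhook" {
  description              = "Access to Vault Agent Injector webhook endpoint from API server"
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = 8080
  to_port                  = 8080
  security_group_id        = var.node_security_group_id
  source_security_group_id = var.cluster_security_group_id
}
```

After applying a rule like this, recreating the pod should show whether the sidecar is now injected.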

alifiroozi80 commented 5 months ago

Hello @tvoran, that was precisely the case! Thanks, man! I do indeed use Terraform, and after adding your block to my Terraform scripts, everything is now working as expected. Thanks again 🍻

raypet-cillco commented 4 months ago

I was fighting this for an entire day. Thanks a bunch, @tvoran!

yunusemrecatalcam commented 4 months ago

saved my day, thanks!