hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

Failed TLS Handshake with Vault in Kubernetes Helm Chart #22111

Open MikeK184 opened 1 year ago

MikeK184 commented 1 year ago

I have successfully deployed Vault in HA mode using Raft storage with AWS KMS auto-unseal. I now want to enable TLS for internal communication, but after configuring the Helm chart I get the following errors:

2023-07-28T14:23:24.226Z [INFO]  http: TLS handshake error from 10.42.189.46:37944: remote error: tls: bad certificate
2023-07-28T14:23:24.226Z [ERROR] core: failed to get raft challenge: leader_addr=https://vault-2.vault-internal:8200 error="error during raft bootstrap init call: Put \"https://vault-2.vault-internal:8200/v1/sys/storage/raft/bootstrap/challenge\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
2023-07-28T14:23:24.314Z [ERROR] core: failed to get raft challenge: leader_addr=https://vault-0.vault-internal:8200 error="error during raft bootstrap init call: Put \"https://vault-0.vault-internal:8200/v1/sys/storage/raft/bootstrap/challenge\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
2023-07-28T14:23:24.314Z [ERROR] core: failed to retry join raft cluster: retry=2s err="failed to get raft challenge"
2023-07-28T14:23:24.393Z [INFO]  http: TLS handshake error from 10.42.56.249:47514: remote error: tls: bad certificate
2023-07-28T14:23:25.506Z [INFO]  http: TLS handshake error from 10.42.28.13:50416: remote error: tls: bad certificate
2023-07-28T14:23:26.131Z [INFO]  core: stored unseal keys supported, attempting fetch
2023-07-28T14:23:26.131Z [WARN]  failed to unseal core: error="stored unseal keys are supported, but none were found"
2023-07-28T14:23:26.315Z [INFO]  core: security barrier not initialized
2023-07-28T14:23:26.323Z [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault-0.vault-internal:8200
2023-07-28T14:23:26.324Z [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault-1.vault-internal:8200
2023-07-28T14:23:26.324Z [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault-2.vault-internal:8200

I used the following steps to create the necessary files:

# Generate the private key
openssl genrsa -out vault.key 2048

# Create the certificate signing request (CSR)
openssl req -new -key vault.key -out vault.csr -subj "/CN=vault-internal"

# Create the vault-csr.conf file with the correct extensions
cat <<EOF > vault-csr.conf
[req]
default_bits = 2048
prompt = no
encrypt_key = yes
default_md = sha256
distinguished_name = vault_internal
req_extensions = v3_req

[vault_internal]

[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = @alt_names

[alt_names]
DNS.1 = *.vault-internal
DNS.2 = *.vault-internal.vault
DNS.3 = *.vault-internal.vault.svc
DNS.4 = *.vault-internal.vault.svc.cluster.local
IP.1 = 127.0.0.1
EOF

# Create a self-signed certificate from the CSR, signed with the same private key
openssl x509 -req -in vault.csr -signkey vault.key -out vault.crt -extensions v3_req -extfile vault-csr.conf

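# Create the Kubernetes secret holding the key, the certificate, and the CA (here the same self-signed certificate)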
kubectl create secret generic vault-ha-tls --namespace vault \
  --from-file=vault.key=vault.key \
  --from-file=vault.crt=vault.crt \
  --from-file=vault.ca=vault.crt

# Verify the certificate against itself (it is self-signed)
openssl verify -CAfile vault.crt vault.crt
vault.crt: OK

And this is my values-overwrite.yaml file:

# Vault Helm Chart Value Overrides
global:
  enabled: true
  tlsDisable: false

injector:
  enabled: true

server:
  logLevel: "debug"
  dataStorage:
    enabled: true
    size: 4Gi
    mountPath: "/vault/data"
    storageClass: null
    accessMode: ReadWriteOnce
    annotations: {}

  auditStorage:
    enabled: true

  standalone:
    enabled: false

  #* TLS configuration
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-ha-tls/vault.ca
    VAULT_TLSCERT: /vault/userconfig/vault-ha-tls/vault.crt
    VAULT_TLSKEY: /vault/userconfig/vault-ha-tls/vault.key
  volumes:
    - name: userconfig-vault-ha-tls
      secret:
        defaultMode: 420
        secretName: vault-ha-tls
  volumeMounts:
    - mountPath: /vault/userconfig/vault-ha-tls
      name: userconfig-vault-ha-tls
      readOnly: true

  extraSecretEnvironmentVars:
    - envName: AWS_ACCESS_KEY_ID
      secretName: aws-creds
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: aws-creds
      secretKey: AWS_SECRET_ACCESS_KEY

  ha:
    enabled: true
    replicas: 3

    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        listener "tcp" {
          tls_disable = 0
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-ha-tls/vault.crt"
          tls_key_file  = "/vault/userconfig/vault-ha-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-ha-tls/vault.ca"
          telemetry {
            unauthenticated_metrics_access = "true"
          }
        }

        storage "raft" {
          path = "/vault/data"

          retry_join {
            leader_api_addr = "https://vault-0.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "https://vault-1.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "https://vault-2.vault-internal:8200"
          }
        }

        seal "awskms" {
          region     = "eu-central-1"
          kms_key_id = "9a62ea2c-92f4-4323-9a06-84c2e63dfe9b"
        }

        disable_mlock = true
        service_registration "kubernetes" {}

I've also tried following an official guide: https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-minikube-tls. With that approach, however, I can't even execute vault operator init, because the container immediately complains that the CA is from an unknown authority. With my own steps, I can unseal the first pod, but the other two instances don't get auto-unsealed and complain about the TLS configuration.

maxb commented 1 year ago

This line shows a misunderstanding:

          tls_client_ca_file = "/vault/userconfig/vault-ha-tls/vault.ca"

You are not using client certificate authentication, so you should not set this. It does not break anything, but it misleads you and others reading the configuration into thinking you have set a relevant setting, when you have not.

    VAULT_CACERT: /vault/userconfig/vault-ha-tls/vault.ca

This environment variable is primarily intended for configuring a Vault client, not a server. It has one limited use in connection with a Vault server: configuring the CA certificate to trust when connecting to another Vault that is being used as a transit auto-unseal method. It is otherwise unused. Since you are not using transit auto-unseal, you should remove this variable, again to avoid implying that you are setting something necessary when it is actually redundant.

    VAULT_TLSCERT: /vault/userconfig/vault-ha-tls/vault.crt
    VAULT_TLSKEY: /vault/userconfig/vault-ha-tls/vault.key

These environment variables are simply incorrect names that appear nowhere in the Vault source code.

What you actually need to set is this: https://developer.hashicorp.com/vault/docs/configuration/storage/raft#leader_ca_cert_file
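
For illustration, here is a minimal sketch of how the listener and storage stanzas from the original configuration might look with the suggestions above applied: tls_client_ca_file is dropped from the listener, and each retry_join block gets leader_ca_cert_file. It assumes the CA file is the one already mounted from the vault-ha-tls secret at /vault/userconfig/vault-ha-tls/vault.ca. The thread does not confirm that this change alone resolves the errors, so treat it as a starting point rather than a verified fix.

```hcl
listener "tcp" {
  tls_disable     = 0
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_cert_file   = "/vault/userconfig/vault-ha-tls/vault.crt"
  tls_key_file    = "/vault/userconfig/vault-ha-tls/vault.key"
  # tls_client_ca_file is omitted: client certificate authentication is not in use.
}

storage "raft" {
  path = "/vault/data"

  retry_join {
    leader_api_addr     = "https://vault-0.vault-internal:8200"
    # Assumption: the CA that signed the leader's certificate, mounted from the vault-ha-tls secret.
    leader_ca_cert_file = "/vault/userconfig/vault-ha-tls/vault.ca"
  }
  retry_join {
    leader_api_addr     = "https://vault-1.vault-internal:8200"
    leader_ca_cert_file = "/vault/userconfig/vault-ha-tls/vault.ca"
  }
  retry_join {
    leader_api_addr     = "https://vault-2.vault-internal:8200"
    leader_ca_cert_file = "/vault/userconfig/vault-ha-tls/vault.ca"
  }
}
```

Without leader_ca_cert_file, a joining node has no configured CA for the self-signed certificate presented by the leader, which is consistent with the "certificate signed by unknown authority" errors in the log above.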

MikeK184 commented 1 year ago

I appreciate the quick response and the explanations. However, if you have a look here: https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-minikube-tls#deploy-the-vault-cluster-via-helm-with-overrides you will see that the two environment variables are set there. I obviously have no idea which fields are actually relevant, but one would assume that stuff you find on the official website are valid...

I'll give the link you provided a try though!

maxb commented 1 year ago

one would assume that stuff you find on the official website are valid

If only that were true. In my experience the quality of those tutorials is pretty patchy. And there's not even a way to propose edits to them.

maxb commented 1 year ago

Actually, let me tag @hsimon-hashicorp who may be able to route this feedback somewhere so that it does some good :-)

heatherezell commented 1 year ago

And subsequently, I'll tag in our Education team - @yhyakuna and @schavis :)

tuxillo commented 1 year ago

Has the documentation been updated? What was the outcome? I'm having exactly the same issue as @MikeK184 had.

mohamedmshokry commented 6 months ago

@MikeK184 did you manage to find the root cause or fix the CA issue?