hashicorp / vault-helm

Helm chart to install Vault and other associated components.
Mozilla Public License 2.0

Feature request: Let's Encrypt for Vault itself #385

Open glerchundi opened 4 years ago

glerchundi commented 4 years ago

Is your feature request related to a problem? Please describe.

If Vault is secured end-to-end with self-signed certificates (which seems to be the most common way of deploying it), anyone who accesses Vault needs the CA certificate to verify the authenticity of the server certificate and avoid MITM attacks.

This requires a secure method of provisioning the CA certificate, which is not completely trivial. It can be done by publishing it somewhere that is already trusted and verified, such as the company website: https://mycompany.com/ca.pem.

Describe the solution you'd like

Ideally I would like to spare the people administering Vault the hassle of trusting this CA. I don't know if it's even possible, but here is my proposal:

Describe alternatives you've considered

Providing a custom CA cert out of band.

glerchundi commented 4 years ago

Coming from https://github.com/hashicorp/vault/issues/9711.

fbongiovanni29 commented 4 years ago

Would really like to have this feature or maybe a guide on how to do this

alwaysastudent commented 3 years ago

Would like to do this outside of K8s, on a baremetal standalone cluster.

Chili-Man commented 3 years ago

this would be great; have any of you guys tried doing this but with cert-manager to manage the Let's Encrypt secrets?

iuriaranda commented 3 years ago

this would be great; have any of you guys tried doing this but with cert-manager to manage the Let's Encrypt secrets?

We're using Vault this way actually, and it works well. The only problem we have is that when cert-manager renews the certificate, we have no way of notifying Vault so it reloads it from the secret volume mount. We're looking for a clean solution to accomplish that atm. Any ideas?

glerchundi commented 3 years ago

@iuriaranda would you mind adding more info.?

Thanks ;)

glerchundi commented 3 years ago

I have been thinking about this today and came up with a possible solution that could lead to an auto-managed deployment of Vault with Let's Encrypt:

Although the paper can take whatever you write on it, I'm pretty confident it could work.

I'm a little bit worried about the shareProcessNamespace and its security implications though.

WDYT? /cc @jasonodonnell

iuriaranda commented 3 years ago

@iuriaranda would you mind adding more info.?

  • What challenge method are you using?
  • If it's not the DNS-based one, how do you solve the proxying problem with HTTP/TLS, using an Envoy/Nginx in front of Vault?
  • How do you distinguish between internal and external traffic? I mean, did you need to add a specific/static IP address to the config, the way I explained?

Thanks ;)

Sorry for the late response, I missed your comment.

We're using the DNS challenge with cert-manager. For some of our setups, we still have a public LB in front of Vault to allow for external connections though.

External traffic goes through the load balancer, and cluster workloads can still reach Vault via the k8s service. We don't configure any static IP; Vault is deployed with the default listener, which afaik listens on 0.0.0.0.

glerchundi commented 3 years ago

Thanks for your response @iuriaranda.

I assume that you're not using Integrated Storage, right? This is probably happening to us because we want both at the same time: a Let's Encrypt-protected public address and TLS-protected internal communication.

Why two different certs? Because Let's Encrypt only provides the certificate for our corporate domain, vault.corpdomain.com, whereas the internal TLS certificate includes all the required SANs: vault-0.vault-internal, vault-1.vault-internal, ...

This lets us define our config as follows through the ha.raft.config parameter:

listener "tcp" {
  address                  = "0.0.0.0:8200"
  tls_key_file             = "/vault/userconfig/vault-tls/vault-key.pem"
  tls_cert_file            = "/vault/userconfig/vault-tls/vault.pem"
  tls_client_ca_file       = "/vault/userconfig/vault-tls/ca.pem"
  tls_disable_client_certs = true
}
storage "raft" {
  path = "/vault/data"
  retry_join {
    leader_api_addr         = "https://vault-0.vault-internal:8200"
    leader_client_key_file  = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file     = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr         = "https://vault-1.vault-internal:8200"
    leader_client_key_file  = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file     = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr         = "https://vault-2.vault-internal:8200"
    leader_client_key_file  = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file     = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr         = "https://vault-3.vault-internal:8200"
    leader_client_key_file  = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file     = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr         = "https://vault-4.vault-internal:8200"
    leader_client_key_file  = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file     = "/vault/userconfig/vault-tls/ca.pem"
  }
}
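
The five retry_join stanzas above differ only in the pod ordinal, so they could be generated rather than written by hand. A minimal sketch, assuming the same file names and pod naming as the config above; render_retry_join is a hypothetical helper, not something the chart provides:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print one retry_join stanza per replica.
# $1 = number of replicas, $2 = directory holding the internal TLS files.
render_retry_join() {
  count="$1"
  tls_dir="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    cat <<EOF
  retry_join {
    leader_api_addr         = "https://vault-$i.vault-internal:8200"
    leader_client_key_file  = "$tls_dir/vault-key.pem"
    leader_client_cert_file = "$tls_dir/vault.pem"
    leader_ca_cert_file     = "$tls_dir/ca.pem"
  }
EOF
    i=$((i + 1))
  done
}

# Wrap the generated stanzas in the storage block used above.
stanzas="$(render_retry_join 5 /vault/userconfig/vault-tls)"
printf 'storage "raft" {\n  path = "/vault/data"\n%s\n}\n' "$stanzas"
```

The output can be pasted into ha.raft.config; the replica count passed to the helper should match ha.replicas.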

The idea would be to create another listener stanza just for the sake of listening on it for public requests, for example:

listener "tcp" {
  address                  = "0.0.0.0:8202"
  tls_key_file             = "/vault/userconfig/vault-lets-encrypt/vault-key.pem"
  tls_cert_file            = "/vault/userconfig/vault-lets-encrypt/vault.pem"
  tls_disable_client_certs = true
}

Note that this listener uses port 8202 and the certificates provided by Let's Encrypt.

Then configure ui-service.yaml with a custom targetPort, different from the default 8200, which is currently hardcoded: https://github.com/hashicorp/vault-helm/blob/master/templates/ui-service.yaml#L28

WDYT @jasonodonnell, would you be open to a PR to customize that ui-service.yaml targetPort through another values.yaml parameter?

/cc @dcanadillas

glerchundi commented 3 years ago

In an ideal world, the workaround I proposed here (cert-manager-csi + inotify) could (somehow) make it into Vault core, in the same vein as service_registration "kubernetes" {}.

I'm envisioning something like this:

listener "tcp" {
  address            = "0.0.0.0:8202"
  tls_key_k8s_secret = "vault-lets-encrypt"
}

glerchundi commented 3 years ago

I created an issue in Vault for a feature request I feel the core team could be willing to accept: https://github.com/hashicorp/vault/issues/10615

jamesgoodhouse commented 2 years ago

this would be great; have any of you guys tried doing this but with cert-manager to manage the Let's Encrypt secrets?

We're using Vault this way actually, and it works well. The only problem we have is that when cert-manager renews the certificate, we have no way of notifying Vault so it reloads it from the secret volume mount. We're looking for a clean solution to accomplish that atm. Any ideas?

Something like the following would work:

#!/bin/sh

set -eu

cert_checksum="$(sha256sum /tls/tls.crt | awk '{ print $1 }')"
echo "$(date -u) — Current checksum '$cert_checksum'"

inotifywait -q -m /tls |
  while read -r path action file; do
    if [ "$file" = "tls.crt" ]; then
      new_cert_checksum="$(sha256sum /tls/tls.crt | awk '{ print $1 }')"

      if [ "$cert_checksum" != "$new_cert_checksum" ]; then
        echo "$(date -u) — New checksum '$new_cert_checksum'"
        cert_checksum="$new_cert_checksum"
        vault_pid=$(cat /pids/vault_server.pid)
        echo "$(date -u) — Sending SIGHUP signal to Vault (pid=$vault_pid)"
        # POSIX specifies signal names without the SIG prefix, so -HUP is the portable form
        kill -HUP "$vault_pid"
      fi
    fi
  done

flyte commented 2 years ago

Something like the following would work

I've made a container image based on this pattern, but I removed inotifywait because it was triggering constantly on OPEN, ACCESS, CLOSE_NOWRITE and CLOSE, but never on modification events (something to do with how secrets are mounted, I suppose). As a result, sha256sum was running almost constantly.

This version just checks the sha256sum of the cert once a minute and reloads Vault using killall, so there's no need to find the PID from somewhere.

#!/bin/sh

set -e

cert_path="$1"
if [ "$cert_path" = "" ]; then
  echo "Must include path to cert as first argument"
  exit 1
fi

set -u

cert_hash="$(sha256sum "$cert_path" | awk '{ print $1 }')"
echo "$(date -u) - Current checksum '$cert_hash'"

# Poll once a minute; reload Vault only when the checksum changes.
while true; do
  new_cert_hash="$(sha256sum "$cert_path" | awk '{ print $1 }')"
  if [ "$cert_hash" != "$new_cert_hash" ]; then
    echo "$(date -u) - New checksum '$new_cert_hash'"
    cert_hash="$new_cert_hash"
    echo "$(date -u) - Sending SIGHUP signal to Vault"
    killall -HUP vault
  fi
  sleep 60
done

Here's how I've integrated it into the helm chart:

  extraContainers:
    - name: cert-watcher
      image: ghcr.io/flyte/docker-vault-cert-reloader:1.0.4
      args:
        - /var/run/secrets/vault-tls/tls.crt
      volumeMounts:
        - name: vault-tls
          mountPath: /var/run/secrets/vault-tls
          readOnly: true
  shareProcessNamespace: true

Git repo here: https://github.com/flyte/docker-vault-cert-reloader

ismferd commented 2 years ago

I tried the workaround from @flyte and it works as I expected. Thanks.

pksurferdad commented 2 years ago

@iuriaranda can you provide a bit more detail on how you accomplished this https://github.com/hashicorp/vault-helm/issues/385#issuecomment-749401401, particularly how you configured the cert request to cert manager? Are you using the same cert from cert manager for both internal (within the cluster) and external traffic? Or are you using 2 separate certs because of vault's SAN requirements? Do you mind sharing vault helm values file?

My issue is that i have a single vault cluster on a separate k8s/AWS EKS instance providing services to several / separate k8s/AWS EKS clusters and i need a signed cert for both external (UI) and internal (API) communications. The separate k8s clusters are using the vault-agent-injector to communicate with the vault instance over HTTPS (e.g. api.vault.example.com) and the UI is accessible from HTTPS as well (ui.vault.example.com). I'm currently using a self-signed cert but this does not work for the API in k8s 1.21 using short-lived tokens (https://www.vaultproject.io/docs/auth/kubernetes#kubernetes-1-21) failing with the error below.

x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"

pksurferdad commented 2 years ago

hi @glerchundi i see that your PR https://github.com/hashicorp/vault-helm/pull/437 got merged, were you able to get your suggested configuration https://github.com/hashicorp/vault-helm/issues/385#issuecomment-749560213 working?

pksurferdad commented 2 years ago

i was able to resolve my issue. for anyone coming across this, i created separate vault listeners for the vault UI and API and i'm using let's encrypt for the cert. i run k8s on AWS EKS and i manually edited the vault-active service to add the additional 8203 port so it can persist on the AWS NLB.
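
The manual Service edit mentioned above could also be scripted as a JSON Patch. This is only a sketch: the port name https-api-ext, the vault namespace, and the helper service_port_patch are assumptions for illustration, and the kubectl call is left as a comment so the snippet runs without a cluster.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: build a JSON Patch that appends one port to a Service.
# $1 = port name, $2 = port number (used for both port and targetPort).
service_port_patch() {
  printf '[{"op":"add","path":"/spec/ports/-","value":{"name":"%s","port":%s,"targetPort":%s,"protocol":"TCP"}}]\n' "$1" "$2" "$2"
}

patch="$(service_port_patch https-api-ext 8203)"
echo "$patch"

# To apply it (not run here; namespace and port name are assumptions):
#   kubectl -n vault patch service vault-active --type=json -p "$patch"
```

Because the change is made outside the chart, it is worth re-checking the Service after each helm upgrade.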

i haven't gotten this https://github.com/hashicorp/vault-helm/issues/385#issuecomment-1157883999 to work yet, but that's next. for the internal vault listener, i'm still using a self-signed generated cert. below is the HA config that is currently working.

  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        # listener for the vault cluster
        listener "tcp" {
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
          tls_disable = "false"
          tls_disable_client_certs = "true"
          tls_require_and_verify_client_cert="false"
        }

        storage "raft" {
          path = "/vault/data"
          retry_join {
            leader_api_addr = "https://vault-0.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-1.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-2.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
          }
        }

        # external listener for the vault UI
        listener "tcp" {
          address                  = "0.0.0.0:8202"
          tls_key_file             = "/vault/userconfig/vault-server-tls-letsencrypt/tls.key"
          tls_cert_file            = "/vault/userconfig/vault-server-tls-letsencrypt/tls.crt"
          tls_disable = false
          tls_disable_client_certs = "true"
          tls_require_and_verify_client_cert="false"
        }

        # external listener for the vault API
        listener "tcp" {
          address = "[::]:8203"
          tls_cert_file = "/vault/userconfig/vault-server-tls-letsencrypt/tls.crt"
          tls_key_file  = "/vault/userconfig/vault-server-tls-letsencrypt/tls.key"
          tls_disable = "false"
          tls_disable_client_certs = "true"
          tls_require_and_verify_client_cert="false"
        }

        service_registration "kubernetes" {}

jdloft commented 1 year ago

Just a note @flyte, something like this with inotifywait works for us:

set -e
while inotifywait -e delete,delete_self /vault/userconfig/vault-server-tls/tls.crt /vault/userconfig/vault-ui-tls/tls.crt; do
  echo "Cert changed; Reloading vault"
  kill -HUP $(pidof vault)
done

Volume mounts get re-linked which triggers DELETE_SELF.
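
The re-linking can be reproduced locally. Kubernetes projects Secret volumes through a ..data symlink that points at a per-revision directory and is swapped on update, after which the old directory is removed. The sketch below imitates that layout in a temp directory; the names ..rev1 and ..rev2 are simplified stand-ins (the real directories are timestamped), and mv -T and readlink -f assume GNU or busybox tools.

```shell
#!/bin/sh
set -eu

vol="$(mktemp -d)"

# Initial state: tls.crt -> ..data/tls.crt, and ..data -> ..rev1
mkdir "$vol/..rev1"
printf 'old-cert\n' > "$vol/..rev1/tls.crt"
ln -s ..rev1 "$vol/..data"
ln -s ..data/tls.crt "$vol/tls.crt"
before="$(readlink -f "$vol/tls.crt")"

# Update: write the new revision, swap ..data by rename, drop the old one.
mkdir "$vol/..rev2"
printf 'new-cert\n' > "$vol/..rev2/tls.crt"
ln -s ..rev2 "$vol/..data_tmp"
mv -T "$vol/..data_tmp" "$vol/..data"
rm -r "$vol/..rev1"
after="$(readlink -f "$vol/tls.crt")"
content="$(cat "$vol/tls.crt")"

echo "resolved before: $before"
echo "resolved after:  $after"
echo "content now:     $content"

rm -rf "$vol"
```

The path tls.crt never changes, but the file it resolves to is replaced and the old one deleted. That would explain why a watch on the resolved file sees DELETE_SELF instead of MODIFY, and why the loop above has to call inotifywait again on each iteration to watch the new file.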