glerchundi opened this issue 4 years ago
Coming from https://github.com/hashicorp/vault/issues/9711.
Would really like to have this feature, or maybe a guide on how to do this.
Would like to do this outside of K8s, on a bare-metal standalone cluster.
this would be great; have any of you guys tried doing this but with cert-manager to manage the Let's Encrypt secrets?
We're using Vault this way actually, and it works well. The only problem we have is that when cert-manager renews the certificate, we have no way of notifying Vault so it reloads it from the secret volume mount. We're looking for a clean solution to accomplish that atm. Any ideas?
I have been thinking about this today and came up with a possible solution that could lead to an auto-managed deployment of Vault with Let's Encrypt:
- cert-manager using DNS01, because we don't want to deal with either HTTP01 or TLS01; both challenge methods would require some kind of proxy.
- cert-manager-csi for an up-to-date certificate volume.
- shareProcessNamespace plus a sidecar that notifies Vault (SIGHUP) whenever it detects the certificate has changed: https://github.com/jetstack/cert-manager/issues/2929#issuecomment-653687475 or https://github.com/kubernetes/kubernetes/issues/24957#issuecomment-632881383
Although this only exists on paper, I'm pretty confident it could work.
I'm a little bit worried about shareProcessNamespace and its security implications though.
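For reference, the cert-manager side of this could look roughly like the following sketch. The solver (Route53), email, region and resource names are illustrative assumptions, not something from this thread:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@corpdomain.com          # assumption: replace with a real contact
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - dns01:
          route53:                     # assumption: any DNS01 solver works here
            region: eu-west-1
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: vault-lets-encrypt
spec:
  secretName: vault-lets-encrypt       # mounted into the Vault pods
  dnsNames:
    - vault.corpdomain.com
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
```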
WDYT? /cc @jasonodonnell
@iuriaranda would you mind adding more info.?
- Which challenge method are you using?
- If it's not the DNS-based one, how do you solve the proxying problem with HTTP/TLS? Do you put an Envoy/Nginx in front of Vault?
- How do you distinguish internal traffic from external? I mean, did you need to add a specific/static IP address to the config the way I explained?
Thanks ;)
Sorry for the late response, I missed your comment.
We're using the DNS challenge with cert-manager. For some of our setups, we still have a public LB in front of Vault to allow for external connections though.
External traffic goes through the load balancer, and cluster workloads can still reach Vault via the k8s service. We don't configure any static IP; Vault is deployed with the default listener, which afaik listens on 0.0.0.0.
Thanks for your response @iuriaranda.
I assume that you're not using Integrated Storage, right? This is probably happening to us because we want both at the same time: a Let's Encrypt-protected public address and TLS-protected internal communication.
Why two different certs? Because Let's Encrypt only provides the certificate to be used with our corp domain: vault.corpdomain.com. The internal TLS cert instead includes all the required SANs: vault-0.vault-internal, vault-1.vault-internal, ...
So we can define our config as follows through the ha.raft.config parameter:
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_key_file = "/vault/userconfig/vault-tls/vault-key.pem"
  tls_cert_file = "/vault/userconfig/vault-tls/vault.pem"
  tls_client_ca_file = "/vault/userconfig/vault-tls/ca.pem"
  tls_disable_client_certs = true
}

storage "raft" {
  path = "/vault/data"

  retry_join {
    leader_api_addr = "https://vault-0.vault-internal:8200"
    leader_client_key_file = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr = "https://vault-1.vault-internal:8200"
    leader_client_key_file = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr = "https://vault-2.vault-internal:8200"
    leader_client_key_file = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr = "https://vault-3.vault-internal:8200"
    leader_client_key_file = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file = "/vault/userconfig/vault-tls/ca.pem"
  }
  retry_join {
    leader_api_addr = "https://vault-4.vault-internal:8200"
    leader_client_key_file = "/vault/userconfig/vault-tls/vault-key.pem"
    leader_client_cert_file = "/vault/userconfig/vault-tls/vault.pem"
    leader_ca_cert_file = "/vault/userconfig/vault-tls/ca.pem"
  }
}
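Incidentally, cert-manager could also issue that internal cert with the required SANs from a private CA issuer, which would keep both certs under one tool. A sketch only; the issuer name vault-internal-ca is a made-up assumption:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: vault-internal
spec:
  secretName: vault-tls                # mounted at /vault/userconfig/vault-tls
  dnsNames:
    - vault-0.vault-internal
    - vault-1.vault-internal
    - vault-2.vault-internal
    - vault-3.vault-internal
    - vault-4.vault-internal
  issuerRef:
    name: vault-internal-ca            # assumption: a CA-type Issuer in-cluster
    kind: Issuer
```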
The idea would be to create another listener stanza dedicated to serving public requests, for example:
listener "tcp" {
  address = "0.0.0.0:8202"
  tls_key_file = "/vault/userconfig/vault-lets-encrypt/vault-key.pem"
  tls_cert_file = "/vault/userconfig/vault-lets-encrypt/vault.pem"
  tls_disable_client_certs = true
}
Note that 8202 is used as the listening port, with the Let's Encrypt-provided certificates.
Then configure ui-service.yaml to have a custom targetPort, different from the default 8200, which is currently hardcoded: https://github.com/hashicorp/vault-helm/blob/master/templates/ui-service.yaml#L28
WDYT @jasonodonnell, would you be open to a PR to customize that ui-service.yaml targetPort through another values.yaml parameter?
/cc @dcanadillas
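The values.yaml change being asked for might look like this. This is a sketch assuming the chart exposes a targetPort parameter under ui, which is exactly what doesn't exist yet:

```yaml
ui:
  enabled: true
  # assumption: a new parameter pointing the UI service
  # at the Let's Encrypt listener instead of the default 8200
  targetPort: 8202
```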
In an ideal world, the workaround I proposed here (cert-manager-csi + inotify) could make it (somehow) into Vault core, in the same vein as service_registration "kubernetes" {}.
I'm envisioning something like this:
listener "tcp" {
  address = "0.0.0.0:8202"
  tls_key_k8s_secret = "vault-lets-encrypt"
}
I created an issue in Vault for a feature request I feel the core team could be willing to accept: https://github.com/hashicorp/vault/issues/10615
this would be great; have any of you guys tried doing this but with cert-manager to manage the Let's Encrypt secrets?
We're using Vault this way actually, and it works well. The only problem we have is that when cert-manager renews the certificate, we have no way of notifying Vault so it reloads it from the secret volume mount. We're looking for a clean solution to accomplish that atm. Any ideas?
Something like the following would work:
#!/bin/sh
set -eu

cert_checksum="$(sha256sum /tls/tls.crt | awk '{ print $1 }')"
echo "$(date -u) — Current checksum '$cert_checksum'"

inotifywait -q -m /tls |
  while read -r path action file; do
    if [ "$file" = "tls.crt" ]; then
      new_cert_checksum="$(sha256sum /tls/tls.crt | awk '{ print $1 }')"
      if [ "$cert_checksum" != "$new_cert_checksum" ]; then
        echo "$(date -u) — New checksum '$new_cert_checksum'"
        cert_checksum="$new_cert_checksum"
        vault_pid=$(cat /pids/vault_server.pid)
        echo "$(date -u) — Sending SIGHUP signal to Vault (pid=$vault_pid)"
        kill -SIGHUP "$vault_pid"
      fi
    fi
  done
Something like the following would work
I've made a container image based on this pattern, but I removed inotifywait because it was triggering constantly on OPEN, ACCESS, CLOSE_NOWRITE and CLOSE, but didn't actually trigger on modification events (something to do with secrets mounting I suppose). This caused the sha256sum call to be running almost constantly.
This version just checks the sha256sum of the cert each minute and reloads Vault using killall, instead of having to find the PID from somewhere.
#!/bin/sh
set -e

cert_path="$1"
if [ -z "$cert_path" ]; then
  echo "Must include path to cert as first argument"
  exit 1
fi
set -u

cert_hash="$(sha256sum "$cert_path" | awk '{ print $1 }')"
echo "$(date -u) - Current checksum '$cert_hash'"

while true; do
  new_cert_hash="$(sha256sum "$cert_path" | awk '{ print $1 }')"
  if [ "$cert_hash" != "$new_cert_hash" ]; then
    echo "$(date -u) - New checksum '$new_cert_hash'"
    cert_hash="$new_cert_hash"
    echo "$(date -u) - Sending SIGHUP signal to Vault"
    killall -SIGHUP vault
  fi
  sleep 60
done
Here's how I've integrated it into the helm chart:
extraContainers:
  - name: cert-watcher
    image: ghcr.io/flyte/docker-vault-cert-reloader:1.0.4
    args:
      - /var/run/secrets/vault-tls/tls.crt
    volumeMounts:
      - name: vault-tls
        mountPath: /var/run/secrets/vault-tls
        readOnly: true
shareProcessNamespace: true
Git repo here: https://github.com/flyte/docker-vault-cert-reloader
I tried the workaround from @flyte and it works as I expected. Thanks.
@iuriaranda can you provide a bit more detail on how you accomplished this https://github.com/hashicorp/vault-helm/issues/385#issuecomment-749401401, particularly how you configured the cert request to cert-manager? Are you using the same cert from cert-manager for both internal (within the cluster) and external traffic? Or are you using 2 separate certs because of Vault's SAN requirements? Do you mind sharing your Vault Helm values file?
My issue is that I have a single Vault cluster on a separate k8s/AWS EKS instance providing services to several separate k8s/AWS EKS clusters, and I need a signed cert for both external (UI) and internal (API) communications. The separate k8s clusters use the vault-agent-injector to communicate with the Vault instance over HTTPS (e.g. api.vault.example.com), and the UI is accessible over HTTPS as well (ui.vault.example.com). I'm currently using a self-signed cert, but this does not work for the API in k8s 1.21 using short-lived tokens (https://www.vaultproject.io/docs/auth/kubernetes#kubernetes-1-21), failing with the error below.
x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"
Hi @glerchundi, I see that your PR https://github.com/hashicorp/vault-helm/pull/437 got merged. Were you able to get your suggested configuration https://github.com/hashicorp/vault-helm/issues/385#issuecomment-749560213 working?
I was able to resolve my issue. For anyone coming across this: I created separate Vault listeners for the Vault UI and API, and I'm using Let's Encrypt for the cert. I run k8s on AWS EKS and manually edited the vault-active service to add the additional 8203 port so it can persist on the AWS NLB.
I haven't gotten this https://github.com/hashicorp/vault-helm/issues/385#issuecomment-1157883999 to work yet, but that's next. For the internal Vault listener, I'm still using a self-signed generated cert. Below is the HA config that is currently working.
ha:
  enabled: true
  replicas: 3
  raft:
    enabled: true
    setNodeId: true
    config: |
      ui = true

      # listener for the vault cluster
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
        tls_disable = "false"
        tls_disable_client_certs = "true"
        tls_require_and_verify_client_cert = "false"
      }

      storage "raft" {
        path = "/vault/data"
        retry_join {
          leader_api_addr = "https://vault-0.vault-internal:8200"
          leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
          leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        }
        retry_join {
          leader_api_addr = "https://vault-1.vault-internal:8200"
          leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
          leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        }
        retry_join {
          leader_api_addr = "https://vault-2.vault-internal:8200"
          leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
          leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        }
      }

      # external listener for the vault UI
      listener "tcp" {
        address = "0.0.0.0:8202"
        tls_key_file = "/vault/userconfig/vault-server-tls-letsencrypt/tls.key"
        tls_cert_file = "/vault/userconfig/vault-server-tls-letsencrypt/tls.crt"
        tls_disable = false
        tls_disable_client_certs = "true"
        tls_require_and_verify_client_cert = "false"
      }

      # external listener for the vault API
      listener "tcp" {
        address = "[::]:8203"
        tls_cert_file = "/vault/userconfig/vault-server-tls-letsencrypt/tls.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls-letsencrypt/tls.key"
        tls_disable = "false"
        tls_disable_client_certs = "true"
        tls_require_and_verify_client_cert = "false"
      }

      service_registration "kubernetes" {}
Just a note @flyte, something like this with inotifywait works for us:
set -e
while inotifywait -e delete,delete_self /vault/userconfig/vault-server-tls/tls.crt /vault/userconfig/vault-ui-tls/tls.crt; do
  echo "Cert changed; reloading Vault"
  kill -HUP "$(pidof vault)"
done
Volume mounts get re-linked which triggers DELETE_SELF.
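That re-link behavior can be reproduced outside Kubernetes: the kubelet keeps Secret data in a timestamped directory and atomically swaps a ..data symlink to point at the new one, so a watch on the old target fires DELETE_SELF while readers of the stable path see the new content. A minimal sketch (directory names are illustrative, not what the kubelet actually uses):

```shell
#!/bin/sh
set -eu
demo="$(mktemp -d)"

# Two "timestamped" data directories, as the kubelet lays them out
mkdir -p "$demo/..data_old" "$demo/..data_new"
echo "old cert" > "$demo/..data_old/tls.crt"
echo "new cert" > "$demo/..data_new/tls.crt"

# tls.crt is a stable symlink going through the ..data indirection
ln -sfn ..data_old "$demo/..data"
ln -s ..data/tls.crt "$demo/tls.crt"

# "Renewal": swap ..data to the new directory; a watcher on the old
# target gets DELETE_SELF, while readers of tls.crt see the new content
ln -sfn ..data_new "$demo/..data"
cat "$demo/tls.crt"
```

This is why watching the file path itself (rather than the directory) with delete/delete_self events is enough to catch renewals.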
Is your feature request related to a problem? Please describe.
If Vault is secured end-to-end with self-signed certificates (which seems to be the most common way of deploying it), anyone who is going to access Vault needs the CA pubkey to verify the authenticity of the certificate and avoid MiTM attacks.
This requires a method to provision this CA in a secure way, which is not completely trivial. It can be done by putting it in a place where it is already trusted/verified, like the company website: https://mycompany.com/ca.pem.

Describe the solution you'd like
Ideally I would like to spare the people administrating Vault the hassle of trusting this CA. I don't know if it's even possible, but here is my proposal:
- Expose a public listener: listener "tcp" { address = "1.2.3.4:8200" }.
- api_addr & cluster_addr would use internal Kubernetes pod IP addresses.
- Use Let's Encrypt certificates in the listener, listener "tcp" { tls_{cert,key}_file = "/cert/from/letsencrypt-acme.{cert,key}" }, via cert-manager, and add support for automatic retrieval & renewal.

Describe alternatives you've considered
Providing a custom CA cert out of band.