Open cheeseburgermotivated opened 6 months ago
Hi there - I asked our resident Vault/Consul expert and she said that if you're in need of workarounds, this might help. In the meantime I'll bring this to our engineering teams as well. Thanks!
Thanks for the update. We kinda-sorta did something similar to recover. We turned off verify_incoming for the consul server nodes, requested a new global-management token from Vault, logged in to the UI with that token, dug around in the issued tokens until we happened to find the token that was initially issued for bootstrap, put that token at a path in Vault, and then use data.vault_generic_secret to pull that token from Vault for use within the vault_consul_secret_backend. Then we turned verify_incoming back on. However, I do like your idea better so I'll play with it in our lab.
Thanks for bringing this up to the engineering team as well as the Vault/Consul expert. I'll remain subscribed to this issue for future updates.
Describe the bug Rotating the client certificate/key used in a Consul secret backend for Vault does not seem possible when Vault itself is used to bootstrap the Consul ACL system. The
/{mount}/config/access
endpoint which is used to set the certificate/key will overwrite all fields, so when we send an updated cert without the bootstrap token, Vault assumes that we want to bootstrap the ACL system of the associated Consul cluster, which fails since Consul says No. We don't have the token because Vault swallows it when it issues the bootstrap command to Consul, and we shouldn't need to know it since Vault is in charge of the process and should have it stored somewhere.This failure looks like:
The above did not actually result in a failure immediately, however. The old certificate was deleted, the new one was generated, but the consul secrets engine was not yet aware of the change. After the certificate actually expired, communication between Consul and Vault stopped. This failure looked like so:
To Reproduce Steps to reproduce the behavior:
Create docker network a.
docker network create --driver bridge bstok
Launch Vault container a.
docker run -dit --name vault.superveryreallydefinitelybogus.com --network bstok -p 8200:8200 --cap-add=IPC_LOCK -e 'VAULT_DEV_ROOT_TOKEN_ID=myroot' -e 'VAULT_DEV_LISTEN_ADDRESS=0.0.0.0:8200' hashicorp/vault
Create sample PKI infrastructure using Terraform a.
export VAULT_ADDR=http://localhost:8200
b.export VAULT_TOKEN=myroot
c.mkdir -p bstok/terraform bstok/docker/consul/config/tls
d.provider "vault" { address = "http://localhost:8200" }
variable "base_domain" { type = string description = "The domain name the CA will issue certificates for" default = "superveryreallydefinitelybogus.com" }
root CA
resource "vault_mount" "pki_root" { path = "pki_root" type = "pki" description = "This is an example PKI root"
max_lease_ttl_seconds = 315360000 #10y }
resource "vault_pki_secret_backend_root_cert" "root" { backend = vault_mount.pki_root.path type = "internal" common_name = var.base_domain ttl = "87600h" #10y }
resource "vault_pki_secret_backend_config_urls" "config_urls" { backend = vault_mount.pki_root.path issuing_certificates = ["http://localhost:8200/v1/pki/ca"] crl_distribution_points = ["http://localhost:8200/v1/pki/crl"] }
intermediate CA
resource "vault_mount" "pki_intermediate" { path = "pki_intermediate" type = "pki" description = "This is an example PKI intermediate"
max_lease_ttl_seconds = 15780000 #5y }
resource "vault_pki_secret_backend_intermediate_cert_request" "intermediate_request" { backend = vault_mount.pki_intermediate.path type = "internal" common_name = "${var.base_domain} Intermediate Authority" }
resource "vault_pki_secret_backend_root_sign_intermediate" "signed_intermediate" { backend = vault_mount.pki_root.path csr = vault_pki_secret_backend_intermediate_cert_request.intermediate_request.csr common_name = vault_pki_secret_backend_intermediate_cert_request.intermediate_request.common_name }
resource "vault_pki_secret_backend_intermediate_set_signed" "set_signed" { backend = vault_mount.pki_intermediate.path certificate = vault_pki_secret_backend_root_sign_intermediate.signed_intermediate.certificate }
roles
resource "vault_pki_secret_backend_role" "server_role" { backend = vault_mount.pki_intermediate.path name = "server_role" max_ttl = 259200 #72h
allowed_domains = [var.base_domain] allowed_uri_sans = ["server.dc1.consul"] allow_any_name = false allow_glob_domains = true allow_ip_sans = true allow_subdomains = true enforce_hostnames = true
client_flag = false server_flag = true }
client certs
resource "vault_pki_secret_backend_cert" "vault-consul" { backend = vault_mount.pki_intermediate.path name = vault_pki_secret_backend_role.server_role.name
common_name = "consul.${var.base_domain}" ttl = 3000 min_seconds_remaining = 2400 auto_renew = true }
resource "vault_pki_secret_backend_cert" "consul-server" { backend = vault_mount.pki_intermediate.path name = vault_pki_secret_backend_role.server_role.name
common_name = "consul.${var.base_domain}" ttl = 28800 min_seconds_remaining = 14400 auto_renew = true }
consul secrets engine
resource "vault_consul_secret_backend" "consul" {
path = "consul"
address = "consul.superveryreallydefinitelybogus.com:8501"
scheme = "https"
bootstrap = true
ca_cert = vault_pki_secret_backend_cert.vault-consul.ca_chain
client_cert = join("\n", [
vault_pki_secret_backend_cert.vault-consul.certificate,
vault_pki_secret_backend_cert.vault-consul.ca_chain
])
client_key = vault_pki_secret_backend_cert.vault-consul.private_key
}
EOF
cat << EOF > docker/consul/config/config.json { "server": true, "log_level": "DEBUG", "node_name": "consul-docker", "tls": { "defaults": { "ca_file": "/consul/config/tls/ca.crt", "cert_file": "/consul/config/tls/certcombined.crt", "key_file": "/consul/config/tls/cert.key", "verify_incoming": true, "verify_outgoing": true } }, "acl": { "enabled": true, "default_policy": "allow" }, "ports": { "https": 8501, "grpc_tls": 8503 }, "encrypt": "pmsKacTdVOb4x8/Vtr9PWw==" } EOF
vault_consul_secret_backend.consul: Creating... vault_consul_secret_backend.consul: Creation complete after 0s [id=consul]
2024-04-26 11:07:45 2024-04-26T15:07:45.084Z [INFO] agent.server.acl: ACL bootstrap completed 2024-04-26 11:07:45 2024-04-26T15:07:45.087Z [DEBUG] agent.http: Request finished: method=PUT url=/v1/acl/bootstrap from=172.20.0.2:55116 latency=6.781417ms 2024-04-26 11:08:48 2024-04-26T15:08:48.791Z [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth