Is your feature request related to a problem? Please describe.
Cluster communication timeouts are too tight by default when bootstrapping a Vault cluster with Helm on Kubernetes, e.g. when running vault operator init.
Describe the solution you'd like
EKS 1.24 (upgraded from 1.21) appears to add network latency within the cluster: https://support.hashicorp.com/hc/en-us/articles/8552873602451-Vault-on-Kubernetes-and-context-deadline-exceeded-errors
What is interesting is that this isn't a default setting in the Helm chart (which it arguably should be) to account for the increased latency between versions. Adding the setting below fixed the issue entirely, and when Vault gets unsealed, the keys are properly output to the CLI without a timeout.
set {
  name  = "server.extraEnvironmentVars.VAULT_CLIENT_TIMEOUT"
  value = "300s"
}
I am thinking that increasing the timeout may help account for network latency in Kubernetes/EKS.
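For anyone setting values via a file rather than Terraform, the equivalent in values.yaml form would look like this (a sketch assuming the standard layout of the HashiCorp Vault chart, where server.extraEnvironmentVars is a map of plain environment variables injected into the server pods):

```yaml
server:
  extraEnvironmentVars:
    # Client-side request timeout for the vault CLI/API; "300s" is a Go-style duration.
    VAULT_CLIENT_TIMEOUT: 300s
```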
Full chart settings that worked:
resource "helm_release" "vault" {
  name       = "vault"
  repository = "https://helm.releases.hashicorp.com"
  chart      = "vault"
  namespace  = "vault"

  set {
    name  = "server.ha.enabled"
    value = "true"
  }

  set {
    name  = "server.ha.raft.enabled"
    value = "true"
  }

  set {
    name  = "server.ha.raft.setNodeId"
    value = "true"
  }

  set {
    name  = "server.extraEnvironmentVars.VAULT_CLIENT_TIMEOUT"
    value = "300s"
  }

  set {
    name  = "server.ha.raft.config"
    value = <<EOT
ui = true
listener "tcp" {
  tls_disable     = 1
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "raft" {
  path = "/vault/data"
}
service_registration "kubernetes" {}
EOT
  }
}
Thanks.
Describe alternatives you've considered
n/a
Additional context
It took me a while to find the root cause, which I traced through the Go error message "context deadline exceeded"; that in turn led me to look at ways to increase the timeout value.
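As a side note on the value format: VAULT_CLIENT_TIMEOUT takes a Go-style duration string such as "300s" or "5m". The sketch below is a hypothetical, simplified parser written just to illustrate how such strings map to seconds; it is not Vault's actual parsing code.

```python
import re

def parse_go_duration(s: str) -> float:
    """Convert a simplified Go-style duration ('300s', '5m', '1h30m') to seconds."""
    units = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", s)
    # Reject inputs with leftover characters the pattern did not consume.
    if not parts or "".join(n + u for n, u in parts) != s:
        raise ValueError(f"invalid duration: {s!r}")
    return sum(float(n) * units[u] for n, u in parts)

print(parse_go_duration("300s"))   # 300.0
print(parse_go_duration("1h30m"))  # 5400.0
```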
Note this is just a suggestion or breadcrumb for others.