Closed gopisaba closed 4 years ago
Hi @gopisaba, I just deployed this on EKS in different namespaces, but could not reproduce what you're seeing.
Can you provide me with:
- the values that you deployed this with
- the output of kubectl describe service vault-agent-injector-svc -n infra-tools
global:
  enabled: true
  tlsDisable: false
injector:
  certs:
    secretName: vault-tls
server:
  auditStorage:
    accessMode: ReadWriteOnce
    enabled: true
    size: 10Gi
    storageClass: null
  authDelegator:
    enabled: true
  dataStorage:
    enabled: false
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-tls/tls.ca
  extraVolumes:
  - name: vault-tls
    type: secret
  ha:
    config: |
      ui = true
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-tls/tls.crt"
        tls_key_file = "/vault/userconfig/vault-tls/tls.key"
        tls_client_ca_file = "/vault/userconfig/vault-tls/tls.ca"
      }
      storage "dynamodb" {
        ha_enabled = "true"
        region = "eu-west-1"
        table = "vault-backend"
      }
      seal "awskms" {
        region = "eu-west-1"
        kms_key_id = "1ee6b01a-1d8a-4cfb-abcd-12bdc43ab8d2"
        endpoint = "https://vpce-01234567890-6abcdef.kms.eu-west-1.vpce.amazonaws.com"
      }
    enabled: true
    replicas: 3
  ingress:
    enabled: false
  nodeSelector: |
    nodeType: grp1
  standalone:
    enabled: false
ui:
  enabled: true
  serviceNodePort: 32582
  serviceType: NodePort
Kube Version = 1.14 (EKS)
✔ k describe svc vault-agent-injector-svc -n infra-tools
Name: vault-agent-injector-svc
Namespace: infra-tools
Labels: app.kubernetes.io/instance=vault
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=vault-agent-injector
Annotations: flux.weave.works/antecedent: infra-tools:helmrelease/vault
Selector: app.kubernetes.io/instance=vault,app.kubernetes.io/name=vault-agent-injector,component=webhook
Type: ClusterIP
IP: 172.20.174.199
Port: <unset> 443/TCP
TargetPort: 8080/TCP
Endpoints: 100.64.3.76:8080
Session Affinity: None
Events: <none>
@gopisaba could be the same problem I had https://github.com/hashicorp/vault-k8s/issues/46
@krep-dr - That's it. After opening port 8080 between the EKS cluster and the worker nodes, the mutation webhook started working.
Thanks for pointing me in the right direction.
@gopisaba what is the EKS cluster IP range? Or how can I find out the range? I do have the same issue. Thanks!
@DongshengXiong - Allowing traffic from the EKS cluster security group to the EKS worker node security group on port 8080 fixed the issue for me.
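For anyone following along, the rule described above can be sketched with the AWS CLI. This is only a minimal sketch, not the exact commands used in this thread: both security group IDs are hypothetical placeholders, and the `run` helper and `DRY_RUN` variable are illustrative names that print the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: allow the EKS control plane to reach the injector webhook on the worker nodes.
set -eu

# Hypothetical placeholder IDs; substitute the groups from your own cluster.
CLUSTER_SG="${CLUSTER_SG:-sg-0123456789abcdef0}"   # source: cluster (control plane) SG
NODE_SG="${NODE_SG:-sg-0fedcba9876543210}"         # target: worker node SG
WEBHOOK_PORT="${WEBHOOK_PORT:-8080}"               # injector port from this thread

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Inbound rule on the node SG: TCP 8080 from the cluster SG.
run aws ec2 authorize-security-group-ingress \
  --group-id "$NODE_SG" \
  --protocol tcp \
  --port "$WEBHOOK_PORT" \
  --source-group "$CLUSTER_SG"
```

Running it as-is only prints the command, which makes it easy to review the group IDs before applying the rule with `DRY_RUN=0`.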
@gopisaba thanks for your reply. Actually, I am using the Weave Net CNI. My issue was fixed by this solution: https://github.com/hashicorp/vault-k8s/issues/72
Hi @DongshengXiong what specifically did you change on the EKS security group? Did you use eksctl to set up your cluster? If so, which security group did you change and which security group was the source for the inbound rule?
I know it's been a while since this was asked and you probably know the answer by now, but for anyone else: there are two security groups you will need to change, one for inbound and one for outbound.
1) There should be a security group named something like "
Read comments on inbound and outbound security rules to figure out which group is used for what.
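If it is unclear which groups are involved, the AWS CLI can list the security groups attached to the cluster and show the rule descriptions on a group. A rough sketch under assumptions: the cluster name and node group ID are hypothetical placeholders, the `run` helper and `DRY_RUN` variable are illustrative, and which group actually acts as the source depends on how the cluster was provisioned.

```shell
#!/bin/sh
# Sketch: look up the security groups attached to an EKS cluster.
set -eu

CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"     # hypothetical cluster name
NODE_SG="${NODE_SG:-sg-0fedcba9876543210}"     # hypothetical worker node SG

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Cluster security group created and managed by EKS.
run aws eks describe-cluster --name "$CLUSTER_NAME" \
  --query cluster.resourcesVpcConfig.clusterSecurityGroupId --output text

# Additional security groups attached to the control plane, if any.
run aws eks describe-cluster --name "$CLUSTER_NAME" \
  --query cluster.resourcesVpcConfig.securityGroupIds --output text

# Existing inbound rules (including their descriptions) on the node SG.
run aws ec2 describe-security-groups --group-ids "$NODE_SG" \
  --query "SecurityGroups[0].IpPermissions"
```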
I ran into this issue the other day when using Terraform to deploy the terraform-aws-modules/eks/aws module ("eks module") and wanted to share my fix, in the hope that the next person doing this will find it helpful.
When defining the EKS module, you need to add the following node_security_group_additional_rules:
node_security_group_additional_rules = {
  ingress_vault_injector_webhook = {
    description                   = "Access to Vault Agent Injector webhook endpoint from API server"
    protocol                      = "tcp"
    from_port                     = 8080
    to_port                       = 8080
    type                          = "ingress"
    source_cluster_security_group = true
  }
}
This solution works well in the EKS cluster. Thanks to @kschoche!
kube-apiserver responses show high latency for operations including rollout restart, pod termination, ContainerCreating, etc.
kube-apiserver error log repeated in CloudWatch Logs:
E0610 20:50:30.214031 10 dispatcher.go:214] failed calling webhook "vault.hashicorp.com": failed to call webhook: Post "https://vault-agent-injector-svc.vault.svc:443/mutate?timeout=30s": context deadline exceeded
enabled
The vault-agent-injector pod responds to MutatingWebhook calls through the MutatingWebhookConfiguration named vault-agent-injector-cfg and typically uses port tcp/8080.
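The webhook configuration and the port the service forwards to can be checked with kubectl. A minimal sketch, assuming the injector runs in the `vault` namespace (taken from the service URL in the error log; adjust if yours differs). The `run` helper and `DRY_RUN` variable are illustrative names that print each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: inspect the injector webhook configuration and its backing service.
set -eu

NAMESPACE="${NAMESPACE:-vault}"   # assumption: namespace of the injector

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Webhook registration that kube-apiserver uses to call the injector.
run kubectl get mutatingwebhookconfiguration vault-agent-injector-cfg -o yaml

# Pod port the service forwards to; this is the port the node SG must allow.
run kubectl get service vault-agent-injector-svc -n "$NAMESPACE" \
  -o "jsonpath={.spec.ports[0].targetPort}"

# Pod IP and port currently backing the service.
run kubectl get endpoints vault-agent-injector-svc -n "$NAMESPACE"
```

In dry-run mode the jsonpath argument is printed without quotes, so quote it again before pasting the command into an interactive shell.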
In the official vault Helm chart, the values related to the vault-agent-injector pod are as follows:
# vault-helm/values.yaml
injector:
  # True if you want to enable vault agent injection.
  # @default: global.enabled
  enabled: "-"
  replicas: 1
  # Configures the port the injector should listen on
  port: 8080
So add an inbound rule to the worker node security group (SG) to allow TCP 8080 with the Control Plane as the source.
---
title: Kubernetes architecture (EKS v1.30)
---
flowchart LR
  subgraph Control plane
    C["kube-apiserver"]
  end
  S["vault-agent-injector-svc"]
  subgraph Worker node
    P["vault-agent-injector"]
  end
  C --"tcp/443"--> S:::blue -. tcp/8080 .-> P
  classDef blue stroke:#00f
Add an inbound rule for TCP port 8080 to the node_security_group_additional_rules value provided by the EKS module.
module "eks" {
  # ... truncated ...
  node_security_group_additional_rules = {
    ingress_vault_agent_injector_mutating_webhook = {
      description                   = "Allow ingress mutating webhook traffic from kube-apiserver to vault-agent-injector pod"
      protocol                      = "tcp"
      from_port                     = 8080
      to_port                       = 8080
      type                          = "ingress"
      source_cluster_security_group = true
    }
    # Similar case for linkerd-viz tap pod's api service
    ingress_linkerd_viz_tap_api = {
      description                   = "Allow ingress api calling traffic from kube-apiserver to linkerd-viz tap pod"
      protocol                      = "tcp"
      from_port                     = 8088
      to_port                       = 8089
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }
  # ... truncated ...
}
Similar case: Linkerd-Viz Tap FailedDiscoveryCheck while Running on EKS
I am using the latest Vault Helm chart. The mutation webhook is failing to inject the vault-agent and consul-template sidecars.
Error messages on EKS api-server logs
I don't see any other error messages on the vault or vault-agent-injector pods. I am able to resolve and connect to vault-agent-injector-svc from a test pod in a different namespace.
vault svc