Before attempting to use host networking, can you post the events (kubectl describe) for the deployments (not the pods) after rolling them out, to see if there's any info about why they didn't get injected? Also, the events for the injector pod and its logs might prove useful.
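In case it helps, the commands could look something like this (a minimal sketch; the deployment name and namespace are placeholders, and the proxy-injector is selected via its standard control-plane label):
# Events on the deployments themselves (not the pods)
kubectl describe deploy <your-deployment> -n <your-namespace>
# Events and logs for the injector pod
kubectl describe pod -n linkerd -l linkerd.io/control-plane-component=proxy-injector
kubectl logs -n linkerd deploy/linkerd-proxy-injector -c proxy-injector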
Thank you for your answer @alpeb !
user@ip-10-x-x-65 ~ $ k logs linkerd-proxy-injector-55f86f4fc9-tsmgc -n linkerd
Defaulted container "linkerd-proxy" out of: linkerd-proxy, proxy-injector, linkerd-init (init)
[ 0.095648s] INFO ThreadId(01) linkerd2_proxy: release 2.224.0 (d91421a) by linkerd on 2024-03-28T18:07:05Z
[ 0.099989s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[ 0.101281s] INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[ 0.101298s] INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[ 0.101302s] INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[ 0.101305s] INFO ThreadId(01) linkerd2_proxy: Tap interface on 0.0.0.0:4190
[ 0.101309s] INFO ThreadId(01) linkerd2_proxy: SNI is linkerd-proxy-injector.linkerd.serviceaccount.identity.linkerd.cluster.local
[ 0.101312s] INFO ThreadId(01) linkerd2_proxy: Local identity is linkerd-proxy-injector.linkerd.serviceaccount.identity.linkerd.cluster.local
[ 0.101315s] INFO ThreadId(01) linkerd2_proxy: Destinations resolved via linkerd-dst-headless.linkerd.svc.cluster.local:8086 (linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.104250s] INFO ThreadId(01) policy:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}: linkerd_pool_p2c: Adding endpoint addr=10.0.2.118:8090
[ 0.195414s] INFO ThreadId(01) dst:controller{addr=linkerd-dst-headless.linkerd.svc.cluster.local:8086}: linkerd_pool_p2c: Adding endpoint addr=10.0.2.118:8086
[ 0.202508s] INFO ThreadId(02) identity:identity{server.addr=linkerd-identity-headless.linkerd.svc.cluster.local:8080}:controller{addr=linkerd-identity-headless.linkerd.svc.cluster.local:8080}: linkerd_pool_p2c: Adding endpoint addr=10.0.31.152:8080
[ 0.315761s] INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity id=linkerd-proxy-injector.linkerd.serviceaccount.identity.linkerd.cluster.local
user@ip-10-x-x-65 ~ $ k logs linkerd-proxy-injector-55f86f4fc9-tsmgc -n linkerd -c proxy-injector
time="2024-04-25T11:25:20Z" level=info msg="running version edge-24.3.5"
time="2024-04-25T11:25:20Z" level=info msg="starting admin server on :9995"
time="2024-04-25T11:25:20Z" level=info msg="waiting for caches to sync"
time="2024-04-25T11:25:20Z" level=info msg="listening at :8443"
time="2024-04-25T11:25:20Z" level=info msg="caches synced"
user@ip-10-x-x-65 ~ $ k logs linkerd-proxy-injector-55f86f4fc9-tsmgc -n linkerd -c linkerd-init
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy-save -t nat"
time="2024-04-25T11:25:12Z" level=info msg="# Generated by iptables-save v1.8.10 on Thu Apr 25 11:25:12 2024\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\nCOMMIT\n# Completed on Thu Apr 25 11:25:12 2024\n"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -N PROXY_INIT_REDIRECT"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_REDIRECT -p tcp --match multiport --dports 4190,4191,4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4190,4191,4567,4568/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143 -m comment --comment proxy-init/redirect-all-incoming-to-proxy-port/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PREROUTING -j PROXY_INIT_REDIRECT -m comment --comment proxy-init/install-proxy-init-prerouting/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -N PROXY_INIT_OUTPUT"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -j RETURN -m comment --comment proxy-init/ignore-proxy-user-id/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN -m comment --comment proxy-init/ignore-loopback/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -p tcp --match multiport --dports 443,6443 -j RETURN -m comment --comment proxy-init/ignore-port-443,6443/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140 -m comment --comment proxy-init/redirect-all-outgoing-to-proxy-port/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy -t nat -A OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-output/1714044312"
time="2024-04-25T11:25:12Z" level=info msg="/sbin/iptables-legacy-save -t nat"
time="2024-04-25T11:25:12Z" level=info msg="# Generated by iptables-save v1.8.10 on Thu Apr 25 11:25:12 2024\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\n:PROXY_INIT_OUTPUT - [0:0]\n:PROXY_INIT_REDIRECT - [0:0]\n-A PREROUTING -m comment --comment \"proxy-init/install-proxy-init-prerouting/1714044312\" -j PROXY_INIT_REDIRECT\n-A OUTPUT -m comment --comment \"proxy-init/install-proxy-init-output/1714044312\" -j PROXY_INIT_OUTPUT\n-A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -m comment --comment \"proxy-init/ignore-proxy-user-id/1714044312\" -j RETURN\n-A PROXY_INIT_OUTPUT -o lo -m comment --comment \"proxy-init/ignore-loopback/1714044312\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m multiport --dports 443,6443 -m comment --comment \"proxy-init/ignore-port-443,6443/1714044312\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m comment --comment \"proxy-init/redirect-all-outgoing-to-proxy-port/1714044312\" -j REDIRECT --to-ports 4140\n-A PROXY_INIT_REDIRECT -p tcp -m multiport --dports 4190,4191,4567,4568 -m comment --comment \"proxy-init/ignore-port-4190,4191,4567,4568/1714044312\" -j RETURN\n-A PROXY_INIT_REDIRECT -p tcp -m comment --comment \"proxy-init/redirect-all-incoming-to-proxy-port/1714044312\" -j REDIRECT --to-ports 4143\nCOMMIT\n# Completed on Thu Apr 25 11:25:12 2024\n"
And the events for the deployments
user@ip-10-x-x-65 ~ $ k describe deploy -n linkerd | grep Events
Events: <none>
Events: <none>
Events: <none>
user@ip-10-x-x-65 ~ $ k describe deploy -n goldilocks | grep Events
Events: <none>
Events: <none>
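For what it's worth, deployment events are often empty even when the webhook skips a pod; a broader look at namespace events might surface more detail (a hedged sketch using the namespaces shown above):
# Recent events across both namespaces, newest last
kubectl get events -n goldilocks --sort-by=.lastTimestamp
kubectl get events -n linkerd --sort-by=.lastTimestamp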
Also, can you post what you get from kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io linkerd-proxy-injector-webhook-config -oyaml?
Yes of course!
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  annotations:
    meta.helm.sh/release-name: linkerd-control-plane
    meta.helm.sh/release-namespace: linkerd
  labels:
    app.kubernetes.io/managed-by: Helm
    linkerd.io/control-plane-component: proxy-injector
    linkerd.io/control-plane-ns: linkerd
  name: linkerd-proxy-injector-webhook-config
webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    caBundle: $CABUNDLE
    service:
      name: linkerd-proxy-injector
      namespace: linkerd
      path: /
      port: 443
  failurePolicy: Ignore
  matchPolicy: Equivalent
  name: linkerd-proxy-injector.linkerd.io
  namespaceSelector:
    matchExpressions:
    - key: config.linkerd.io/admission-webhooks
      operator: NotIn
      values:
      - disabled
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
      - kube-system
      - cert-manager
  objectSelector:
    matchExpressions:
    - key: linkerd.io/control-plane-component
      operator: DoesNotExist
    - key: linkerd.io/cni-resource
      operator: DoesNotExist
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - pods
    - services
    scope: Namespaced
  sideEffects: None
  timeoutSeconds: 10
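One thing worth noting about the config above: failurePolicy: Ignore means the API server will silently skip injection if it cannot reach the webhook. A couple of hedged sanity checks against this config (the goldilocks namespace is taken from the earlier output):
# Make sure the target namespace is not excluded by the namespaceSelector
kubectl get ns goldilocks --show-labels
# Confirm the webhook service exists and has endpoints the API server can reach
kubectl get svc,endpoints linkerd-proxy-injector -n linkerd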
Any idea how I should continue? Thank you very much in advance!
Any clue? Thank you very much in advance!
Hi @gabbler97! Based on the output from linkerd check, it seems that your control plane is not healthy. Looking more closely at the control plane logs, I do see a lot of failures from the control plane components trying to connect to each other. I'd suggest using Cilium's observability tools (such as Hubble) to ensure that Cilium is allowing traffic between the control plane components.
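For example, a minimal sketch of such a check, assuming the Hubble CLI is installed and hubble-relay is reachable (e.g. via cilium hubble port-forward):
# Watch for flows dropped by Cilium that involve the linkerd namespace
hubble observe --namespace linkerd --verdict DROPPED --follow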
FWIW, I've successfully tested Linkerd with Cilium chained in hybrid mode with the AWS VPC CNI, and it worked fine. Looking forward to what you find out about the control plane connectivity issues.
Thank you very much for your help! In the meantime I have found another way to avoid IPv4 exhaustion. If somebody needs it in the future, it can be found here: https://aws.github.io/aws-eks-best-practices/networking/custom-networking/
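Roughly, that guide boils down to enabling custom networking on the VPC CNI and creating an ENIConfig per availability zone. The snippet below is only an illustrative sketch (subnet and security-group IDs are placeholders); see the linked guide for the authoritative steps:
# Enable custom networking on the aws-node (VPC CNI) daemonset
kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
kubectl set env daemonset aws-node -n kube-system ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
# Example ENIConfig for one AZ (IDs are placeholders)
cat <<EOF | kubectl apply -f -
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: eu-central-1a
spec:
  subnet: subnet-0123456789abcdef0
  securityGroups:
    - sg-0123456789abcdef0
EOF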
What is the issue?
Linkerd proxy injection does not work with custom CNI (cilium) on AWS EKS clusters.
How can it be reproduced?
Install cilium
Install linkerd
Annotate the namespace for automatic injection (see the sketch after these steps)
Delete the pods
Sidecar proxy should be injected and the last output should be
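As a hedged sketch, the annotate and pod-restart steps above might look like this (the namespace name my-app is a placeholder):
# Mark the namespace for automatic proxy injection
kubectl annotate namespace my-app linkerd.io/inject=enabled
# Recreate the pods so the webhook sees new CREATE events
kubectl rollout restart deploy -n my-app
# Verify the sidecar was added
kubectl get pods -n my-app -o jsonpath='{.items[*].spec.containers[*].name}'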
Logs, error output, etc
https://gist.github.com/gabbler97/6734dc908cf7136df49a8d2ba5e67eb9
Output of linkerd check -o short
Environment
Possible solution
No response
Additional context
I have tried the linkerd-proxy-injector with hostNetwork=true. In this case the proxy sidecar containers are injected automatically after a deployment rollout. However, some nodes became NotReady because the kubelet stopped posting status; after a while (about 10 minutes) this resolved automatically. My pods which interact with the kube API server started to CrashLoopBackOff, but only on one specific node at a time (the node where the linkerd-proxy-injector pod was running).
Inside the pod logs I found timeouts for API server requests.
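If it helps to dig into that symptom, a hedged set of checks might be (node, pod, and namespace names are placeholders):
# Node conditions around the NotReady window
kubectl get nodes
kubectl describe node <node-name> | grep -A 8 Conditions
# Events and previous logs from an affected pod
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous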
Would you like to work on fixing this bug?
None