Closed Nutties93 closed 1 month ago
@Nutties93 Happy new year!
So, first: linkerd-policy
is a headless Service, and headless Services don't get a ClusterIP -- so that part is expected.
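(A quick way to see that for yourself, assuming the default linkerd namespace -- a headless Service reports None for its CLUSTER-IP, and its endpoints are pod IPs rather than a virtual IP:)

```shell
# Headless Services show "None" under CLUSTER-IP; their endpoints are pod IPs.
kubectl get svc linkerd-policy -n linkerd
kubectl get endpoints linkerd-policy -n linkerd
```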
Beyond that, though, could we get a kubectl describe
from the linkerd-destination
pod, and the logs from its policy
container?
DPOD=$(kubectl get pods -n linkerd -l 'linkerd.io/control-plane-component=destination' -o jsonpath='{ .items[0].metadata.name }')
kubectl describe -n linkerd pod $DPOD
kubectl logs -n linkerd $DPOD -c policy
Thanks!! 🙂
Hey, I'm experiencing the same issue with k3s and Linkerd, installed on 3 EC2 nodes (1 master, 2 workers).
Here's the output of the commands you asked for, @kflynn:
DPOD=$(kubectl get pods -n linkerd -l 'linkerd.io/control-plane-component=destination' -o jsonpath='{ .items[0].metadata.name }')
kubectl describe -n linkerd pod $DPOD
kubectl logs -n linkerd $DPOD -c policy
Name: linkerd-destination-56db447bcf-klhfn
Namespace: linkerd
Priority: 0
Service Account: linkerd-destination
Node: ip-172-31-25-125/172.31.25.125
Start Time: Thu, 22 Feb 2024 10:27:31 +0100
Labels: linkerd.io/control-plane-component=destination
linkerd.io/control-plane-ns=linkerd
linkerd.io/proxy-deployment=linkerd-destination
linkerd.io/workload-ns=linkerd
pod-template-hash=56db447bcf
Annotations: checksum/config: bd31c9c8aacd5b84e1e057813a312b61e4dfe2b66407a211e383b47cc0f7860b
cluster-autoscaler.kubernetes.io/safe-to-evict: true
config.linkerd.io/default-inbound-policy: all-unauthenticated
linkerd.io/created-by: linkerd/cli stable-2.14.10
linkerd.io/proxy-version: stable-2.14.10
linkerd.io/trust-root-sha256: f36265f164549710dc03ae3e4898aa20c119fac16c4a79e3ef34b838cb119851
Status: Running
SeccompProfile: RuntimeDefault
IP: 10.42.1.13
IPs:
IP: 10.42.1.13
Controlled By: ReplicaSet/linkerd-destination-56db447bcf
Init Containers:
linkerd-init:
Container ID: containerd://4ad034606556e23333e9c695df744a1be54781cbff257f30db7aeb0813c441ca
Image: cr.l5d.io/linkerd/proxy-init:v2.2.3
Image ID: cr.l5d.io/linkerd/proxy-init@sha256:1075bc22a4a8f0852311dc84c9db0552f1245d07fe4fdebd4bc6cf4566bcbc76
Port: <none>
Host Port: <none>
SeccompProfile: RuntimeDefault
Args:
--incoming-proxy-port
4143
--outgoing-proxy-port
4140
--proxy-uid
2102
--inbound-ports-to-ignore
4190,4191,4567,4568
--outbound-ports-to-ignore
443,6443
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 22 Feb 2024 10:27:32 +0100
Finished: Thu, 22 Feb 2024 10:27:32 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 20Mi
Requests:
cpu: 100m
memory: 20Mi
Environment: <none>
Mounts:
/run from linkerd-proxy-init-xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j94nv (ro)
Containers:
linkerd-proxy:
Container ID: containerd://3632756adce57c439565c9156401402c84d878bbeb311d92ca495ecf67919b3e
Image: cr.l5d.io/linkerd/proxy:stable-2.14.10
Image ID: cr.l5d.io/linkerd/proxy@sha256:7876cee0717575ebc39d2b7cfd701e0df28a809bcb2cf4974716a0bce1ce32cb
Ports: 4143/TCP, 4191/TCP
Host Ports: 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
State: Waiting
Reason: PostStartHookError
Last State: Terminated
Reason: Error
Message: nection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 146.632250s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 147.133080s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 147.634038s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 148.134891s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 148.636371s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 149.138241s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 149.639188s] WARN ThreadId(01) watch{port=9990}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 150.017773s] WARN ThreadId(01) linkerd_app: Waiting for identity to be initialized...
Exit Code: 137
Started: Thu, 22 Feb 2024 10:27:33 +0100
Finished: Thu, 22 Feb 2024 10:30:03 +0100
Ready: False
Restart Count: 0
Liveness: http-get http://:4191/live delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:4191/ready delay=2s timeout=1s period=10s #success=1 #failure=3
Environment:
_pod_name: linkerd-destination-56db447bcf-klhfn (v1:metadata.name)
_pod_ns: linkerd (v1:metadata.namespace)
_pod_nodeName: (v1:spec.nodeName)
LINKERD2_PROXY_LOG: warn,linkerd=info,trust_dns=error
LINKERD2_PROXY_LOG_FORMAT: plain
LINKERD2_PROXY_DESTINATION_SVC_ADDR: localhost.:8086
LINKERD2_PROXY_DESTINATION_PROFILE_NETWORKS: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
LINKERD2_PROXY_POLICY_SVC_ADDR: localhost.:8090
LINKERD2_PROXY_POLICY_WORKLOAD: $(_pod_ns):$(_pod_name)
LINKERD2_PROXY_INBOUND_DEFAULT_POLICY: all-unauthenticated
LINKERD2_PROXY_POLICY_CLUSTER_NETWORKS: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
LINKERD2_PROXY_INBOUND_CONNECT_TIMEOUT: 100ms
LINKERD2_PROXY_OUTBOUND_CONNECT_TIMEOUT: 1000ms
LINKERD2_PROXY_OUTBOUND_DISCOVERY_IDLE_TIMEOUT: 5s
LINKERD2_PROXY_INBOUND_DISCOVERY_IDLE_TIMEOUT: 90s
LINKERD2_PROXY_CONTROL_LISTEN_ADDR: 0.0.0.0:4190
LINKERD2_PROXY_ADMIN_LISTEN_ADDR: 0.0.0.0:4191
LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR: 127.0.0.1:4140
LINKERD2_PROXY_INBOUND_LISTEN_ADDR: 0.0.0.0:4143
LINKERD2_PROXY_INBOUND_IPS: (v1:status.podIPs)
LINKERD2_PROXY_INBOUND_PORTS: 8086,8090,8443,9443,9990,9996,9997
LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES: svc.cluster.local.
LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE: 10000ms
LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE: 10000ms
LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION: 25,587,3306,4444,5432,6379,9300,11211
LINKERD2_PROXY_DESTINATION_CONTEXT: {"ns":"$(_pod_ns)", "nodeName":"$(_pod_nodeName)", "pod":"$(_pod_name)"}
_pod_sa: (v1:spec.serviceAccountName)
_l5d_ns: linkerd
_l5d_trustdomain: cluster.local
LINKERD2_PROXY_IDENTITY_DIR: /var/run/linkerd/identity/end-entity
LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS: <set to the key 'ca-bundle.crt' of config map 'linkerd-identity-trust-roots'> Optional: false
LINKERD2_PROXY_IDENTITY_TOKEN_FILE: /var/run/secrets/tokens/linkerd-identity-token
LINKERD2_PROXY_IDENTITY_SVC_ADDR: linkerd-identity-headless.linkerd.svc.cluster.local.:8080
LINKERD2_PROXY_IDENTITY_LOCAL_NAME: $(_pod_sa).$(_pod_ns).serviceaccount.identity.linkerd.cluster.local
LINKERD2_PROXY_IDENTITY_SVC_NAME: linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local
LINKERD2_PROXY_DESTINATION_SVC_NAME: linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
LINKERD2_PROXY_POLICY_SVC_NAME: linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
Mounts:
/var/run/linkerd/identity/end-entity from linkerd-identity-end-entity (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j94nv (ro)
/var/run/secrets/tokens from linkerd-identity-token (rw)
destination:
Container ID: containerd://ac08bbe37b8e439a9a50beb49e0ca0f6aa91ba4051e42758daf9147714f73704
Image: cr.l5d.io/linkerd/controller:stable-2.14.10
Image ID: cr.l5d.io/linkerd/controller@sha256:65bed6a346b259cb1ff04420ee296afa28c38cb3e789ce285e5987f039dddf45
Ports: 8086/TCP, 9996/TCP
Host Ports: 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Args:
destination
-addr=:8086
-controller-namespace=linkerd
-enable-h2-upgrade=true
-log-level=info
-log-format=plain
-enable-endpoint-slices=true
-cluster-domain=cluster.local
-identity-trust-domain=cluster.local
-default-opaque-ports=25,587,3306,4444,5432,6379,9300,11211
-enable-pprof=false
State: Running
Started: Thu, 22 Feb 2024 10:30:03 +0100
Ready: False
Restart Count: 0
Liveness: http-get http://:9996/ping delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9996/ready delay=0s timeout=1s period=10s #success=1 #failure=7
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j94nv (ro)
sp-validator:
Container ID: containerd://bf47e711ac359a520ae36274f3fa52fc72909d1b1328332a638566b246de9664
Image: cr.l5d.io/linkerd/controller:stable-2.14.10
Image ID: cr.l5d.io/linkerd/controller@sha256:65bed6a346b259cb1ff04420ee296afa28c38cb3e789ce285e5987f039dddf45
Ports: 8443/TCP, 9997/TCP
Host Ports: 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Args:
sp-validator
-log-level=info
-log-format=plain
-enable-pprof=false
State: Running
Started: Thu, 22 Feb 2024 10:30:03 +0100
Ready: False
Restart Count: 0
Liveness: http-get http://:9997/ping delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9997/ready delay=0s timeout=1s period=10s #success=1 #failure=7
Environment: <none>
Mounts:
/var/run/linkerd/tls from sp-tls (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j94nv (ro)
policy:
Container ID: containerd://1300c79bd7d067b94ba96c44d9f64963701252b21391f8afbd9c51c082e06cec
Image: cr.l5d.io/linkerd/policy-controller:stable-2.14.10
Image ID: cr.l5d.io/linkerd/policy-controller@sha256:763ccb8651e3ba93507732205c18b8dc2d15da94b4f5d04e8683c3f92f0c5ebe
Ports: 8090/TCP, 9990/TCP, 9443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Args:
--admin-addr=0.0.0.0:9990
--control-plane-namespace=linkerd
--grpc-addr=0.0.0.0:8090
--server-addr=0.0.0.0:9443
--server-tls-key=/var/run/linkerd/tls/tls.key
--server-tls-certs=/var/run/linkerd/tls/tls.crt
--cluster-networks=10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
--identity-domain=cluster.local
--cluster-domain=cluster.local
--default-policy=all-unauthenticated
--log-level=info
--log-format=plain
--default-opaque-ports=25,587,3306,4444,5432,6379,9300,11211
--probe-networks=0.0.0.0/0
State: Running
Started: Thu, 22 Feb 2024 10:30:03 +0100
Ready: False
Restart Count: 0
Liveness: http-get http://:admin-http/live delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:admin-http/ready delay=10s timeout=1s period=10s #success=1 #failure=7
Environment: <none>
Mounts:
/var/run/linkerd/tls from policy-tls (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j94nv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
sp-tls:
Type: Secret (a volume populated by a Secret)
SecretName: linkerd-sp-validator-k8s-tls
Optional: false
policy-tls:
Type: Secret (a volume populated by a Secret)
SecretName: linkerd-policy-validator-k8s-tls
Optional: false
linkerd-proxy-init-xtables-lock:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
linkerd-identity-token:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 86400
linkerd-identity-end-entity:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
kube-api-access-j94nv:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m55s default-scheduler Successfully assigned linkerd/linkerd-destination-56db447bcf-klhfn to ip-172-31-25-125
Normal Pulled 2m56s kubelet Container image "cr.l5d.io/linkerd/proxy-init:v2.2.3" already present on machine
Normal Created 2m56s kubelet Created container linkerd-init
Normal Started 2m55s kubelet Started container linkerd-init
Warning FailedPostStartHook 54s kubelet PostStartHook failed
Normal Killing 54s kubelet FailedPostStartHook
Normal Pulled 24s kubelet Container image "cr.l5d.io/linkerd/controller:stable-2.14.10" already present on machine
Normal Created 24s kubelet Created container destination
Normal Started 24s kubelet Started container destination
Normal Pulled 24s kubelet Container image "cr.l5d.io/linkerd/controller:stable-2.14.10" already present on machine
Normal Created 24s kubelet Created container sp-validator
Normal Started 24s kubelet Started container sp-validator
Normal Pulled 24s kubelet Container image "cr.l5d.io/linkerd/policy-controller:stable-2.14.10" already present on machine
Normal Created 24s kubelet Created container policy
Normal Started 24s kubelet Started container policy
Warning Unhealthy 23s kubelet Readiness probe failed: Get "http://10.42.1.13:9996/ready": dial tcp 10.42.1.13:9996: connect: connection refused
Warning Unhealthy 23s kubelet Readiness probe failed: Get "http://10.42.1.13:9997/ready": dial tcp 10.42.1.13:9997: connect: connection refused
Normal Pulled 23s (x2 over 2m54s) kubelet Container image "cr.l5d.io/linkerd/proxy:stable-2.14.10" already present on machine
Normal Created 23s (x2 over 2m54s) kubelet Created container linkerd-proxy
Normal Started 23s (x2 over 2m54s) kubelet Started container linkerd-proxy
Warning Unhealthy 15s kubelet Liveness probe failed: Get "http://10.42.1.13:9990/live": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 15s kubelet Readiness probe failed: Get "http://10.42.1.13:9997/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 15s kubelet Readiness probe failed: Get "http://10.42.1.13:9996/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2024-02-22T09:30:03.900982Z INFO linkerd_policy_controller: created Lease resource lease=Lease { metadata: ObjectMeta { annotations: None, cluster_name: None, creation_timestamp: Some(Time(2024-02-22T09:30:03Z)), deletion_grace_period_seconds: None, deletion_timestamp: None, finalizers: None, generate_name: None, generation: None, labels: Some({"linkerd.io/control-plane-component": "destination", "linkerd.io/control-plane-ns": "linkerd"}), managed_fields: Some([ManagedFieldsEntry { api_version: Some("coordination.k8s.io/v1"), fields_type: Some("FieldsV1"), fields_v1: Some(FieldsV1(Object {"f:metadata": Object {"f:labels": Object {"f:linkerd.io/control-plane-component": Object {}, "f:linkerd.io/control-plane-ns": Object {}}, "f:ownerReferences": Object {"k:{\"uid\":\"64bbc220-5618-4aab-a4a8-f52d51946eca\"}": Object {}}}})), manager: Some("policy-controller"), operation: Some("Apply"), time: Some(Time(2024-02-22T09:30:03Z)) }]), name: Some("policy-controller-write"), namespace: Some("linkerd"), owner_references: Some([OwnerReference { api_version: "apps/v1", block_owner_deletion: None, controller: Some(true), kind: "Deployment", name: "linkerd-destination", uid: "64bbc220-5618-4aab-a4a8-f52d51946eca" }]), resource_version: Some("3964"), self_link: None, uid: Some("75c0f6e5-1005-4c4c-8447-1858882d5c66") }, spec: Some(LeaseSpec { acquire_time: None, holder_identity: None, lease_duration_seconds: None, lease_transitions: None, renew_time: None }) }
2024-02-22T09:30:03.907787Z INFO grpc{port=8090}: linkerd_policy_controller: policy gRPC server listening addr=0.0.0.0:8090
Looks like there's been no update on this issue? We are facing the same issue in our organization.
Please check if you have both TCP and UDP traffic allowed between nodes -- it was the problem in my case; I forgot about UDP and it was causing this issue.
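For anyone hitting this on EC2-backed clusters: if your nodes share a security group, one way to allow all TCP and UDP between them is a pair of self-referencing ingress rules. A sketch with the AWS CLI -- the security group ID below is a placeholder for your own:

```shell
SG=sg-0123456789abcdef0   # placeholder: the security group your nodes share

# Allow all TCP between instances in the same security group...
aws ec2 authorize-security-group-ingress --group-id "$SG" \
  --protocol tcp --port 0-65535 --source-group "$SG"

# ...and all UDP as well (easy to forget; overlay networks need it).
aws ec2 authorize-security-group-ingress --group-id "$SG" \
  --protocol udp --port 0-65535 --source-group "$SG"
```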
@piotrrojek We have the same problem. Are you using EKS?
I was seeing slowness with EKS + Linkerd HA. For some reason, exposing Kubernetes services was taking about 12 seconds to finish. Your comment prompted me to run the same test, and after allowing both TCP and UDP between nodes, service creation went back to taking a few milliseconds.
My case isn't related to this issue itself, but I'm noting it here for anyone else seeing slowness while creating services.
@dverzolla This is... wow. Would you consider a doc PR explaining this? or can you tell me what you did so that I can update the docs? 🙂
@kflynn Actually I've created a forum post: https://linkerd.buoyant.io/t/eks-service-creation-taking-too-long-solved/527
Sure, I can create the doc PR.
@dverzolla Great! Thanks on all counts! 🙂
I'm going to go ahead and close this issue, then – please tag me in the PR! 🙂
What is the issue?
I am facing issues with linkerd-destination and linkerd-proxy-injector. Pods with Linkerd injection enabled are stuck in the PodInitializing state. I tried restarting the destination and proxy-injector deployments; sometimes the pods managed to come up, but subsequent pods would again get stuck in PodInitializing. My nodes have sufficient CPU and memory.
When I run linkerd check, there are no errors either.
How can it be reproduced?
Install Linkerd via the CLI on EKS.
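(For stable-2.14 the CLI installation is roughly the following two-step sketch, then a verification pass:)

```shell
# Install the Linkerd CRDs first, then the control plane (stable-2.14 CLI).
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -

# Verify the installation.
linkerd check
```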
Logs, error output, etc
Linkerd-destination logs:
kubectl -n linkerd logs deploy/linkerd-destination
Defaulted container "linkerd-proxy" out of: linkerd-proxy, destination, sp-validator, policy, linkerd-init (init)
[ 0.003199s] INFO ThreadId(01) linkerd2_proxy: release 2.210.4 (5a910be) by linkerd on 2023-11-22T17:01:46Z
[ 0.004399s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[ 0.005481s] INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[ 0.005534s] INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[ 0.005539s] INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[ 0.005543s] INFO ThreadId(01) linkerd2_proxy: Tap interface on 0.0.0.0:4190
[ 0.005547s] INFO ThreadId(01) linkerd2_proxy: Local identity is linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
[ 0.005552s] INFO ThreadId(01) linkerd2_proxy: Identity verified via linkerd-identity-headless.linkerd.svc.cluster.local:8080 (linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.005556s] INFO ThreadId(01) linkerd2_proxy: Destinations resolved via localhost:8086
[ 0.007430s] WARN ThreadId(01) watch{port=4191}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 0.036092s] INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity id=linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
[ 0.112753s] WARN ThreadId(01) watch{port=4191}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 0.326477s] WARN ThreadId(01) watch{port=4191}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 0.741192s] WARN ThreadId(01) watch{port=4191}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 1.242244s] WARN ThreadId(01) watch{port=4191}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 1.743246s] WARN ThreadId(01) watch{port=4191}:controller{addr=localhost:8090}:endpoint{addr=127.0.0.1:8090}: linkerd_reconnect: Failed to connect error=endpoint 127.0.0.1:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
Linkerd-proxy-injector logs:
Found 2 pods, using pod/linkerd-proxy-injector-757d76768f-h9bw2
Defaulted container "linkerd-proxy" out of: linkerd-proxy, proxy-injector, linkerd-init (init)
[ 0.002738s] INFO ThreadId(01) linkerd2_proxy: release 2.210.4 (5a910be) by linkerd on 2023-11-22T17:01:46Z
[ 0.003994s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[ 0.004994s] INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[ 0.005012s] INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[ 0.005016s] INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[ 0.005020s] INFO ThreadId(01) linkerd2_proxy: Tap interface on 0.0.0.0:4190
[ 0.005023s] INFO ThreadId(01) linkerd2_proxy: Local identity is linkerd-proxy-injector.linkerd.serviceaccount.identity.linkerd.cluster.local
[ 0.005026s] INFO ThreadId(01) linkerd2_proxy: Identity verified via linkerd-identity-headless.linkerd.svc.cluster.local:8080 (linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.005029s] INFO ThreadId(01) linkerd2_proxy: Destinations resolved via linkerd-dst-headless.linkerd.svc.cluster.local:8086 (linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.022461s] INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity id=linkerd-proxy-injector.linkerd.serviceaccount.identity.linkerd.cluster.local
[ 14.050531s] WARN ThreadId(01) watch{port=4191}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The operation completed successfully grpc.message="stream ended"
[ 14.050577s] WARN ThreadId(01) watch{port=8443}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The operation completed successfully grpc.message="stream ended"
[ 14.050593s] WARN ThreadId(01) watch{port=9995}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The operation completed successfully grpc.message="stream ended"
[ 14.050622s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.130:8090}: linkerd_reconnect: Service failed error=endpoint 172.25.1.130:8090: channel closed error.sources=[channel closed]
[ 14.158073s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.130:8090}: linkerd_reconnect: Failed to connect error=endpoint 172.25.1.130:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 14.363576s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.130:8090}: linkerd_reconnect: Failed to connect error=endpoint 172.25.1.130:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 14.784335s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.130:8090}: linkerd_reconnect: Failed to connect error=endpoint 172.25.1.130:8090: Connection refused (os error 111) error.sources=[Connection refused (os error 111)]
[ 749.680863s] WARN ThreadId(01) watch{port=8443}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The operation completed successfully grpc.message="stream ended"
[ 749.680916s] WARN ThreadId(01) watch{port=4191}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The operation completed successfully grpc.message="stream ended"
[ 749.680934s] WARN ThreadId(01) watch{port=9995}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The operation completed successfully grpc.message="stream ended"
[ 750.683751s] WARN ThreadId(01) watch{port=8443}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The service is currently unavailable grpc.message="client 172.25.1.64:34186: server: 172.25.1.82:8090: server 172.25.1.82:8090: service linkerd-policy.linkerd.svc.cluster.local:8090: service in fail-fast"
[ 750.683879s] WARN ThreadId(01) watch{port=4191}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The service is currently unavailable grpc.message="client 172.25.1.64:34186: server: 172.25.1.82:8090: server 172.25.1.82:8090: service linkerd-policy.linkerd.svc.cluster.local:8090: service in fail-fast"
[ 750.683981s] WARN ThreadId(01) watch{port=9995}: linkerd_app_inbound::policy::api: Unexpected policy controller response; retrying with a backoff grpc.status=The service is currently unavailable grpc.message="client 172.25.1.64:34186: server: 172.25.1.82:8090: server 172.25.1.82:8090: service linkerd-policy.linkerd.svc.cluster.local:8090: service in fail-fast"
[ 750.791438s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.82:8090}: linkerd_reconnect: Service failed error=endpoint 172.25.1.82:8090: channel closed error.sources=[channel closed]
[ 751.905842s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.82:8090}: linkerd_reconnect: Failed to connect error=endpoint 172.25.1.82:8090: connect timed out after 1s error.sources=[connect timed out after 1s]
[ 753.115467s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.82:8090}: linkerd_reconnect: Failed to connect error=endpoint 172.25.1.82:8090: connect timed out after 1s error.sources=[connect timed out after 1s]
[ 754.546487s] WARN ThreadId(01) watch{port=4191}:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=172.25.1.82:8090}: linkerd_reconnect: Failed to connect error=endpoint 172.25.1.82:8090: connect timed out after 1s error.sources=[connect timed out after 1s]
Control-plane metrics logs:
# POD linkerd-proxy-injector-77b4777668-k5z8b (13 of 13)
ERROR Get "http://localhost:46023/metrics": EOF
Output of linkerd check -o short:
linkerd-version
‼ can determine the latest version
    Get "https://versioncheck.linkerd.io/version.json?version=stable-2.14.5&uuid=89015292-fa49-4de5-90b4-280126337b83&source=cli": net/http: TLS handshake timeout
    see https://linkerd.io/2.14/checks/#l5d-version-latest for hints
‼ cli is up-to-date
    unsupported version channel: stable-2.14.5
    see https://linkerd.io/2.14/checks/#l5d-version-cli for hints
control-plane-version
‼ control plane is up-to-date
    unsupported version channel: stable-2.14.5
    see https://linkerd.io/2.14/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
\ pod "linkerd-destination-765b97b794-zwn27" status is Running
Environment
EKS 1.27
Client version: stable-2.14.5
Server version: stable-2.14.5
Installation via CLI commands.
Possible solution
My suspicion is that maybe the service is not assigned a local ClusterIP address. When I run k describe svc linkerd-policy -n linkerd:

Name:              linkerd-policy
Namespace:         linkerd
Labels:            linkerd.io/control-plane-component=destination
                   linkerd.io/control-plane-ns=linkerd
Annotations:       linkerd.io/created-by: linkerd/cli stable-2.14.5
Selector:          linkerd.io/control-plane-component=destination
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                None
IPs:               None
Port:              grpc 8090/TCP
TargetPort:        8090/TCP
Endpoints:         172.25.1.195:8090
Session Affinity:  None
Events:
Somehow the linkerd-policy endpoints are not using a ClusterIP but rather addresses from the subnet CIDR.
Additional context
Currently, I have to keep restarting the destination and proxy-injector deployments; after some number of tries, my pods manage to run. Has anybody successfully installed Linkerd on EKS?
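(The restarts in question amount to something like:)

```shell
# Workaround: bounce the affected control-plane deployments...
kubectl -n linkerd rollout restart deploy/linkerd-destination deploy/linkerd-proxy-injector
# ...and wait for the new pods to come up.
kubectl -n linkerd rollout status deploy/linkerd-destination
```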
Would you like to work on fixing this bug?
no