@amargherio thanks for the helpful report! If you still have the logs, were there any logs emitted from the linkerd-controller pod's destination container during this time?
@olix0r we don't have any logs from that window - it looks like something may have restarted the linkerd-controller pods and rolled the logs over. The best we have is from this morning, when everything restarted.
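For what it's worth, if a container restarted in place (rather than the pod being rescheduled), the previous instance's logs can sometimes still be pulled with kubectl's --previous flag. A minimal sketch, reusing the pod and container names from the output below:

# Logs from the prior instance of the destination container
❯ kubectl logs -n linkerd linkerd-controller-656f575d6f-vvqd6 -c destination --previous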
❯ kubectl get po -n linkerd
NAME                                      READY   STATUS    RESTARTS   AGE
linkerd-controller-656f575d6f-vvqd6       4/4     Running   4          5d
linkerd-grafana-6d45974d4d-7hcsx          2/2     Running   0          5d
linkerd-identity-d69b76885-crxn4          2/2     Running   0          5d
linkerd-prometheus-68796f556b-q4m9t       2/2     Running   1          5d
linkerd-proxy-injector-6c96c45845-jnz4m   2/2     Running   0          5d
linkerd-sp-validator-64776b6df4-9rqv4     2/2     Running   0          5d
linkerd-web-7db66f8555-crxr4              2/2     Running   0          5d
And a describe from those pods:
❯ kubectl describe po -n linkerd linkerd-controller-656f575d6f-vvqd6
Name: linkerd-controller-656f575d6f-vvqd6
Namespace: linkerd
Priority: 0
PriorityClassName: <none>
Node: aks-agentpool-13044130-7/172.19.0.11
Start Time: Fri, 07 Jun 2019 09:34:15 -0500
Labels: linkerd.io/control-plane-component=controller
linkerd.io/control-plane-ns=linkerd
linkerd.io/proxy-deployment=linkerd-controller
pod-template-hash=656f575d6f
Annotations: linkerd.io/created-by: linkerd/cli stable-2.3.2
linkerd.io/identity-mode: default
linkerd.io/proxy-version: stable-2.3.2
Status: Running
IP: 10.240.5.83
Controlled By: ReplicaSet/linkerd-controller-656f575d6f
Init Containers:
linkerd-init:
Container ID: docker://cd1ada61d36e1d4a1d62e3fcb5f07e7b625e4132ca85619c7d0e73ca0aa29ec8
Image: gcr.io/linkerd-io/proxy-init:stable-2.3.2
Image ID: docker-pullable://gcr.io/linkerd-io/proxy-init@sha256:8b6d6a3dd31586a895825517aac3bdd0fc66e3d945b933b971bc2c0695e8e3f7
Port: <none>
Host Port: <none>
Args:
--incoming-proxy-port
4143
--outgoing-proxy-port
4140
--proxy-uid
2102
--inbound-ports-to-ignore
4190,4191
--outbound-ports-to-ignore
443
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 07 Jun 2019 09:34:24 -0500
Finished: Fri, 07 Jun 2019 09:34:24 -0500
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from linkerd-controller-token-nfbq9 (ro)
Containers:
public-api:
Container ID: docker://ee430a54231d4960e753cec86194d21714ea9c30d07f641e213e19c1ec74faa4
Image: gcr.io/linkerd-io/controller:stable-2.3.2
Image ID: docker-pullable://gcr.io/linkerd-io/controller@sha256:f6725d62e051be26bb6ad132d91aebe1859839aaad1b26de6a98b225cfd289ae
Ports: 8085/TCP, 9995/TCP
Host Ports: 0/TCP, 0/TCP
Args:
public-api
-prometheus-url=http://linkerd-prometheus.linkerd.svc.cluster.local:9090
-controller-namespace=linkerd
-log-level=info
State: Running
Started: Wed, 12 Jun 2019 09:25:16 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 07 Jun 2019 09:34:48 -0500
Finished: Wed, 12 Jun 2019 09:25:12 -0500
Ready: True
Restart Count: 1
Liveness: http-get http://:9995/ping delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9995/ready delay=0s timeout=1s period=10s #success=1 #failure=7
Environment:
KUBERNETES_PORT_443_TCP_ADDR: *kube-master-uri*
KUBERNETES_PORT: tcp://*kube-master-uri*:443
KUBERNETES_PORT_443_TCP: tcp://*kube-master-uri*:443
KUBERNETES_SERVICE_HOST: *kube-master-uri*
Mounts:
/var/run/linkerd/config from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from linkerd-controller-token-nfbq9 (ro)
destination:
Container ID: docker://dfd40b3639033582547754b50c8255eb2f708faacd408326455d3a59d426a108
Image: gcr.io/linkerd-io/controller:stable-2.3.2
Image ID: docker-pullable://gcr.io/linkerd-io/controller@sha256:f6725d62e051be26bb6ad132d91aebe1859839aaad1b26de6a98b225cfd289ae
Ports: 8086/TCP, 9996/TCP
Host Ports: 0/TCP, 0/TCP
Args:
destination
-addr=:8086
-controller-namespace=linkerd
-enable-h2-upgrade=true
-log-level=info
State: Running
Started: Wed, 12 Jun 2019 09:25:18 -0500
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Fri, 07 Jun 2019 09:34:51 -0500
Finished: Wed, 12 Jun 2019 09:25:13 -0500
Ready: True
Restart Count: 1
Liveness: http-get http://:9996/ping delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9996/ready delay=0s timeout=1s period=10s #success=1 #failure=7
Environment:
KUBERNETES_PORT_443_TCP_ADDR: *kube-master-uri*
KUBERNETES_PORT: tcp://*kube-master-uri*:443
KUBERNETES_PORT_443_TCP: tcp://*kube-master-uri*:443
KUBERNETES_SERVICE_HOST: *kube-master-uri*
Mounts:
/var/run/linkerd/config from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from linkerd-controller-token-nfbq9 (ro)
tap:
Container ID: docker://b46cd3074111cfe2f002f288a874afef171bd094dee86db443083f9bc7620268
Image: gcr.io/linkerd-io/controller:stable-2.3.2
Image ID: docker-pullable://gcr.io/linkerd-io/controller@sha256:f6725d62e051be26bb6ad132d91aebe1859839aaad1b26de6a98b225cfd289ae
Ports: 8088/TCP, 9998/TCP
Host Ports: 0/TCP, 0/TCP
Args:
tap
-controller-namespace=linkerd
-log-level=info
State: Running
Started: Wed, 12 Jun 2019 09:25:20 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 07 Jun 2019 09:34:55 -0500
Finished: Wed, 12 Jun 2019 09:25:14 -0500
Ready: True
Restart Count: 1
Liveness: http-get http://:9998/ping delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9998/ready delay=0s timeout=1s period=10s #success=1 #failure=7
Environment:
KUBERNETES_PORT_443_TCP_ADDR: *kube-master-uri*
KUBERNETES_PORT: tcp://*kube-master-uri*:443
KUBERNETES_PORT_443_TCP: tcp://*kube-master-uri*:443
KUBERNETES_SERVICE_HOST: *kube-master-uri*
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from linkerd-controller-token-nfbq9 (ro)
linkerd-proxy:
Container ID: docker://b6f982d378dc31acf592e00687edef3608a126bdf6f6ee9d07d619c683382ea9
Image: gcr.io/linkerd-io/proxy:stable-2.3.2
Image ID: docker-pullable://gcr.io/linkerd-io/proxy@sha256:9f28c47c3283a1a95d55a8e44613ac28f30c4fac0547abf1a42db6032b6a90a4
Ports: 4143/TCP, 4191/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Wed, 12 Jun 2019 09:25:11 -0500
Last State: Terminated
Reason: Error
Message: connect error to ControlAddr { addr: Name(NameAddr { name: "linkerd-identity.linkerd.svc.cluster.local", port: 8080 }), identity: Some("linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local") }: request timed out
ERR! [ 5.003895s] admin={bg=identity} linkerd2_proxy::app::identity Failed to certify identity: grpc-status: Unknown, grpc-message: "the request could not be dispatched in a timely fashion"
INFO [ 15.073362s] linkerd2_proxy::app::main Certified identity: linkerd-controller.linkerd.serviceaccount.identity.linkerd.cluster.local
WARN [ 25185.693883s] 10.240.8.18:4190 linkerd2_proxy::proxy::reconnect connect error to Endpoint { dst_name: None, addr: V4(10.240.8.18:4190), identity: None(NoPeerName(NoAuthorityInHttpRequest)), metadata: Metadata { weight: 10000, labels: {}, protocol_hint: Unknown, identity: None }, http_settings: Http2 }: Connection refused (os error 111) (address: 10.240.8.18:4190)
WARN [ 25186.695349s] proxy={server=out listen=127.0.0.1:4140 remote=10.240.5.83:38468} linkerd2_proxy::app::errors request aborted because it reached the configured dispatch deadline
WARN [ 25196.928696s] proxy={server=out listen=127.0.0.1:4140 remote=10.240.5.83:39414} linkerd2_proxy::app::errors request aborted because it reached the configured dispatch deadline
ERR! [431359.721130s] admin={server=admin listen=0.0.0.0:4191 remote=10.240.5.1:47740} linkerd2_proxy::control::serve_http error serving admin: Error(BodyWrite, Os { code: 32, kind: BrokenPipe, message: "Broken pipe" })
ERR! [431368.262313s] admin={server=admin listen=0.0.0.0:4191 remote=10.240.5.1:47924} linkerd2_proxy::control::serve_http error serving admin: Error(BodyWrite, Os { code: 32, kind: BrokenPipe, message: "Broken pipe" })
ERR! [431377.780425s] admin={server=admin listen=0.0.0.0:4191 remote=10.240.5.1:48124} linkerd2_proxy::control::serve_http error serving admin: Error(BodyWrite, Os { code: 32, kind: BrokenPipe, message: "Broken pipe" })
INFO [431377.987575s] linkerd2_proxy::signal received SIGTERM, starting shutdown
Exit Code: 137
Started: Fri, 07 Jun 2019 09:34:59 -0500
Finished: Wed, 12 Jun 2019 09:25:07 -0500
Ready: True
Restart Count: 1
Liveness: http-get http://:4191/metrics delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:4191/ready delay=2s timeout=1s period=10s #success=1 #failure=3
Environment:
LINKERD2_PROXY_LOG: warn,linkerd2_proxy=info
LINKERD2_PROXY_DESTINATION_SVC_ADDR: localhost.:8086
LINKERD2_PROXY_CONTROL_LISTEN_ADDR: 0.0.0.0:4190
LINKERD2_PROXY_ADMIN_LISTEN_ADDR: 0.0.0.0:4191
LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR: 127.0.0.1:4140
LINKERD2_PROXY_INBOUND_LISTEN_ADDR: 0.0.0.0:4143
LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES: svc.cluster.local.
LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE: 10000ms
LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE: 10000ms
_pod_ns: linkerd (v1:metadata.namespace)
LINKERD2_PROXY_DESTINATION_CONTEXT: ns:$(_pod_ns)
LINKERD2_PROXY_IDENTITY_DIR: /var/run/linkerd/identity/end-entity
LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS: -----BEGIN CERTIFICATE-----
MIIBgzCCASmgAwIBAgIBATAKBggqhkjOPQQDAjApMScwJQYDVQQDEx5pZGVudGl0
eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwHhcNMTkwNjA2MjAzOTI1WhcNMjAwNjA1
MjAzOTQ1WjApMScwJQYDVQQDEx5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9j
YWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQLtOijylDEWXYWb9xQpp28/G0b
ebmsqZttN/pRcHrxnbIcDHDq+WDe2Z98r2g1OtRz8epQ2ZQrPAtYob8QBuG+o0Iw
QDAOBgNVHQ8BAf8EBAMCAQYwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMC
MA8GA1UdEwEB/wQFMAMBAf8wCgYIKoZIzj0EAwIDSAAwRQIhALUH8XUyPLbzWD67
K66oyyplyHHvSihWTGAh5NZGSME3AiBJunpl/i07CULKhEoPGE5Qdtb31ey+t9jt
y4/IAZtzwQ==
-----END CERTIFICATE-----
LINKERD2_PROXY_IDENTITY_TOKEN_FILE: /var/run/secrets/kubernetes.io/serviceaccount/token
LINKERD2_PROXY_IDENTITY_SVC_ADDR: linkerd-identity.linkerd.svc.cluster.local:8080
_pod_sa: (v1:spec.serviceAccountName)
_l5d_ns: linkerd
_l5d_trustdomain: cluster.local
LINKERD2_PROXY_IDENTITY_LOCAL_NAME: $(_pod_sa).$(_pod_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
LINKERD2_PROXY_IDENTITY_SVC_NAME: linkerd-identity.$(_l5d_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
LINKERD2_PROXY_DESTINATION_SVC_NAME: linkerd-controller.$(_l5d_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
KUBERNETES_PORT_443_TCP_ADDR: *kube-master-uri*
KUBERNETES_PORT: tcp://*kube-master-uri*:443
KUBERNETES_PORT_443_TCP: tcp://*kube-master-uri*:443
KUBERNETES_SERVICE_HOST: *kube-master-uri*
Mounts:
/var/run/linkerd/identity/end-entity from linkerd-identity-end-entity (rw)
/var/run/secrets/kubernetes.io/serviceaccount from linkerd-controller-token-nfbq9 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: linkerd-config
Optional: false
linkerd-identity-end-entity:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
linkerd-controller-token-nfbq9:
Type: Secret (a volume populated by a Secret)
SecretName: linkerd-controller-token-nfbq9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 48m kubelet, aks-agentpool-13044130-7 Readiness probe failed: Get http://10.240.5.83:9998/ready: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 48m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Liveness probe failed: Get http://10.240.5.83:4191/metrics: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 47m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Liveness probe failed: Get http://10.240.5.83:9998/ping: dial tcp 10.240.5.83:9998: connect: connection refused
Warning Unhealthy 47m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Readiness probe failed: Get http://10.240.5.83:9998/ready: dial tcp 10.240.5.83:9998: connect: connection refused
Warning Unhealthy 47m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Liveness probe failed: Get http://10.240.5.83:9996/ping: dial tcp 10.240.5.83:9996: connect: connection refused
Warning Unhealthy 47m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Readiness probe failed: Get http://10.240.5.83:9995/ready: dial tcp 10.240.5.83:9995: connect: connection refused
Warning Unhealthy 47m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Readiness probe failed: Get http://10.240.5.83:9996/ready: dial tcp 10.240.5.83:9996: connect: connection refused
Warning Unhealthy 47m (x3 over 48m) kubelet, aks-agentpool-13044130-7 Liveness probe failed: Get http://10.240.5.83:9995/ping: dial tcp 10.240.5.83:9995: connect: connection refused
Normal Pulled 47m (x2 over 5d) kubelet, aks-agentpool-13044130-7 Container image "gcr.io/linkerd-io/proxy:stable-2.3.2" already present on machine
Normal Killing 47m kubelet, aks-agentpool-13044130-7 Killing container with id docker://linkerd-proxy:Container failed liveness probe.. Container will be killed and recreated.
Normal Created 47m (x2 over 5d) kubelet, aks-agentpool-13044130-7 Created container
As an update, we had this issue recur on Friday evening (2019-Jun-14), forcing us to do some widespread uninjection to restore this particular service and environment.
I was able to backtrack in Slack and run the dig and nghttp commands to check the linkerd-identity service - everything returned the expected results. The behavior we were seeing was traffic coming into our NGINX ingress controller, into the linkerd-proxy for the specific NGINX pod, and the proxy emitting the same "dispatched in a timely fashion" log message. We were also tailing logs from the backend service in question, and we saw no logged traffic or activity from the proxy or the application.
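For reference, the dig check looked roughly like the one in the bug report details below (a sketch; the pod name is the one from the bug report's dig output):

# Resolve the identity service from inside a meshed control-plane pod
❯ kubectl exec -n linkerd -it -c linkerd-proxy linkerd-identity-7dcb854d79-nk76w -- dig linkerd-identity.linkerd.svc.cluster.local

(The nghttp check exercises HTTP/2 connectivity against the same service.)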
We restarted the application service, and it looked as though the proxy sidecar came up OK; it was able to query the identity service and had log output similar to every other successfully started sidecar.
I'll try to backtrack and see what might be left in our environment as far as logging output goes and will upload whatever I can find into this comment.
@amargherio normally when this happens, the discovery service has gotten out of sync with the API server. This has come up more than once on AKS because of how they handle the API server connection (tunnelfront). If it happens again, bouncing the discovery service (or, honestly, the whole control plane) should fix the problem.
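A minimal sketch of what bouncing looks like, assuming the default linkerd namespace and the control-plane label shown in the describe output above:

# Delete the controller (discovery) pods; the Deployment recreates them
❯ kubectl -n linkerd delete po -l linkerd.io/control-plane-component=controller

# Or, heavier-handed: bounce the entire control plane
❯ kubectl -n linkerd delete po --all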
Now that I've said that, we've got some good steps to try replicating on AKS and see what the deal is.
@grampelberg that's about the only thing we hadn't tried restarting. We've actually got an ADR coming up with Microsoft, and the recent API server instability we've seen, plus concerns about tunnelfront's role in the cluster, are already on the agenda.
If we find anything else that may be relevant, or if you have any questions about replication, you can always drop a comment here or just mention me in Slack - more than happy to help!
@amargherio would you mind checking this out with 2.4 and maybe a newer AKS version? I'm hoping everything is fixed now!
@grampelberg we're getting a Linkerd rollout in place for our dev cluster. We should be able to report back on everything shortly.
Got the same issue; here's the detailed log:
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910233s] linkerd-destination.linkerd.svc.cluster.local:8086 dns=linkerd-destination.linkerd.svc.cluster.local trust_dns_resolver::name_server_pool polling response inner
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910266s] tokio_reactor event Writable Token(805306375)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910270s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910770s] tokio_reactor event Readable | Writable Token(805306375)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910780s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910821s] linkerd-destination.linkerd.svc.cluster.local:8086 dns=linkerd-destination.linkerd.svc.cluster.local trust_dns_resolver::name_server_pool polling response inner
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910835s] linkerd-destination.linkerd.svc.cluster.local:8086 dns=linkerd-destination.linkerd.svc.cluster.local trust_dns_proto::rr::record_data reading A
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910839s] linkerd-destination.linkerd.svc.cluster.local:8086 dns=linkerd-destination.linkerd.svc.cluster.local trust_dns_proto::udp::udp_client_stream received message id: 13746
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910842s] linkerd-destination.linkerd.svc.cluster.local:8086 dns=linkerd-destination.linkerd.svc.cluster.local mio::poll deregistering handle with poller
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910850s] linkerd-destination.linkerd.svc.cluster.local:8086 dns=linkerd-destination.linkerd.svc.cluster.local tokio_reactor dropping I/O source: 7
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910869s] tokio_reactor event Readable Token(4194303)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910871s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910885s] linkerd-destination.linkerd.svc.cluster.local:8086 linkerd2_proxy::transport::connect connecting to 172.20.118.59:8086
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910939s] linkerd-destination.linkerd.svc.cluster.local:8086 mio::poll registering with poller
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.910948s] tokio_reactor event Readable Token(4194303)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910951s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.910955s] tokio_reactor loop process - 0 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.912133s] tokio_reactor event Writable Token(809500679)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.912147s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.912157s] linkerd-destination.linkerd.svc.cluster.local:8086 linkerd2_proxy::transport::connect connection established to 172.20.118.59:8086
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.912178s] linkerd-destination.linkerd.svc.cluster.local:8086 linkerd2_proxy::transport::tls::client initiating TLS to linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.912197s] linkerd-destination.linkerd.svc.cluster.local:8086 rustls::client::hs No cached session for DNSNameRef("linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local")
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.912201s] linkerd-destination.linkerd.svc.cluster.local:8086 rustls::client::hs Not resuming any session
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.912237s] linkerd-destination.linkerd.svc.cluster.local:8086 rustls::client::hs Sending ClientHello Message {
myapp-web-55bcf8bb85-pnccj linkerd-proxy typ: Handshake,
myapp-web-55bcf8bb85-pnccj linkerd-proxy version: TLSv1_0,
myapp-web-55bcf8bb85-pnccj linkerd-proxy payload: Handshake(
myapp-web-55bcf8bb85-pnccj linkerd-proxy HandshakeMessagePayload {
myapp-web-55bcf8bb85-pnccj linkerd-proxy typ: ClientHello,
myapp-web-55bcf8bb85-pnccj linkerd-proxy payload: ClientHello(
myapp-web-55bcf8bb85-pnccj linkerd-proxy ClientHelloPayload {
myapp-web-55bcf8bb85-pnccj linkerd-proxy client_version: TLSv1_2,
myapp-web-55bcf8bb85-pnccj linkerd-proxy random: Random(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy 11,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 26,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 233,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 154,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 43,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 25,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 9,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 7,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 236,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 72,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 101,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 95,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 187,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 17,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 175,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 62,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 166,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 247,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 251,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 81,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 104,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 83,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 81,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 141,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 74,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 140,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 242,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 174,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 235,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 195,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 227,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 189,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy session_id: SessionID(
myapp-web-55bcf8bb85-pnccj linkerd-proxy 131,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 124,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 222,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 93,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 67,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 166,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 187,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 224,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 96,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 60,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 187,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 191,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 90,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 168,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 131,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 46,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 133,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 20,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 123,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 212,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 18,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 106,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 108,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 205,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 184,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 67,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 187,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 5,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 158,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 74,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 251,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 234,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy cipher_suites: [
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS13_CHACHA20_POLY1305_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS13_AES_256_GCM_SHA384,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS13_AES_128_GCM_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLS_EMPTY_RENEGOTIATION_INFO_SCSV,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy compression_methods: [
myapp-web-55bcf8bb85-pnccj linkerd-proxy Null,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy extensions: [
myapp-web-55bcf8bb85-pnccj linkerd-proxy SupportedVersions(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLSv1_3,
myapp-web-55bcf8bb85-pnccj linkerd-proxy TLSv1_2,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy ServerName(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy ServerName {
myapp-web-55bcf8bb85-pnccj linkerd-proxy typ: HostName,
myapp-web-55bcf8bb85-pnccj linkerd-proxy payload: HostName(
myapp-web-55bcf8bb85-pnccj linkerd-proxy DNSName(
myapp-web-55bcf8bb85-pnccj linkerd-proxy "linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local",
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy },
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy ECPointFormats(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy Uncompressed,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy NamedGroups(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy X25519,
myapp-web-55bcf8bb85-pnccj linkerd-proxy secp384r1,
myapp-web-55bcf8bb85-pnccj linkerd-proxy secp256r1,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy SignatureAlgorithms(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy ECDSA_NISTP384_SHA384,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ECDSA_NISTP256_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy RSA_PSS_SHA512,
myapp-web-55bcf8bb85-pnccj linkerd-proxy RSA_PSS_SHA384,
myapp-web-55bcf8bb85-pnccj linkerd-proxy RSA_PSS_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy RSA_PKCS1_SHA512,
myapp-web-55bcf8bb85-pnccj linkerd-proxy RSA_PKCS1_SHA384,
myapp-web-55bcf8bb85-pnccj linkerd-proxy RSA_PKCS1_SHA256,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy ExtendedMasterSecretRequest,
myapp-web-55bcf8bb85-pnccj linkerd-proxy CertificateStatusRequest(
myapp-web-55bcf8bb85-pnccj linkerd-proxy OCSP(
myapp-web-55bcf8bb85-pnccj linkerd-proxy OCSPCertificateStatusRequest {
myapp-web-55bcf8bb85-pnccj linkerd-proxy responder_ids: [],
myapp-web-55bcf8bb85-pnccj linkerd-proxy extensions: PayloadU16(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy },
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy KeyShare(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy KeyShareEntry {
myapp-web-55bcf8bb85-pnccj linkerd-proxy group: X25519,
myapp-web-55bcf8bb85-pnccj linkerd-proxy payload: PayloadU16(
myapp-web-55bcf8bb85-pnccj linkerd-proxy [
myapp-web-55bcf8bb85-pnccj linkerd-proxy 7,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 52,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 251,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 5,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 28,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 19,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 143,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 236,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 15,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 110,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 29,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 75,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 112,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 128,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 62,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 221,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 149,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 77,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 52,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 25,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 98,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 84,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 80,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 242,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 62,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 26,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 123,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 91,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 83,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 11,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 134,
myapp-web-55bcf8bb85-pnccj linkerd-proxy 114,
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy },
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy ],
myapp-web-55bcf8bb85-pnccj linkerd-proxy },
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy },
myapp-web-55bcf8bb85-pnccj linkerd-proxy ),
myapp-web-55bcf8bb85-pnccj linkerd-proxy }
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.912475s] tokio_reactor event Readable Token(4194303)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.912487s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.912494s] tokio_reactor loop process - 0 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.913925s] tokio_reactor event Readable | Writable Token(809500679)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.913944s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.913973s] linkerd-destination.linkerd.svc.cluster.local:8086 linkerd2_reconnect::service Failed to connect error=received corrupt message
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.913982s] linkerd-destination.linkerd.svc.cluster.local:8086 mio::poll deregistering handle with poller
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.914005s] linkerd-destination.linkerd.svc.cluster.local:8086 tokio_reactor dropping I/O source: 7
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.914025s] linkerd-destination.linkerd.svc.cluster.local:8086 linkerd2_reconnect::service Recovering
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 132.914035s] tokio_reactor event Readable Token(4194303)
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.914039s] tokio_reactor loop process - 1 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 132.914048s] tokio_reactor loop process - 0 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 133.187355s] tokio_reactor loop process - 0 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy DBUG [ 133.247475s] tokio_reactor loop process - 0 events, 0.000s
myapp-web-55bcf8bb85-pnccj linkerd-proxy WARN [ 133.247520s] linkerd2_proxy::app::profiles error fetching profile for myapp-web.myns.svc.cluster.local:80: Status { code: Unknown, message: "the request could not be dispatched in a timely fashion" }
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 133.247537s] linkerd2_proxy::proxy::http::canonicalize task idle; name=NameAddr { name: "myapp-web.myns.svc.cluster.local", port: 80 }
myapp-web-55bcf8bb85-pnccj linkerd-proxy TRCE [ 133.247559s] linkerd2_proxy::proxy::http::canonicalize task init; name=NameAddr { name: "myapp-web.myns.svc.cluster.local", port: 80 }
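(A note on the log level: the TRCE/DBUG output above comes from raising the proxy's log filter. A sketch of the override, assuming the standard LINKERD2_PROXY_LOG variable that also appears in the container spec below:)

# hypothetical override used to capture TRCE/DBUG lines like those above
- name: LINKERD2_PROXY_LOG
  value: trace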
In my case:
Container settings for the Linkerd proxy in the deployment:
- env:
- name: LINKERD2_PROXY_LOG
value: warn,linkerd2_proxy=info
- name: LINKERD2_PROXY_DESTINATION_SVC_ADDR
value: linkerd-destination.linkerd.svc.cluster.local:8086
- name: LINKERD2_PROXY_CONTROL_LISTEN_ADDR
value: 0.0.0.0:4190
- name: LINKERD2_PROXY_ADMIN_LISTEN_ADDR
value: 0.0.0.0:4191
- name: LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR
value: 127.0.0.1:4140
- name: LINKERD2_PROXY_INBOUND_LISTEN_ADDR
value: 0.0.0.0:4143
- name: LINKERD2_PROXY_DESTINATION_GET_SUFFIXES
value: svc.cluster.local.
- name: LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES
value: svc.cluster.local.
- name: LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE
value: 10000ms
- name: LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE
value: 10000ms
- name: _pod_ns
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: LINKERD2_PROXY_DESTINATION_CONTEXT
value: ns:$(_pod_ns)
- name: LINKERD2_PROXY_IDENTITY_DIR
value: /var/run/linkerd/identity/end-entity
- name: LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS
value: |
-----BEGIN CERTIFICATE-----
MIIBwDCCAWegAwIBAgIRAJVPu7CEdfnGNr6+hOc7Lg8wCgYIKoZIzj0EAwIwKTEn
...
bj+S13McFy/eoR7DFSyk9fOL4oA=
-----END CERTIFICATE-----
- name: LINKERD2_PROXY_IDENTITY_TOKEN_FILE
value: /var/run/secrets/kubernetes.io/serviceaccount/token
- name: LINKERD2_PROXY_IDENTITY_SVC_ADDR
value: linkerd-identity.linkerd.svc.cluster.local:8080
- name: _pod_sa
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.serviceAccountName
- name: _l5d_ns
value: linkerd
- name: _l5d_trustdomain
value: cluster.local
- name: LINKERD2_PROXY_IDENTITY_LOCAL_NAME
value: $(_pod_sa).$(_pod_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
- name: LINKERD2_PROXY_IDENTITY_SVC_NAME
value: linkerd-identity.$(_l5d_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
- name: LINKERD2_PROXY_DESTINATION_SVC_NAME
value: linkerd-destination.$(_l5d_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
- name: LINKERD2_PROXY_TAP_SVC_NAME
value: linkerd-tap.$(_l5d_ns).serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)
image: gcr.io/linkerd-io/proxy:stable-2.6.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /metrics
port: 4191
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: linkerd-proxy
ports:
- containerPort: 4143
name: linkerd-proxy
protocol: TCP
- containerPort: 4191
name: linkerd-admin
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /ready
port: 4191
scheme: HTTP
initialDelaySeconds: 2
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: "1"
memory: 250Mi
requests:
cpu: 100m
memory: 20Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 2102
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /var/run/linkerd/identity/end-entity
name: linkerd-identity-end-entity
linkerd check is green.
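(As an aside for anyone debugging the same symptom: linkerd check also has a --proxy mode that validates the injected data-plane proxies themselves. A sketch, with an illustrative namespace:)

❯ linkerd check --proxy --namespace myns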
Kubernetes:
Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.10-eks-825e5d", GitCommit:"825e5de08cb05714f9b224cd6c47d9514df1d1a7", GitTreeState:"clean", BuildDate:"2019-08-18T03:58:32Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
The situation happens to all pods (at first it looked like only one pod). The related Kubernetes service doesn't work.
curl myapp-web.myns.svc.cluster.local returns HTTP/1.1 503 Service Unavailable, while the application is working fine and the pod is up and running. The hostname is resolvable, and curl POD_IP returns a normal response.
In the Linkerd Dashboard everything is green for this service (SR = 100.00%), but I cannot reach the application through the internal Kubernetes DNS name.
Looks similar to https://github.com/linkerd/linkerd2/issues/2970 but I don't use linkerd-cni
A full example of the problem:
root@myapp-web-6fc77c7465-kq5lf:/# nslookup myapp-web.myns.svc.cluster.local |grep Address
Address: 172.20.0.10#53
Address: 172.20.7.233
root@myapp-web-6fc77c7465-kq5lf:/# curl -v 172.20.7.233/health
* Trying 172.20.7.233...
* TCP_NODELAY set
* Connected to 172.20.7.233 (172.20.7.233) port 80 (#0)
> GET /health HTTP/1.1
> Host: 172.20.7.233
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Fri, 18 Oct 2019 11:58:34 GMT
<
* Curl_http_done: called premature == 0
* Connection #0 to host 172.20.7.233 left intact
default: PASSED Application is running (0s)
root@myapp-web-6fc77c7465-kq5lf:/# curl -v myapp-web.myns.svc.cluster.local/health
* Trying 172.20.7.233...
* TCP_NODELAY set
* Connected to myapp-web.myns.svc.cluster.local (172.20.7.233) port 80 (#0)
> GET /health HTTP/1.1
> Host: myapp-web.myns.svc.cluster.local
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< content-length: 0
< date: Fri, 18 Oct 2019 11:58:46 GMT
<
* Curl_http_done: called premature == 0
* Connection #0 to host myapp-web.myns.svc.cluster.local left intact
➜ stern -n myns myapp-web --tail 1 |grep WARN
+ myapp-web-6fc77c7465-kq5lf › myapp-web
+ myapp-web-6fc77c7465-kq5lf › linkerd-proxy
myapp-web-6fc77c7465-kq5lf linkerd-proxy myapp-web-6fc77c7465-kq5lf WARN [ 3404.009988s] linkerd2_proxy::app::profiles error fetching profile for myapp-web.myns.svc.cluster.local:80: Status { code: Unknown, message: "the request could not be dispatched in a timely fashion" }
myapp-web-6fc77c7465-kq5lf linkerd-proxy WARN [ 3412.012357s] linkerd2_proxy::app::profiles error fetching profile for myapp-web.myns.svc.cluster.local:80: Status { code: Unknown, message: "the request could not be dispatched in a timely fashion" }
Any idea how this can be fixed?
@grampelberg, I found that the problem first appeared in the logs after we upgraded Linkerd from edge-19.9.3 to stable 2.6.0. I downgraded it back and the problem immediately disappeared.
@KIVagant would you mind opening a new issue for your problem? You're seeing something completely different from the original issue. I'm also having a tough time understanding what the problem is. Would you mind explaining a bit more about the architecture and what exactly is going wrong (and where)?
I'm going to close this issue out; it seems like we've fixed the AKS tunnelfront issues for the time being.
hey @grampelberg, is this fix included in the most recent edge release?
@uipo78 this issue was fixed quite a while ago. If you're seeing behavior like this, please open a new issue with all your details so that we can do some troubleshooting. @KIVagant has something else going on that we need to track down as well.
done deal
Bug Report
What is the issue?
After running a deploy via Spinnaker, we were alerted that traffic flow to the pod was completely cut off, returning a 502 Bad Gateway for all inbound HTTP traffic.
How can it be reproduced?
As of now, we've been unable to reproduce it. We ran linkerd uninject and redeployed the now-uninjected service. After re-injecting, the behavior was no longer present.
Logs, error output, etc
The log output keeps repeating the block below - this was taken from one of the proxies running alongside NGINX:
linkerd check output
Environment
Possible solution
We re-injected the problem pods before opening this issue and haven't experienced any of the failures seen in the past. We're monitoring to see if anything regresses, but as of now all workloads are stable - given that, it seems like an uninject followed by a re-inject works around the problem.
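Roughly the cycle we used (a sketch; the deployment and namespace names here are illustrative):

# Strip the proxy sidecar and redeploy
❯ kubectl get deploy myapp-web -n myns -o yaml | linkerd uninject - | kubectl apply -f -

# Once traffic recovers, re-inject
❯ kubectl get deploy myapp-web -n myns -o yaml | linkerd inject - | kubectl apply -f -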
Additional context
Output from kubectl exec -n linkerd -it -c linkerd-proxy linkerd-identity-7dcb854d79-nk76w -- dig linkerd-identity.linkerd.svc.cluster.local: