cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0

Cilium Envoy Fails to Re-establish Broken Upstream Connections with Active Downstream Connections #33880

Open isala404 opened 1 month ago

isala404 commented 1 month ago


Version

higher than v1.14.13 and lower than v1.15.0

What happened?

We've encountered an issue with Cilium Envoy in our Kubernetes environment. Our setup consists of a request chain involving three services:

  1. Ingress controller
  2. Envoy proxy for API management
  3. Workload pod

We're using Cilium proxy annotations for the workload container port, configured as ingress HTTP, primarily for observability purposes.
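
For reference, the annotation on the workload Deployment looks roughly like the sketch below. The exact value is a reconstruction on our part: the manifest in the reproduction steps shows it mangled to `","` by the issue renderer, and the port/protocol fields are inferred from the setup described above (ingress HTTP on container port 8080), following Cilium's documented `policy.cilium.io/proxy-visibility` format.

```yaml
# Sketch of the proxy-visibility annotation as we believe it was applied;
# the angle-bracketed value is reconstructed, not copied verbatim from the cluster.
metadata:
  annotations:
    policy.cilium.io/proxy-visibility: "<Ingress/8080/TCP/HTTP>"
```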

The problem occurs when the workload pod restarts. Immediately after the restart, the next request receives an error: `upstream connect error or disconnect/reset before headers. reset reason: connection termination` (HTTP 503).

Upon investigation, we discovered the following:

  1. Cilium Envoy acts as an intermediary and maintains two connections: one with the downstream API management Envoy and one with the upstream workload pod.

  2. When the workload pod terminates, it sends a FIN. Cilium Envoy consumes this packet but does not close the corresponding downstream connection (the one from the API management Envoy). That connection remains open until it hits the idle timeout.

  3. If a new request is sent immediately after the workload pod restarts, our Envoy reuses that stale connection, and the request fails with the 503 above.

  4. The initial failed request causes the stale connection to be torn down, so subsequent requests succeed (a hedged retry sketch that could mask this first failure is included after this list).

  5. However, we noticed that with kube-proxy replacement enabled, even subsequent requests continue to fail until we restart our Envoy to break the connection.
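
For completeness, the sketch below shows a route-level retry policy that could be added to the API management Envoy's route to `foo_service` (the `retry_policy` fields are standard Envoy `route.v3` options). This is only a hedged, untested mitigation idea on our side: it might mask the first 503 by re-sending the request when the upstream connection is reset, but per point 5 it would not help in the kube-proxy replacement case, where every request on the stale connection fails.

```yaml
# Hypothetical mitigation sketch, not verified in our environment:
# retry on connection resets so the first request after a workload
# restart is re-sent instead of surfacing a 503 to the client.
routes:
- match:
    prefix: "/"
  route:
    cluster: foo_service
    retry_policy:
      retry_on: "reset,connect-failure"
      num_retries: 2
```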

Hubble outputs

With kube proxy

Screenshot 2024-07-17 at 18 29 29

Without kube proxy

Screenshot 2024-07-17 at 18 53 29

How can we reproduce the issue?

  1. Run `cilium install --version 1.14.13` on an AKS cluster with BYOCNI.
  2. Deploy the following two YAML files.
envoy.yaml
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: apim
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-config
  namespace: apim
data:
  envoy.yaml: |
    static_resources:
      listeners:
      - address:
          socket_address:
            address: 0.0.0.0
            port_value: 8080
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              route_config:
                name: local_route
                virtual_hosts:
                - name: backend
                  domains:
                  - "*"
                  routes:
                  - match:
                      prefix: "/"
                    route:
                      cluster: foo_service
              http_filters:
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      clusters:
      - name: foo_service
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: foo_service
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: foo.bar.svc.cluster.local
                    port_value: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy-proxy
  namespace: apim
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy-proxy
  template:
    metadata:
      labels:
        app: envoy-proxy
    spec:
      containers:
      - name: envoy
        image: envoyproxy/envoy:v1.24.7
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: envoy-config
          mountPath: /etc/envoy
      volumes:
      - name: envoy-config
        configMap:
          name: envoy-config
---
apiVersion: v1
kind: Service
metadata:
  name: envoy-proxy
  namespace: apim
spec:
  selector:
    app: envoy-proxy
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: envoy-ingress
  namespace: apim
spec:
  ingressClassName: nginx
  rules:
  - host: envoy.cilium.internal
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: envoy-proxy
            port:
              number: 8080
```
workload.yaml
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: bar
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
  namespace: bar
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
      annotations:
        policy.cilium.io/proxy-visibility: ","
    spec:
      containers:
      - name: debug-service
        image: ghcr.io/isala404/toolbox/debug-service:latest
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: foo
  namespace: bar
spec:
  selector:
    app: foo
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
```
  3. Deploy an ingress controller.
  4. Run `curl LB-IP/healthz -H "Host: envoy.cilium.internal"`.
  5. Run `kubectl delete pods --all -n bar`.
  6. Wait ~20 seconds and rerun the curl.

Cilium Version

1.14.13

Kernel Version

5.15.0-1067-azure

Kubernetes Version

v1.28.10

Regression

Sysdump

With Kube Proxy

cilium-sysdump-20240717-182934.zip

Without Kube Proxy

cilium-sysdump-20240717-185333.zip

Relevant log output

No response

Anything else?

The issue occurs regardless of whether WireGuard encryption is enabled.


isala404 commented 1 month ago

I ran some additional tests to fully understand the behavior we're observing. One of these tests chained two stock Envoy instances to see how they interact. From this test, I learned that Envoy is not designed to propagate an upstream FIN to the downstream connection. Instead, when a new request arrives over an already established downstream connection but the connection to the upstream service has been terminated, Envoy opens a new connection to the upstream backend and forwards the request over it.

Please review the following Hubble log from the two chained Envoy servers:

hubble log
```
~> hubble observe -n default -f --denylist '{"destination_label":["k8s:k8s-app=kube-dns"]}' --denylist '{"source_label":["k8s:k8s-app=kube-dns"]}' --since 1s
Jul 22 12:23:26.904: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: SYN)
Jul 22 12:23:26.904: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) <- default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Jul 22 12:23:26.904: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:23:26.904: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:26.904: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) -> default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: SYN)
Jul 22 12:23:26.904: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) <- default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Jul 22 12:23:26.905: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) -> default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:23:26.905: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) -> default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:26.906: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: SYN)
Jul 22 12:23:26.906: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) <- default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Jul 22 12:23:26.906: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:23:26.906: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:26.926: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) <- default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:26.926: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) <- default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:26.932: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) <- default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:31.926: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) <- default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Jul 22 12:23:31.927: default/envoy-proxy-2-86d6b5f47c-fv5g6:56366 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Jul 22 12:23:49.260: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: SYN)
Jul 22 12:23:49.260: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) <- default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Jul 22 12:23:49.260: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:23:49.260: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:49.261: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) -> default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:49.261: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: SYN)
Jul 22 12:23:49.261: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) <- default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Jul 22 12:23:49.262: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:23:49.262: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:49.266: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) <- default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:49.266: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) <- default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:49.266: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) <- default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:23:54.268: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) <- default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Jul 22 12:23:54.268: default/envoy-proxy-2-86d6b5f47c-fv5g6:45852 (ID:8738) -> default/echo-service-6d6ddfb466-96h7c:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
# After running kubectl delete pod
Jul 22 12:24:26.904: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Jul 22 12:24:26.905: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) <- default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Jul 22 12:24:26.905: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:32810 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:24:46.752: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) -> default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:24:46.752: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) -> default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:24:46.753: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) -> default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: SYN)
Jul 22 12:24:46.753: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) <- default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Jul 22 12:24:46.753: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) -> default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 22 12:24:46.753: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) -> default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:24:46.773: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) <- default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:24:46.773: default/envoy-proxy-1-685ccc6b67-c5pdq:52522 (ID:346) <- default/envoy-proxy-2-86d6b5f47c-fv5g6:8000 (ID:8738) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:24:46.779: ingress-nginx/ingress-nginx-controller-597dc6d68-lbc74:57326 (ID:12699) <- default/envoy-proxy-1-685ccc6b67-c5pdq:8001 (ID:346) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Jul 22 12:24:51.772: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) <- default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Jul 22 12:24:51.772: default/envoy-proxy-2-86d6b5f47c-fv5g6:51646 (ID:8738) -> default/echo-service-6d6ddfb466-8t67w:8080 (ID:55500) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
```

As the logs show, Envoy-1 establishes a connection to Envoy-2 from source port 52522. After the echo-service pod is deleted, Envoy-1 keeps reusing that same connection (port 52522), and Envoy-2 transparently opens a fresh upstream connection to the new pod instead of failing the request.

deploy.yaml
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo-service
  template:
    metadata:
      labels:
        app: echo-service
    spec:
      containers:
      - name: echo-service
        image: ghcr.io/isala404/toolbox/debug-service:latest
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
---
apiVersion: v1
kind: Service
metadata:
  name: echo-service
spec:
  selector:
    app: echo-service
  ports:
  - port: 8080
    targetPort: 8080
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-config-2
data:
  envoy.yaml: |
    static_resources:
      listeners:
      - address:
          socket_address:
            address: 0.0.0.0
            port_value: 8000
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              route_config:
                name: local_route
                virtual_hosts:
                - name: backend
                  domains:
                  - "*"
                  routes:
                  - match:
                      prefix: "/"
                    route:
                      cluster: echo_service
              http_filters:
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      clusters:
      - name: echo_service
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: echo_service
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: echo-service
                    port_value: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy-proxy-2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy-proxy-2
  template:
    metadata:
      labels:
        app: envoy-proxy-2
    spec:
      containers:
      - name: envoy-proxy
        image: envoyproxy/envoy:v1.28.4 # Cilium envoy version
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: envoy-config
          mountPath: /etc/envoy
      volumes:
      - name: envoy-config
        configMap:
          name: envoy-config-2
---
apiVersion: v1
kind: Service
metadata:
  name: envoy-proxy-2
spec:
  selector:
    app: envoy-proxy-2
  ports:
  - port: 8000
    targetPort: 8000
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-config-1
data:
  envoy.yaml: |
    static_resources:
      listeners:
      - address:
          socket_address:
            address: 0.0.0.0
            port_value: 8001
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              route_config:
                name: local_route
                virtual_hosts:
                - name: backend
                  domains:
                  - "*"
                  routes:
                  - match:
                      prefix: "/"
                    route:
                      cluster: envoy_proxy_2
              http_filters:
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      clusters:
      - name: envoy_proxy_2
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: envoy_proxy_2
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: envoy-proxy-2
                    port_value: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy-proxy-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy-proxy-1
  template:
    metadata:
      labels:
        app: envoy-proxy-1
    spec:
      containers:
      - name: envoy-proxy
        image: envoyproxy/envoy:v1.24.7 # APIM envoy version
        ports:
        - containerPort: 8001
        volumeMounts:
        - name: envoy-config
          mountPath: /etc/envoy
      volumes:
      - name: envoy-config
        configMap:
          name: envoy-config-1
---
apiVersion: v1
kind: Service
metadata:
  name: envoy-proxy-1
spec:
  selector:
    app: envoy-proxy-1
  ports:
  - port: 8001
    targetPort: 8001
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: envoy-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: your-domain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: envoy-proxy-1
            port:
              number: 8001
```