linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

Possible connection leak due to linkerd #9724

Closed Anthony-Bible closed 1 year ago

Anthony-Bible commented 1 year ago

What is the issue?

One set of our pods has a connection leak when doing pod-to-pod communication within the same StatefulSet. This leaves connections open until around 2k concurrent connections, at which point linkerd-proxy crashes due to memory limits. If you want to look at it yourself, see the details below.

How can it be reproduced?

This appears to be reproducible in StatefulSets when pods communicate with each other directly over HTTP.
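
As a rough illustration of the traffic pattern (the names below are placeholders, not our actual manifests), each replica calls a sibling pod in the same StatefulSet directly over plain HTTP via its headless-service hostname:

    // Minimal sketch of the pod-to-pod call pattern (placeholder names only):
    // each replica calls a sibling pod in the same StatefulSet over plain HTTP
    // via the headless service DNS name, with both pods meshed by Linkerd.
    package main

    import (
        "fmt"
        "io"
        "net/http"
    )

    func main() {
        // Stable DNS name of a sibling pod behind the headless service.
        peer := "app-1.app-headless.default.svc.cluster.local:8080"

        resp, err := http.Get(fmt.Sprintf("http://%s/status", peer))
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        io.Copy(io.Discard, resp.Body)
    }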

Logs, error output, etc

Please see screenshots below

output of linkerd check -o short

╭─◦ ~/neo_new/neo ‹system› ‹release/6.88.x*› ‹gcp: dev› ‹gke: dev-1› ‹kube: dev/›
╰─◦ linkerd check -o short
Linkerd core checks
===================

linkerd-version
---------------
‼ cli is up-to-date
    is running version 2.11.1 but the latest stable version is 2.12.2
    see https://linkerd.io/2.11/checks/#l5d-version-cli for hints

control-plane-version
---------------------
‼ control plane is up-to-date
    is running version 2.11.1 but the latest stable version is 2.12.2
    see https://linkerd.io/2.11/checks/#l5d-version-control for hints

linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
    some proxies are not running the current version:
    * linkerd-destination-f769d9859-dcj65 (stable-2.11.1)
    * linkerd-destination-f769d9859-j8ffc (stable-2.11.1)
    * linkerd-destination-f769d9859-z6wqm (stable-2.11.1)
    * linkerd-identity-9dffcdcd5-48rxj (stable-2.11.1)
    * linkerd-identity-9dffcdcd5-ksxrg (stable-2.11.1)
    * linkerd-identity-9dffcdcd5-w5gzl (stable-2.11.1)
    * linkerd-proxy-injector-65bbcd9f4f-5mbrq (stable-2.11.1)
    * linkerd-proxy-injector-65bbcd9f4f-5qxp9 (stable-2.11.1)
    * linkerd-proxy-injector-65bbcd9f4f-w6mff (stable-2.11.1)
    see https://linkerd.io/2.11/checks/#l5d-cp-proxy-version for hints

Status check results are √

Linkerd extensions checks
=========================

linkerd-viz
-----------
‼ viz extension proxies are up-to-date
    some proxies are not running the current version:
    * grafana-7c4bc49848-8d2q6 (stable-2.11.1)
    * metrics-api-558f7995cb-fphb9 (stable-2.11.1)
    * prometheus-74c888dd96-ll5wk (stable-2.11.1)
    * tap-7d5fbc784b-47r49 (stable-2.11.1)
    * tap-injector-64876f6f9f-r5fj7 (stable-2.11.1)
    * web-79984fd46c-2jnh9 (stable-2.11.1)
    see https://linkerd.io/2.11/checks/#l5d-viz-proxy-cp-versi

### Environment
Kubernetes version:

╭─◦ ╰─◦ k version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.12", GitCommit:"696a9fdd2a58340e61e0d815c5769d266fca0802", GitTreeState:"clean", BuildDate:"2022-04-13T19:07:00Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12-gke.1200", GitCommit:"26604476cf3bcb64bec486e4b5880a5773a52e75", GitTreeState:"clean", BuildDate:"2022-07-29T09:23:06Z", GoVersion:"go1.16.15b7", Compiler:"gc", Platform:"linux/amd64"}

Environment:
GKE

linkerd version:

╭─◦ ╰─◦ linkerd version
Client version: stable-2.11.1
Server version: stable-2.11.1



### Additional context

This appears to have been introduced in 2.11; on 2.10, TCP connections stay consistently low.

2.11:
![connectionleak](https://user-images.githubusercontent.com/11566478/198745187-d347b8a6-c011-4fa4-b7c0-989fb36e97a7.png)

2.10:
![Screen Shot 2022-10-28 at 4 34 47 PM](https://user-images.githubusercontent.com/11566478/198745930-3ecb405f-a000-4d74-9b6a-fd33c33052b6.png)

### Would you like to work on fixing this bug?

_No response_
dwilliams782 commented 1 year ago

We've seen this with several of our applications as well, with no clear view of why.

A random example from today where the proxy OOM'd (250MB limit) for a newly onboarded service:

image

sum(tcp_open_connections{direction="inbound", peer="src",namespace="$namespace", pod=~"$pod", tls!="true", target_ip!="0.0.0.0"}) by (tls)

I raised a discussion on this topic a while back here. Interestingly, we also run GKE 1.22.12.

My understanding (which could well be wrong) is that there are 4 stages of connections:

  1. external to local proxy
  2. local proxy to local app
  3. local app to local proxy
  4. local proxy to external

image

We typically see connection leaks either from external to the local proxy, or from the local app to the local proxy. I've never gotten any further in working out why, as we remove the affected applications from the mesh. I wonder if GKE is the culprit here; we began our Linkerd journey with a 2.12 edge release, so I'm unable to comment on 2.10 behaviour.

adleong commented 1 year ago

Hi @Anthony-Bible. We'd definitely like to get to the bottom of this. You mentioned that this behavior is reproducible in statefulsets; can you provide steps that we can use to reproduce the connection leak so that we can investigate?

dwilliams782 commented 1 year ago

@adleong what can we do to help diagnose this, other than providing manifests? We see this a lot, but it seems isolated to Java applications only, and I'm not sure how to reproduce it outside of our GKE cluster.

I have been digging into one of our applications showing this behaviour, running a container that just spits out tcpdump to stdout. An example request without the application being meshed:

image

The connection is alive for ~200-300ms.

Once I enable Linkerd, the connection remains open indefinitely (I deleted the pod at 16:12, which closed the connections), with what looks to me like the proxy sending keepalive packets to the requesting pod every ~60 seconds:

image

And these are the debug logs showing this connection (note: nothing happens between 16:07 and 16:12, when the pod terminated):

2022-11-11 16:07:35 | [  1265.687714s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_app_inbound::accept: Accepted policy=ServerPolicy { protocol: Detect { http: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: Some(Prefix("/")), headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "default" }, authorizations: [Authorization { networks: [Network { net: 0.0.0.0/0, except: [] }, Network { net: ::/0, except: [] }], authentication: Unauthenticated, meta: Default { name: "all-unauthenticated" } }, Authorization { networks: [Network { net: 127.0.0.1/32, except: [] }, Network { net: ::1/128, except: [] }], authentication: Unauthenticated, meta: Default { name: "localhost" } }], filters: [] } }] }, Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: Some(Exact("/rest/actuator/health")), headers: [], query_params: [], method: Some(GET) }], policy: RoutePolicy { meta: Default { name: "probe" }, authorizations: [Authorization { networks: [Network { net: 0.0.0.0/0, except: [] }], authentication: Unauthenticated, meta: Default { name: "probe" } }, Authorization { networks: [Network { net: 0.0.0.0/0, except: [] }, Network { net: ::/0, except: [] }], authentication: Unauthenticated, meta: Default { name: "all-unauthenticated" } }, Authorization { networks: [Network { net: 127.0.0.1/32, except: [] }, Network { net: ::1/128, except: [] }], authentication: Unauthenticated, meta: Default { name: "localhost" } }], filters: [] } }] }], timeout: 10s, tcp_authorizations: [Authorization { networks: [Network { net: 0.0.0.0/0, except: [] }, Network { net: ::/0, except: [] }], authentication: Unauthenticated, meta: Default { name: "all-unauthenticated" } }, Authorization { networks: [Network { net: 127.0.0.1/32, except: [] }, Network { net: ::1/128, except: [] }], authentication: Unauthenticated, meta: Default { name: "localhost" } }] }, meta: Default { name: "all-unauthenticated" } }

2022-11-11 16:07:35 | [  1265.687929s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_tls::server: Peeked bytes from TCP stream sz=0

2022-11-11 16:07:35 | [  1265.687939s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_tls::server: Attempting to buffer TLS ClientHello after incomplete peek

2022-11-11 16:07:35 | [  1265.687946s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_tls::server: Reading bytes from TCP stream buf.capacity=8192

2022-11-11 16:07:35 | [  1265.687953s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_tls::server: Read bytes from TCP stream buf.len=788

2022-11-11 16:07:35 | [  1265.687970s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_detect: DetectResult protocol=Some(Http1) elapsed=6.571µs

2022-11-11 16:07:35 | [  1265.687985s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_proxy_http::server: Creating HTTP service version=Http1

2022-11-11 16:07:35 | [  1265.688002s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_proxy_http::server: Handling as HTTP version=Http1

2022-11-11 16:07:35 | [  1265.688046s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: hyper::proto::h1::io: parsed 17 headers

2022-11-11 16:07:35 | [  1265.688056s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: hyper::proto::h1::conn: incoming body is content-length (5546 bytes)

2022-11-11 16:07:35 | [  1265.688085s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}: linkerd_app_inbound::policy::http: Request authorized server.group= server.kind=default server.name=all-unauthenticated route.group= route.kind=default route.name=default authz.group= authz.kind=default authz.name=all-unauthenticated client.tls=None(NoClientHello) client.ip=10.225.12.14

2022-11-11 16:07:35 | [  1265.688160s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}:profile: linkerd_service_profiles::http::proxy: Updating HTTP routes routes=0

2022-11-11 16:07:35 | [  1265.688171s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}: tower::buffer::worker: service.ready=true processing request

2022-11-11 16:07:35 | [  1265.688181s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}:profile:http1: linkerd_proxy_http::client: method=POST uri=<removed>:8080/CollectorService version=HTTP/1.1

2022-11-11 16:07:35 | [  1265.688185s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}:profile:http1: linkerd_proxy_http::client: headers={"accept-encoding": "gzip", "accept": "text/xml, text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2", "soapaction": "\"BookingsCreate\"", "content-type": "text/xml; charset=utf-8", "cache-control": "no-cache", "pragma": "no-cache", "user-agent": "Java/1.8.0_345", "x-cloud-trace-context": "<removed>", "via": "1.1 google", "x-forwarded-for": "<removed>", "x-forwarded-proto": "https,https", "content-length": "5546", "forwarded": "<removed>", "x-forwarded-port": "443", "x-forwarded-host": "<removed>", "host": "<removed>", "traceparent": "<removed>"}

 2022-11-11 16:07:35 | [  1265.688244s] DEBUG ThreadId(02) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: hyper::proto::h1::conn: incoming body completed

2022-11-11 16:07:35 | [  1265.688293s] DEBUG ThreadId(02) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}:profile:http1: linkerd_proxy_transport::connect: Connecting server.addr=10.225.1.52:8080

2022-11-11 16:07:35 | [  1265.688482s] DEBUG ThreadId(02) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}:profile:http1: linkerd_proxy_transport::connect: Connected local.addr=10.225.1.52:51412 keepalive=None

2022-11-11 16:07:35 | [  1265.688501s] DEBUG ThreadId(02) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}:http{v=1.x}:http{client.addr=10.225.12.14:38364 client.id="-" timestamp=2022-11-11T16:07:35.138025772Z method="POST" uri=/CollectorService version=HTTP/1.1 trace_id="" request_bytes="5546" user_agent="Java/1.8.0_345" host="<removed>:8080"}:http1{name=<removed>:8080}:profile:http1: linkerd_transport_metrics::client: client connection open

2022-11-11 16:07:39 | [  1270.090523s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: hyper::proto::h1::io: flushed 979 bytes

2022-11-11 16:12:01 | [  1532.360750s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}:server{port=8080}: linkerd_proxy_http::server: The process is shutting down the connection

2022-11-11 16:12:01 | [  1532.360807s] DEBUG ThreadId(03) inbound:accept{client.addr=10.225.12.14:38364}: linkerd_app_core::serve: Connection closed

Not every request to the pod results in an open connection, but every request from a particular Java app does, and only when Linkerd is enabled. I'm particularly interested in the keepalive=None in the debug logs; is there anything you can share around this?

I would love to help get to the bottom of this, as it affects several of our applications. If there are additional logs I can provide, or if you want me to run a modified version of the proxy for extra logging, I'm happy to help. I'm just not sure how to repro outside of our cluster!

dwilliams782 commented 1 year ago

@adleong should Linkerd proxy send FIN packets back to the app? In our test app with a Java Netty client, we see a FIN back from the server, but this FIN isn't propagated to the application, so the connection provider doesn't see a closed event and the connection hangs.

adleong commented 1 year ago

should Linkerd proxy send FIN packets back to the app?

Not necessarily. Linkerd treats both connections (client to Linkerd and Linkerd to server) as separate and reuses connections when possible.

Based on what you're describing, it sounds like what might be happening is:

  1. Client opens a connection to Linkerd
  2. Linkerd opens a connection to the server
  3. Server closes the connection from Linkerd
  4. Client still has an open connection to Linkerd, but opens a new one
  5. As this repeats, it causes an accumulation of open connections between the client and Linkerd

Does this seem consistent with what you're seeing? If so, it strikes me as unusual behavior on the part of the client. I would expect the client to either re-use an existing open connection or to close a connection it was no longer interested in using.
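
For illustration only, here is a minimal Go sketch of a client pattern that produces exactly this accumulation (the service name is hypothetical and the code is not taken from any app in this thread): each request builds its own Transport, so its keep-alive connection to the proxy sits in a private idle pool that is never reused and never explicitly closed.

    // Hypothetical sketch of connection accumulation: a fresh Transport per
    // request means each keep-alive connection to the proxy is parked in an
    // idle pool that no later request can reuse.
    package main

    import (
        "io"
        "net/http"
        "time"
    )

    func main() {
        for i := 0; i < 100; i++ {
            t := &http.Transport{IdleConnTimeout: 15 * time.Minute}
            c := &http.Client{Transport: t, Timeout: time.Minute}

            resp, err := c.Get("http://server.example.svc.cluster.local:8080/work")
            if err != nil {
                continue
            }
            io.Copy(io.Discard, resp.Body) // drained, so the conn returns to this Transport's idle pool
            resp.Body.Close()

            // Missing t.CloseIdleConnections(): one open connection to the
            // proxy is left behind per iteration until the idle timeout fires.
        }
    }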

dwilliams782 commented 1 year ago

Yes, that is consistent with what we are seeing. I had been under the impression that each request spawns two linked connections (#1 and #2), not that #1 can be kept alive separately.

it strikes me as unusual behavior on the part of the client

This is the default behaviour seen when using reactor.netty in Java. We see log entries like:

15:50:29.080 [reactor-http-epoll-4] DEBUG reactor.netty.resources.PooledConnectionProvider - [ac2a3620, L:/10.225.28.29:58770 - R:content-repository.tarantool.svc.cluster.local/10.224.29.105:80] Channel cleaned, now: 0 active connections, 1 inactive connections and 0 pending acquire requests.

Here the connection is marked inactive but persists until it receives a FIN/RST, which doesn't happen until the proxy shuts down. We've found a potential workaround in setting evictionInterval:

static final Duration EVICT_IN_BACKGROUND_DISABLED       = Duration.ZERO;

Duration evictionInterval       = EVICT_IN_BACKGROUND_DISABLED;

        /**
         * Set the options to use for configuring {@link ConnectionProvider} background eviction.
         * When a background eviction is enabled, the connection pool is regularly checked for connections,
         * that are applicable for removal.
         * Default to {@link #EVICT_IN_BACKGROUND_DISABLED} - the background eviction is disabled.
         * Providing an {@code evictionInterval} of {@link Duration#ZERO zero} means the background eviction is disabled.
         * <p><strong>Note:</strong> This configuration is not applicable for {@link reactor.netty.tcp.TcpClient}.
         * A TCP connection is always closed and never returned to the pool.
         *
         * @param evictionInterval specifies the interval to be used for checking the connection pool, (resolution: ns)
         * @return {@literal this}
         */

but this requires rebuilding all of our apps with a custom connection pool, which is no small task. It also doesn't directly address the problem, as it just culls all active connections after a set period, so it's more brute force than elegant. I will also look into why these connections aren't being reused.

dwilliams782 commented 1 year ago

@Anthony-Bible is your app also using Java and Netty?

Anthony-Bible commented 1 year ago

Unfortunately not; ours is a Python app we package up with pex.

Anthony-Bible commented 1 year ago

Sorry, correction: it's Golang, so no wrappers. And FWIW, it's connections from pod-to-pod communication within the same StatefulSet.

SIGUSR2 commented 1 year ago

To add to @Anthony-Bible's case:

Our app relies on the standard Go net/http package:

Client init:

        client: http.Client{
            Transport: &http.Transport{
                MaxIdleConns:          10,
                IdleConnTimeout:       15 * time.Minute,
                ResponseHeaderTimeout: 10 * time.Second,
            },
            Timeout: 1 * time.Minute,
        },

Request (sync):

    req := &http.Request{
        Method: "DELETE",
        URL:    parsedURL,
        Header: map[string][]string{
            "X-Copy": {"true"},
        },
    }
        [...]
    resp, err := client.Do(req)
        [...]
        defer resp.Body.Close()

Request (async):

    req := &http.Request{
        Method:        "PUT",
        URL:           parsedURL,
        ContentLength: contentLength,
        Header: map[string][]string{
            "Content-Type":        {"application/octet-stream"},
            "X-Copy": {"true"},
        },
        TransferEncoding: []string{},

        Body:  pipeReader,
    }
        [...]
    go func(req *http.Request) {
        res, err := client.Do(req)

        if res != nil {
            if err != nil {
                panic(err)
            }
                        [...]
            defer res.Body.Close()
        }
    }(req)
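
For reference, the general pattern Go's Transport needs in order to put a keep-alive connection back into its idle pool is to read the response body to EOF and then close it; a generic sketch (not our production code):

    // Generic sketch: Go's Transport can only reuse a keep-alive connection if
    // the response body is read to EOF and then closed.
    package httpclient

    import (
        "io"
        "net/http"
    )

    func doAndDrain(client *http.Client, req *http.Request) error {
        resp, err := client.Do(req)
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        // Read to EOF so the underlying TCP connection can go back to the idle pool.
        _, err = io.Copy(io.Discard, resp.Body)
        return err
    }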

We tested config.linkerd.io/opaque-ports: "8080" (main listening port of the app); in our case, things are looking much better:

Inbound: image

Outbound: image

This was tested in a very small environment, but the same tests executed on Friday generated the majority of the connections seen on the left of the graphs.

dwilliams782 commented 1 year ago

After months of agonising, we finally found the culprit for our issue with our Java apps. We use Reactor Netty and otel-agent v1.9.0, and we found an open bug in otel-agent that causes a new connection pool to be opened for each request and never closed. We've upgraded otel and the leak has gone away: https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/4862

Sadly this is not relevant to your issue, but I wanted to follow up in case anyone stumbles upon this thread and sees my previous responses.

Anthony-Bible commented 1 year ago

Just wanted to clarify: @SIGUSR2 is my coworker, so that's the solution we're currently using as a workaround.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.