linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io

linkerd-destination pod memory usage spiking when other destination pods restart #9813

Closed rlnrln closed 1 year ago

rlnrln commented 1 year ago

What is the issue?

A little over a month ago, we ran over the memory limit for linkerd-destination and all three pods in the deployment began crashlooping, with bad results for the cluster as a whole. We increased the memory limit from the default to 500MiB, then (after another alert) to 750MiB, and yesterday to 1GiB.
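
For reference, a rough sketch of how such a bump can be applied, assuming a Helm-managed install where the chart exposes a destinationResources value (the exact key name may vary between chart versions); the kubectl variant is the ad-hoc equivalent, which the next upgrade will overwrite:

```yaml
# Helm values override (assumed key name; check your chart version's values.yaml)
destinationResources:
  memory:
    request: 512Mi
    limit: 1Gi
```

```sh
# Ad-hoc alternative: bump the running deployment directly
kubectl -n linkerd set resources deploy/linkerd-destination \
  -c destination --requests=memory=512Mi --limits=memory=1Gi
```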

[Image: memory usage over time, one line per destination container]

Each line in the image is a separate instance of the destination container in the linkerd-destination deployment. Until October 26 we ran 6 instances; we then reduced the count to 3, which slightly increased the memory usage per pod, as expected.

On Nov 7 and Nov 10, something triggered eviction of two of the three pods. In both cases the remaining pod's destination container used a lot of additional memory, which was not reclaimed even after the other two pods came back up.
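
For completeness, the graph is plain per-container working-set memory; assuming the standard cAdvisor metrics are scraped into Prometheus, a query along these lines reproduces it:

```promql
container_memory_working_set_bytes{
  namespace="linkerd",
  pod=~"linkerd-destination-.*",
  container="destination"
}
```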

I suspect the trigger in our case is related to the Cluster Autoscaler. Basically, one pod is (was) scheduled on a node that "never" goes down, so it has been running for >30 days. The two other pods get evicted occasionally, and while they are down, memory usage on the remaining pod increases by a lot and never goes back down.
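
One idea for confirming that theory (just a sketch, not something we have rolled out) is to mark the destination pods as not safe to evict, so the Cluster Autoscaler stops draining them during scale-down:

```yaml
# Pod template annotation on the linkerd-destination deployment
spec:
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```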

Technical details:

Some workarounds we've considered but not yet implemented:

I've also found it hard to find information about working around the problem. I started looking for recommendations on CPU/memory resource allocations and found... none, not even in the Linkerd Production Runbook.

I then started looking for information on what drives memory usage in linkerd-destination; that information was also rather limited. I have no idea whether it scales with:

I also have no idea why memory usage should increase when the number of linkerd-destination replicas is reduced. My expectation is that all pods hold the same data, so that any one of them can answer any query, but that obviously isn't the case.
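
Lacking documentation, the closest I have come to answering these questions myself is watching the destination container's own Go runtime metrics while pods churn; a rough sketch, assuming the admin port is 9996 as in our install (confirm against the container spec):

```sh
# Forward the destination container's admin port and sample its Go runtime metrics
# (port 9996 assumed; check the deployment's container ports)
kubectl -n linkerd port-forward deploy/linkerd-destination 9996:9996 &
curl -s http://localhost:9996/metrics \
  | grep -E '^(go_memstats_heap_inuse_bytes|go_memstats_heap_objects|go_goroutines)'
```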

How can it be reproduced?

Logs, error output, etc

[Image: the same per-container memory usage graph described above]

Output of linkerd check -o short:

Note: linkerd-multicluster is running fine, but it's not in the default namespace and I don't remember the command line flag for adding it off the top of my head.

Linkerd core checks

linkerd-version
‼ cli is up-to-date
    is running version 2.11.4 but the latest stable version is 2.12.2
    see https://linkerd.io/2.11/checks/#l5d-version-cli for hints

control-plane-version
‼ control plane is up-to-date
    is running version 2.11.4 but the latest stable version is 2.12.2
    see https://linkerd.io/2.11/checks/#l5d-version-control for hints

linkerd-control-plane-proxy
‼ control plane proxies are up-to-date
    some proxies are not running the current version:

Linkerd extensions checks

linkerd-multicluster
× remote cluster access credentials are valid

linkerd-viz
‼ viz extension proxies are up-to-date
    some proxies are not running the current version:

Status check results are ×

Environment

Possible solution

Some workarounds are suggested above.

Other than that, I'm looking for more predictable behaviour on pod restart, and a way to predict in advance how much memory usage will grow.
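
In the absence of sizing guidance, the closest substitute I can think of is alerting well before the limit is reached; a sketch of a Prometheus rule, assuming both cAdvisor and kube-state-metrics are available:

```yaml
groups:
  - name: linkerd-destination-memory
    rules:
      - alert: LinkerdDestinationMemoryHigh
        expr: |
          max by (pod) (container_memory_working_set_bytes{namespace="linkerd", container="destination"})
            /
          max by (pod) (kube_pod_container_resource_limits{namespace="linkerd", container="destination", resource="memory"})
            > 0.8
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.pod }} destination container is above 80% of its memory limit"
```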

Additional context

No response

Would you like to work on fixing this bug?

No response

kleimkuhler commented 1 year ago

There have been several changes merged recently that address destination controller memory leaks that could be caused by high Pod churn: #10013 and #10201. I'd encourage you to try the latest edge release; these fixes will also probably be included in a 2.12 patch release. I'm going to close for now since there has been little activity, but please reopen if you still experience these issues on more recent version of Linkerd.