Closed: rlnrln closed this issue 1 year ago
There have been several changes merged recently that address destination controller memory leaks that could be caused by high Pod churn: #10013 and #10201. I'd encourage you to try the latest edge release; these fixes will also probably be included in a 2.12 patch release. I'm going to close this for now since there has been little activity, but please reopen if you still experience these issues on a more recent version of Linkerd.
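For anyone landing here, a sketch of trying the latest edge release as suggested above (the install-script URL and upgrade flow are per the standard Linkerd docs; verify against the current documentation before running):

```shell
# Install the latest edge-channel CLI via the official install script
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install-edge | sh
export PATH=$HOME/.linkerd2/bin:$PATH
linkerd version --client

# Render the upgraded control-plane manifests and apply them
linkerd upgrade | kubectl apply -f -

# Confirm the control plane is healthy afterwards
linkerd check
```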
What is the issue?
A little over a month ago, we ran over the memory limit for linkerd-destination, and all three pods in the deployment began crashlooping, with bad results for the cluster as a whole. We increased the memory limit to 500MiB, then, after another alert, to 750MiB, and yesterday to 1GiB.
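For reference, raising the destination container's limit can be done through Helm values along these lines (a sketch: the `destinationResources` value name is an assumption based on the linkerd2 chart, so check `helm show values` for your chart version; the numbers shown are the ones from this report):

```shell
# Sketch: raise the destination container's memory limit via Helm values.
# NOTE: the destinationResources key is an assumption based on the linkerd2
# chart layout; verify with `helm show values linkerd/linkerd2`.
cat > destination-resources.yaml <<'EOF'
destinationResources:
  memory:
    request: 512Mi
    limit: 1Gi
EOF

helm upgrade linkerd linkerd/linkerd2 \
  --namespace linkerd \
  --reuse-values \
  -f destination-resources.yaml
```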
Each line in the image is a separate instance of the `destination` container in the `linkerd-destination` deployment. Until October 26, we ran 6 instances; on Oct 6 we reduced it to 3, which increased the memory usage per pod slightly, which was expected.

On Nov 7 and Nov 10, something happened that triggered eviction for two out of the three pods. In both cases this used up a lot of memory in the `destination` container, which wasn't reclaimed even when the other two pods came back up.

I suspect the trigger in our case is related to Cluster Autoscaling. Basically, one pod is (was) co-located on a node that "never" goes down, so it's been running for >30 days. The two other pods get evicted occasionally, and while they're down, memory on the remaining pod increases by a lot and never goes down.
Technical details:
Some workarounds we've considered but not yet implemented:
I've also found it hard to find information about working around the problem. I started looking for recommendations on cpu/memory resource allocations and found... none. Not even in the Linkerd Production Runbook.
I then started looking for information on what drives memory usage in linkerd-destination, which was also rather limited. I have no idea if it scales up with:
I also have no idea why memory usage should increase when the number of `linkerd-destination` replicas is reduced; my expectation is that all pods hold the same data, so that any one of them could answer any query, but that obviously isn't the case.

How can it be reproduced?
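A sketch of how one might try to reproduce the eviction pattern on a test cluster (the churn deployment and timings are illustrative; the pod label selector is the one Linkerd puts on control-plane pods):

```shell
# Reproduction sketch: evict 2 of 3 destination pods while churning workloads.
# Assumes a test cluster with Linkerd installed and 3 destination replicas.

# 1. Generate Pod churn so the destination controller has state to track
kubectl create deployment churn --image=nginx --replicas=20
for i in $(seq 1 30); do
  kubectl rollout restart deployment/churn
  sleep 30
done &

# 2. Delete two of the three destination pods, simulating node scale-down
kubectl -n linkerd get pods \
  -l linkerd.io/control-plane-component=destination \
  -o name | head -n 2 | xargs kubectl -n linkerd delete

# 3. Watch memory on the surviving pod; in this report it grew while the
#    others were down and was never reclaimed after they came back
kubectl -n linkerd top pod \
  -l linkerd.io/control-plane-component=destination --containers
```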
Logs, error output, etc
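To see where the memory is actually going, a Go heap profile from the destination container could help (a sketch: the admin port and the availability of the pprof endpoints are assumptions — check the container's admin-addr flag, and note that some versions gate pprof behind an `--enable-pprof` flag):

```shell
# Sketch: pull a heap profile from the destination controller's admin server.
# Port 9996 is an assumption; check the destination container spec.
kubectl -n linkerd port-forward deploy/linkerd-destination 9996:9996 &
PF_PID=$!
sleep 2

curl -s http://localhost:9996/debug/pprof/heap -o heap.pb.gz

# Inspect the top allocation sites
go tool pprof -top heap.pb.gz

kill "$PF_PID"
```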
Output of `linkerd check -o short`:
Note: linkerd-multicluster is running fine, but it's not in the default namespace and I don't remember the command line flag for adding it off the top of my head.
```
Linkerd core checks
===================

linkerd-version
---------------
‼ cli is up-to-date
    is running version 2.11.4 but the latest stable version is 2.12.2
    see https://linkerd.io/2.11/checks/#l5d-version-cli for hints

control-plane-version
---------------------
‼ control plane is up-to-date
    is running version 2.11.4 but the latest stable version is 2.12.2
    see https://linkerd.io/2.11/checks/#l5d-version-control for hints

linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
    some proxies are not running the current version:

Linkerd extensions checks
=========================

linkerd-multicluster
--------------------
× remote cluster access credentials are valid

linkerd-viz
-----------
‼ viz extension proxies are up-to-date
    some proxies are not running the current version:

Status check results are ×
```
Environment
Possible solution
Some workarounds are suggested above.
Other than that, I'm looking for more predictable behaviour on pod restart, and a way to predict memory growth ahead of time.
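As a stopgap until the leak itself is fixed, one blunt workaround is a scheduled restart so no single replica accumulates state for weeks (a sketch; run it from cron or a CronJob with suitable RBAC):

```shell
# Workaround sketch: periodically restart linkerd-destination so memory
# never accumulates indefinitely on a long-lived replica.
kubectl -n linkerd rollout restart deployment/linkerd-destination
kubectl -n linkerd rollout status deployment/linkerd-destination
```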
Additional context
No response
Would you like to work on fixing this bug?
No response