tektoncd / pipeline

A cloud-native Pipeline resource.
https://tekton.dev
Apache License 2.0

looks like Pipelines Controller is not freeing memory #7691

Open jhutar opened 6 months ago

jhutar commented 6 months ago

Expected Behavior

I would expect that, after some time under load, the Pipelines Controller's memory consumption would become constant, i.e. the Pipelines Controller would start freeing memory.

Actual Behavior

This is a memory graph of a Pipelines Controller processing 10k very simple PipelineRuns, each with a single Task that just prints "hello world" (Pipeline, PipelineRun).

These PRs were running from about 13:00 to 15:30. The script created new PRs so that at most 100 of them were pending/running at any time:

(image: Pipelines Controller memory usage graph)

Is this expected, or is this some sort of memory leak?

Steps to Reproduce the Problem

  1. Run 10k PipelineRuns and observe the Pipelines Controller's memory consumption over time.
  2. This was automated in this repo with the signing-tr-tekton-bigbang scenario.
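
For readers without access to the scenario, the load pattern described above (10k PRs in total, at most 100 pending/running at once) can be sketched roughly as below. This is not the actual script from the repo. `run_load`, `FakeCluster` and its methods are hypothetical names, and the in-memory fake stands in for real API calls, so treat it only as an illustration of the capping logic.

```python
class FakeCluster:
    """Hypothetical in-memory stand-in for a cluster (not a real client)."""

    def __init__(self, finish_per_tick=10):
        self.active = []      # names of pending/running runs
        self.finished = []    # names of completed runs (never deleted here)
        self.finish_per_tick = finish_per_tick

    def active_count(self):
        return len(self.active)

    def create(self, name):
        self.active.append(name)

    def tick(self):
        # Pretend some of the oldest active runs complete on every poll.
        done = self.active[:self.finish_per_tick]
        self.active = self.active[self.finish_per_tick:]
        self.finished.extend(done)


def run_load(total, max_active, cluster):
    """Create `total` runs while keeping at most `max_active` unfinished."""
    created = 0
    peak = 0
    while created < total:
        free = max_active - cluster.active_count()
        # Top up only as many runs as the cap currently allows.
        for _ in range(max(0, min(free, total - created))):
            cluster.create(f"pipelinerun-{created}")
            created += 1
        peak = max(peak, cluster.active_count())
        cluster.tick()  # a real script would sleep and poll the API instead
    return created, peak


if __name__ == "__main__":
    cluster = FakeCluster()
    created, peak = run_load(10_000, 100, cluster)
    print(created, peak, len(cluster.finished))
```

Note that finished runs stay in `finished` and are never removed, which mirrors the scenario in this report: completed PRs were left in the cluster.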

Additional Info

The cluster is already gone, but it was ROSA OpenShift 4.14.11 with 5 AWS EC2 m6a.2xlarge compute nodes.

Reported this together with https://github.com/tektoncd/chains/issues/1058

vdemeester commented 6 months ago

@jhutar are the PipelineRuns cleaned up (deleted) from the cluster in that scenario? If not, they will be "cached" by the informers and therefore kept in memory until they are deleted from the cluster (and the informers' cache is updated).
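
To make the caching behavior described here concrete, the following is a toy model only. It does not use or reproduce the real informer API. `ToyInformerCache` and `simulate` are hypothetical names, and the only property it illustrates is that a cached entry disappears when the object is deleted from the cluster, not when the run finishes.

```python
class ToyInformerCache:
    """Toy stand-in for an informer's local store (not the real API)."""

    def __init__(self):
        self._store = {}

    def on_add_or_update(self, name, obj):
        # Every object that exists in the cluster is mirrored locally,
        # whether it is still running or finished long ago.
        self._store[name] = obj

    def on_delete(self, name):
        # Only a delete event from the cluster removes the cached copy.
        self._store.pop(name, None)

    def __len__(self):
        return len(self._store)


def simulate(total_runs, delete_finished):
    """Add `total_runs` finished runs; optionally delete each one afterwards."""
    cache = ToyInformerCache()
    for i in range(total_runs):
        name = f"pipelinerun-{i}"
        cache.on_add_or_update(name, {"name": name, "status": "Succeeded"})
        if delete_finished:
            cache.on_delete(name)
    return len(cache)


if __name__ == "__main__":
    # Without deletion the cache grows with the number of runs.
    print(simulate(10_000, delete_finished=False))
    # With deletion after completion the cache stays small.
    print(simulate(10_000, delete_finished=True))
```

In this model the number of cached entries tracks the number of objects left in the cluster, which is consistent with memory growing for as long as completed PipelineRuns are not deleted.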

jhutar commented 5 months ago

Correct, they are not deleted. So this means the Pipelines Controller's memory usage will just grow with the number and size of PRs in the cluster. If there are excessively big PRs (maybe with a long script), this might become a problem even for a smaller number of PRs.

I understand this is a property of the underlying Go library (the informers you mentioned), but is there a way to drop the oldest records from the cache, or something similar? Just to have a way to keep memory usage flat.

vdemeester commented 5 months ago

> I understand this is a property of the underlying Go library (the informers you mentioned), but is there a way to drop the oldest records from the cache, or something similar? Just to have a way to keep memory usage flat.

Yes, we need to explore this. There might be ways to "optimize" or filter out some objects once they are done.
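
As one possible reading of "filter some objects once they are done", the sketch below shrinks a finished object to a few fields before it would be stored. It is an illustration under assumptions, not something the thread confirms: `slim_if_done`, `approx_size`, the field names and the status values are all hypothetical, and whether the controller could safely work without the dropped fields is not established here.

```python
import json

# Hypothetical set of terminal states for this toy model.
DONE_STATES = ("Succeeded", "Failed")


def slim_if_done(obj, keep=("name", "status")):
    """Return a reduced copy of a finished run; leave active runs untouched."""
    if obj.get("status") in DONE_STATES:
        return {key: obj[key] for key in keep if key in obj}
    return obj


def approx_size(obj):
    """Rough size proxy: length of the JSON encoding, not real memory use."""
    return len(json.dumps(obj))


if __name__ == "__main__":
    big_run = {
        "name": "pipelinerun-1",
        "status": "Succeeded",
        # Stand-in for a PR carrying a long script.
        "spec": {"script": "echo hello world\n" * 1000},
    }
    print(approx_size(big_run), approx_size(slim_if_done(big_run)))
```

With a filter like this the number of cached entries would still grow with the number of PRs, but each finished entry would cost much less, which addresses the "excessively big PRs" concern rather than the count itself.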