Some triggered Tekton jobs should have resource requests/limits

tektoncd / plumbing

This repo holds configuration for infrastructure used across the tektoncd org 🏗️

Apache License 2.0

Some triggered Tekton jobs should have resource requests/limits #1122

Open abayer opened 2 years ago

abayer commented 2 years ago

The ones I notice right now are the plumbing-image-build and pull-pipeline-kind-k8s-v1-21-e2e PR PipelineRuns, and the build-and-push-test-runner cronjob-triggered PipelineRun. I've seen the test-runner image builds cause OOMs on their nodes, and the plumbing-image-build one I'm looking at right now is using over 5 GB of memory. The pull-pipeline-kind-k8s-v1-21-e2e pods that I've seen have used between 2 and 4 GB of memory.

None of them (or any of the other Tekton PipelineRuns, for that matter) have any requests or limits configured, so they can end up on the same node as each other, or on a node with one of the other high-memory pods that are always running in the cluster (i.e., prometheus and kafka), and cause problems. Given that dogfooding is hardcoded to 5 n1-standard-4 nodes, each with ~13 GB of allocatable memory, it's pretty easy for just a few of the high-memory pods to end up on the same node and swamp it.
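To make the arithmetic concrete, here is a rough Python sketch of the problem. It is a simplification: it only models memory requests, while the real scheduler weighs more than that. The ~13 GB allocatable figure and the 2 to 5 GB usage range come from the numbers above. The specific pod mix, including the 3 GB figure for a long-running pod, and the helper names `fits_by_requests` and `actual_headroom` are assumptions made up for illustration.

```python
# Rough sketch: why pods without memory requests can pile onto one node.
# All numbers are in GB. The pod mix below is hypothetical, picked from the
# usage ranges mentioned in this issue; it is not a measurement.

NODE_ALLOCATABLE_GB = 13.0  # approximate allocatable memory on an n1-standard-4


def fits_by_requests(requests_gb, allocatable_gb=NODE_ALLOCATABLE_GB):
    """Simplified scheduler-style check: only declared requests are counted."""
    return sum(requests_gb) <= allocatable_gb


def actual_headroom(usage_gb, allocatable_gb=NODE_ALLOCATABLE_GB):
    """Memory left on the node given what the pods really use."""
    return allocatable_gb - sum(usage_gb)


# Hypothetical co-located pods: one image build, two e2e runs, and one
# long-running pod (its 3 GB figure is an assumption).
usage = [5.0, 4.0, 3.0, 3.0]

# With no requests configured, every pod asks for 0, so all of them "fit".
no_requests = [0.0] * len(usage)
print(fits_by_requests(no_requests))  # True
print(actual_headroom(usage))         # -2.0, i.e. the node is oversubscribed

# With requests set close to real usage, this pod mix no longer fits on one node.
print(fits_by_requests(usage))        # False
```

With requests set near observed usage, the fourth pod would be placed on a different node instead of pushing this one past its allocatable memory.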

tekton-robot commented 1 year ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot commented 1 year ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten with a justification. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot commented 1 year ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen with a justification. Mark the issue as fresh with /remove-lifecycle rotten with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot commented 1 year ago

@tekton-robot: Closing this issue.

In response to [this](https://github.com/tektoncd/plumbing/issues/1122#issuecomment-1383223352):

>Rotten issues close after 30d of inactivity.
>Reopen the issue with `/reopen` with a justification.
>Mark the issue as fresh with `/remove-lifecycle rotten` with a justification.
>If this issue should be exempted, mark the issue as frozen with `/lifecycle frozen` with a justification.
>
>/close
>
>Send feedback to [tektoncd/plumbing](https://github.com/tektoncd/plumbing).

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

afrittoli commented 1 year ago

/remove-lifecycle rotten

afrittoli commented 1 year ago

/lifecycle frozen