openshift / cluster-kube-apiserver-operator

The kube-apiserver operator installs and maintains the kube-apiserver on a cluster
Apache License 2.0

[release-4.15] OCPBUGS-33315: add SNO control plane high cpu usage alert #1673

Closed qJkee closed 6 months ago

qJkee commented 6 months ago

Create a separate SNO alert for high control plane CPU usage. The alert is also aware of workload partitioning and adjusts its threshold when workload partitioning is enabled.
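The threshold adjustment described above can be sketched roughly as follows. This is a minimal illustration, not the PR's alert rule: the function names, the two threshold values, and the idea of measuring usage as a fraction of a given CPU capacity are all assumptions, since the thread does not show the actual expression or numbers.

```python
# Hypothetical thresholds for illustration only; the PR's real values
# are not shown in this thread.
DEFAULT_THRESHOLD = 0.60      # assumed fraction of CPU capacity without partitioning
PARTITIONED_THRESHOLD = 0.85  # assumed fraction when workload partitioning is enabled


def cpu_threshold(workload_partitioning: bool) -> float:
    """Pick the alert threshold based on whether workload partitioning is on."""
    return PARTITIONED_THRESHOLD if workload_partitioning else DEFAULT_THRESHOLD


def should_alert(used_cores: float, capacity_cores: float,
                 workload_partitioning: bool) -> bool:
    """Return True when control plane CPU usage exceeds the selected threshold."""
    if capacity_cores <= 0:
        raise ValueError("capacity_cores must be positive")
    return used_cores / capacity_cores > cpu_threshold(workload_partitioning)
```

With these assumed values, the same usage ratio can trip the alert on a non-partitioned node while staying below the threshold on a partitioned one.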

(cherry picked from commit d92b46166259ef43918e9f7cd6bc62f09a806d1f)

openshift-ci[bot] commented 6 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: qJkee

Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/openshift/cluster-kube-apiserver-operator/blob/release-4.15/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

openshift-ci-robot commented 6 months ago

@qJkee: This pull request references Jira Issue OCPBUGS-33315, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/cluster-kube-apiserver-operator/pull/1673):

> Create separate SNO alert for control plane high cpu usage. This alert is also aware of workload partitioning and adjusts threshold if workload partitioning is enabled.
>
> (cherry picked from commit d92b46166259ef43918e9f7cd6bc62f09a806d1f)

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-kube-apiserver-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

qJkee commented 6 months ago

/jira refresh

openshift-ci-robot commented 6 months ago

@qJkee: This pull request references Jira Issue OCPBUGS-33315, which is valid. The bug has been moved to the POST state.

7 validation(s) were run on this bug:

* bug is open, matching expected state (open)
* bug target version (4.15.z) matches configured target version for branch (4.15.z)
* bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
* release note text is set and does not match the template
* dependent bug [Jira Issue OCPBUGS-22117](https://issues.redhat.com//browse/OCPBUGS-22117) is in the state Closed (Done), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
* dependent [Jira Issue OCPBUGS-22117](https://issues.redhat.com//browse/OCPBUGS-22117) targets the "4.16.0" version, which is one of the valid target versions: 4.16.0
* bug has dependents
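The kind of checks reported above can be sketched as a small validation function. This is only an illustration of the logic the bot's output implies, not the jira-lifecycle-plugin's code: the function name, the dictionary fields, and the failure messages are hypothetical, and the release-note check is omitted. The valid-state sets are copied from the output above.

```python
# Valid-state sets copied from the validation output above.
VALID_BUG_STATES = {"NEW", "ASSIGNED", "POST"}
VALID_DEPENDENT_STATES = {
    "VERIFIED", "RELEASE PENDING", "CLOSED (ERRATA)",
    "CLOSED (CURRENT RELEASE)", "CLOSED (DONE)", "CLOSED (DONE-ERRATA)",
}


def validate_bug(bug, branch_target, valid_dependent_targets):
    """Return a list of failure messages; an empty list means the bug is valid.

    `bug` is a hypothetical dict with keys: open, target_version, state,
    depends_on (a list of dicts with keys: state, target_version).
    """
    problems = []
    if not bug["open"]:
        problems.append("bug is not open")
    if bug["target_version"] != branch_target:
        problems.append("target version does not match the branch")
    if bug["state"] not in VALID_BUG_STATES:
        problems.append(f"state {bug['state']} is not a valid state")
    if not bug["depends_on"]:
        problems.append("bug has no dependents")
    for dep in bug["depends_on"]:
        # The output above shows "Closed (Done)", so compare case-insensitively.
        if dep["state"].upper() not in VALID_DEPENDENT_STATES:
            problems.append(f"dependent state {dep['state']} is not valid")
        if dep["target_version"] not in valid_dependent_targets:
            problems.append("dependent targets an unexpected version")
    return problems


# Example mirroring the values reported in this thread.
example_bug = {
    "open": True,
    "target_version": "4.15.z",
    "state": "ASSIGNED",
    "depends_on": [{"state": "Closed (Done)", "target_version": "4.16.0"}],
}
```

Calling `validate_bug(example_bug, "4.15.z", {"4.16.0"})` returns an empty list for the values shown in this thread; a bug in any state outside NEW, ASSIGNED, or POST would produce a failure message instead.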

Requesting review from QA contact: /cc @wangke19

In response to [this](https://github.com/openshift/cluster-kube-apiserver-operator/pull/1673#issuecomment-2096320422):

> /jira refresh

qJkee commented 6 months ago

/retest

openshift-ci[bot] commented 6 months ago

@qJkee: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-operator-disruptive-single-node | 5e0c76d4b5859e6eab4d426db1d4cdd164d719a0 | link | false | `/test e2e-aws-operator-disruptive-single-node` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
wangke19 commented 6 months ago

/hold Pre-merge testing.

wangke19 commented 6 months ago

The associated bug has been pre-merge tested and produced the expected results. For details, see https://issues.redhat.com/browse/OCPBUGS-33315?focusedId=24685806&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-24685806

wangke19 commented 6 months ago

/label qe-approved

openshift-ci-robot commented 6 months ago

@qJkee: This pull request references Jira Issue OCPBUGS-33315, which is valid.

7 validation(s) were run on this bug:

* bug is open, matching expected state (open)
* bug target version (4.15.z) matches configured target version for branch (4.15.z)
* bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
* release note text is set and does not match the template
* dependent bug [Jira Issue OCPBUGS-22117](https://issues.redhat.com//browse/OCPBUGS-22117) is in the state Closed (Done), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
* dependent [Jira Issue OCPBUGS-22117](https://issues.redhat.com//browse/OCPBUGS-22117) targets the "4.16.0" version, which is one of the valid target versions: 4.16.0
* bug has dependents

Requesting review from QA contact: /cc @wangke19

wangke19 commented 6 months ago

/unhold

wangke19 commented 6 months ago

/assign @sanchezl

wangke19 commented 6 months ago

/assign @p0lyn0mial

tkashem commented 6 months ago

/label backport-risk-assessed

wangke19 commented 6 months ago

/label cherry-pick-approved

vrutkovs commented 6 months ago

/hold

We should have spotted these issues in the 4.16 PR, but at least let's not break 4.15.

p0lyn0mial commented 6 months ago

/hold

until a new PR is merged (xref: https://github.com/openshift/cluster-kube-apiserver-operator/pull/1674)

openshift-ci-robot commented 6 months ago

@qJkee: This pull request references Jira Issue OCPBUGS-33315. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.
