openshift / machine-config-operator

Apache License 2.0

OCPBUGS-37850: Machine-config daemon ListPools panic during tech-preview CI runs #4533

Closed djoshy closed 3 weeks ago

djoshy commented 1 month ago

**- What I did**
This reorders the callback registration so that the lister object is not empty when the callback is called.

**- How to verify it**
There should not be any daemon panics in the CI runs listed in https://issues.redhat.com/browse/OCPBUGS-37850
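To illustrate the class of bug described above, here is a minimal, self-contained Go sketch. It is not the actual machine-config daemon code: `informer`, `daemon`, `poolLister`, `setupBuggy` and `setupFixed` are hypothetical names, and the toy `AddEventHandler` fires the handler immediately to model an event arriving as soon as the callback is registered. Under those assumptions, registering the callback before the lister is assigned panics, while assigning the lister first does not.

```go
package main

import "fmt"

// poolLister is a hypothetical stand-in for a lister; not the real MCO type.
type poolLister interface {
	ListPools() []string
}

type staticLister struct{ pools []string }

func (s staticLister) ListPools() []string { return s.pools }

// informer is a toy stand-in for an informer. For illustration only, it
// invokes the handler immediately on registration to model an early event.
type informer struct{ lister poolLister }

func (i *informer) AddEventHandler(h func()) { h() }
func (i *informer) Lister() poolLister       { return i.lister }

type daemon struct {
	lister poolLister
	seen   []string
}

// onEvent uses the lister; it panics if the lister has not been set yet.
func (d *daemon) onEvent() { d.seen = d.lister.ListPools() }

// setupBuggy registers the callback before the lister is assigned.
// The panic is recovered here only so the sketch can report it.
func setupBuggy(inf *informer) (d *daemon, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	d = &daemon{}
	inf.AddEventHandler(d.onEvent) // handler runs while d.lister is still nil
	d.lister = inf.Lister()
	return d, nil
}

// setupFixed assigns the lister first, then registers the callback.
func setupFixed(inf *informer) (d *daemon, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	d = &daemon{lister: inf.Lister()}
	inf.AddEventHandler(d.onEvent)
	return d, nil
}

func main() {
	inf := &informer{lister: staticLister{pools: []string{"master", "worker"}}}
	if _, err := setupBuggy(inf); err != nil {
		fmt.Println("callback registered first:", err)
	}
	if d, err := setupFixed(inf); err == nil {
		fmt.Println("lister assigned first, pools seen:", d.seen)
	}
}
```

The only difference between the two setup functions is the order of the two statements, which is the kind of reordering the description refers to.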

openshift-ci-robot commented 1 month ago

@djoshy: This pull request references Jira Issue OCPBUGS-37850, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/machine-config-operator/pull/4533):

>**- What I did**
>This reorders the callback registration so the lister object is not empty when called
>
>**- How to verify it**
>There should not be any daemon panics for CI runs as listed in https://issues.redhat.com/browse/OCPBUGS-37850

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

djoshy commented 1 month ago

/jira refresh

openshift-ci-robot commented 1 month ago

@djoshy: This pull request references Jira Issue OCPBUGS-37850, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
* bug is open, matching expected state (open)
* bug target version (4.18.0) matches configured target version for branch (4.18.0)
* bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

sergiordlr commented 3 weeks ago

Pre-merge tested using IPI on AWS

No panic was found in clusters with day1 and day2 techpreview installations.

These test cases were executed and passed:

"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-ConnectedOnly-NonPreRelease-Longduration-Medium-73630-Pin release images [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-High-73623-Pin images [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-High-73653-Pinned images with a ImageDigestMirrorSet mirroring a single repository [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-High-73657-Pinned images with a ImageDigestMirrorSet mirroring a domain [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-73631-Pinned images garbage collection [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-73635-Pod can use pinned images while no access to the registry [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-73659-Pinned images when disk-pressure [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sregidor-NonHyperShiftHOST-NonPreRelease-Medium-73361-Pinnedimageset invalid pinned images [Disruptive] [Serial]"
"[sig-mco] MCO Pinnedimages Author:sserafin-NonHyperShiftHOST-NonPreRelease-Longduration-High-73648-A rebooted node reconciles with the pinned images status [Disruptive] [Serial]"
"[sig-mco] MCO scale Author:sregidor-NonHyperShiftHOST-NonPreRelease-Medium-73636-Pinned images in scaled nodes [Disruptive] [Serial]"

We are adding the qe-approved label so that the PR can be merged, but to fully verify the Jira ticket we need to confirm after the merge that no panic is reported in the CI logs anymore.

/label qe-approved

openshift-ci-robot commented 3 weeks ago

@djoshy: This pull request references Jira Issue OCPBUGS-37850, which is valid.

3 validation(s) were run on this bug
* bug is open, matching expected state (open)
* bug target version (4.18.0) matches configured target version for branch (4.18.0)
* bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

inesqyx commented 3 weeks ago

/retest-required

inesqyx commented 3 weeks ago

/lgtm

Approach is sane and smart to me :) Thanks David for the quick fix!

openshift-ci[bot] commented 3 weeks ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: djoshy, inesqyx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/openshift/machine-config-operator/blob/master/OWNERS)~~ [djoshy]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

openshift-ci-robot commented 3 weeks ago

/retest-required

Remaining retests: 0 against base HEAD d59a67ffb38d62e329a11b62b8632a3e12666718 and 2 for PR HEAD abf897e7a06634325746aa0c830fa5537ae8fe44 in total

djoshy commented 3 weeks ago

/retest-required

djoshy commented 3 weeks ago

/test e2e-hypershift

djoshy commented 3 weeks ago

/test e2e-gcp-op

openshift-ci[bot] commented 3 weeks ago

@djoshy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-vsphere-ovn-upi-zones | abf897e7a06634325746aa0c830fa5537ae8fe44 | link | false | `/test e2e-vsphere-ovn-upi-zones` |
| ci/prow/e2e-azure-ovn-upgrade-out-of-change | abf897e7a06634325746aa0c830fa5537ae8fe44 | link | false | `/test e2e-azure-ovn-upgrade-out-of-change` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
djoshy commented 3 weeks ago

/retest-required

openshift-ci-robot commented 3 weeks ago

@djoshy: Jira Issue OCPBUGS-37850: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-37850 has been moved to the MODIFIED state.

openshift-bot commented 3 weeks ago

[ART PR BUILD NOTIFIER]

Distgit: ose-machine-config-operator

This PR has been included in build ose-machine-config-operator-container-v4.18.0-202408221113.p0.gcce2e9e.assembly.stream.el9. All builds following this will include this PR.

djoshy commented 3 weeks ago

/cherrypick release-4.17 release-4.16

openshift-cherrypick-robot commented 3 weeks ago

@djoshy: new pull request created: #4546
