openshift / machine-config-operator

Apache License 2.0

OCPBUGS-39131: fixes node scaledown for e2e tests #4553

Closed cheesesashimi closed 1 month ago

cheesesashimi commented 2 months ago

- What I did

Fixes: OCPBUGS-39131

Removes the node / Machine deletion call and ensures that the node scaledown function waits for all of the nodes to be ready and prioritizes the nodes it created for deletion. Also modifies GetRandomNode() to ensure that the node it returns is actually ready. If not, GetRandomNode() will poll for up to 5 minutes for a node to become ready.

- How to verify it

Run the e2e test suite. All tests should pass.

- Description for the changelog

Fixes node scaledown for e2e tests

openshift-ci[bot] commented 2 months ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

cheesesashimi commented 2 months ago

/test e2e-gcp-op

openshift-ci-robot commented 2 months ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-39131, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/machine-config-operator/pull/4553):

> **- What I did**
>
> **- How to verify it**
>
> **- Description for the changelog**

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

cheesesashimi commented 2 months ago

/jira refresh

openshift-ci-robot commented 2 months ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-39131, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

* bug is open, matching expected state (open)
* bug target version (4.18.0) matches configured target version for branch (4.18.0)
* bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

cheesesashimi commented 2 months ago

@sergiordlr I don't think this needs QE review since the issue and the change are isolated to our e2e test suite. That said, I'll leave it up to your discretion.

djoshy commented 2 months ago

/retest-required

djoshy commented 2 months ago

/retest

@cheesesashimi also looks like verify is failing on naming an unneeded argument. Should be an easy fix (:

test/helpers/utils.go:650:92: unused-parameter: parameter 'ctx' seems to be unused, consider removing or renaming it as _ (revive)
    err := wait.PollUntilContextTimeout(context.TODO(), 2*time.Second, waitPeriod, true, func(ctx context.Context) (bool, error) {
cheesesashimi commented 2 months ago

/test e2e-gcp-op
/test unit

ptalgulk01 commented 2 months ago

/test e2e-gcp-op

ptalgulk01 commented 2 months ago

Hello @cheesesashimi, I see that the TestMetrics test is not present in https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/4553/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1829418699031842816/artifacts/e2e-gcp-op/test/build-log.txt. Is that expected?

ptalgulk01 commented 2 months ago

Able to see the TestMetrics test passing:

--- PASS: TestMetrics (108.10s)

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/4553/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1829635424264392704/artifacts/e2e-gcp-op/test/build-log.txt

/label qe-approved

openshift-ci-robot commented 2 months ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-39131, which is valid.

3 validation(s) were run on this bug

* bug is open, matching expected state (open)
* bug target version (4.18.0) matches configured target version for branch (4.18.0)
* bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

openshift-ci[bot] commented 2 months ago

@cheesesashimi: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-vsphere-ovn-upi | 8a5d92ca0a252b73dc8a601189031ce0c5a96084 | link | false | `/test e2e-vsphere-ovn-upi` |
| ci/prow/e2e-azure-ovn-upgrade-out-of-change | 8a5d92ca0a252b73dc8a601189031ce0c5a96084 | link | false | `/test e2e-azure-ovn-upgrade-out-of-change` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
djoshy commented 1 month ago

Last round of changes looks good to me and CI looks green. Thanks for working on this, Zack!

/lgtm

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi, djoshy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/openshift/machine-config-operator/blob/master/OWNERS)~~ [cheesesashimi,djoshy]

Approvers can indicate their approval by writing `/approve` in a comment.

Approvers can cancel approval by writing `/approve cancel` in a comment.

openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD a3d9b1fe6cb9c5140e0a81091f91aef5f8dcc4ba and 2 for PR HEAD 8a5d92ca0a252b73dc8a601189031ce0c5a96084 in total

openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD a3d9b1fe6cb9c5140e0a81091f91aef5f8dcc4ba and 2 for PR HEAD 8a5d92ca0a252b73dc8a601189031ce0c5a96084 in total

openshift-ci-robot commented 1 month ago

@cheesesashimi: Jira Issue OCPBUGS-39131: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-39131 has been moved to the MODIFIED state.

openshift-bot commented 1 month ago

[ART PR BUILD NOTIFIER]

Distgit: ose-machine-config-operator

This PR has been included in build ose-machine-config-operator-container-v4.18.0-202409042212.p0.g70d43e6.assembly.stream.el9. All builds following this will include this PR.