openshift / machine-config-operator

Apache License 2.0
244 stars 396 forks source link

OCPBUGS-29444: Rewrote TestFirstBootHasSSHKeys e2e test in pure Go #4415

Open cheesesashimi opened 1 month ago

cheesesashimi commented 1 month ago

- What I did

I rewrote the TestFirstBootHasSSHKeys e2e test in pure Go to ensure that it uses the official Machine Go client instead of shelling out to the oc command. In the process of doing so, I observed a couple of things:

I suspect that both of these was causing problems for the TestFirstBootHasSSHKeys when running against a FIPS-enabled cluster for some reason, although the root cause is still elusive.

- How to verify it

Run the optional e2e-aws-ovn-fips-op CI job and the required e2e-gcp-op jobs against this PR. In both jobs, the TestFirstBootHasSSHKeys e2e test should pass.

- Description for the changelog Rewrote TestFirstBootHasSSHKeys e2e test in pure Go

openshift-ci-robot commented 1 month ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-29444, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/machine-config-operator/pull/4415): >**- What I did** > >I added the full object kind for machinesets and machines to the `oc` command >incantations used by the `TestFirstBootHasSSHKeys` and ensured that the >KUBECONFIG env var from the ClientSet is passed to the `oc` command. > >I suspect that both of these was causing problems for the >`TestFirstBootHasSSHKeys` when running in a FIPS context for some reason. The >root cause for that is unknown though. > >**- How to verify it** > >Run the optional `e2e-aws-ovn-fips-op` CI job against this PR. The `TestFirstBootHasSSHKeys` e2e test should pass. > >**- Description for the changelog** >Use full object name for machine objects and explicitly use KUBECONFIG for e2e test > Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
openshift-ci[bot] commented 1 month ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift/machine-config-operator/blob/master/OWNERS)~~ [cheesesashimi] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
cheesesashimi commented 1 month ago

/test e2e-aws-ovn-fips-op

cheesesashimi commented 1 month ago

/test e2e-aws-ovn-fips-op /test e2e-gcp-op

cheesesashimi commented 1 month ago

/test e2e-aws-ovn-fips-op /test e2e-gcp-op

cheesesashimi commented 1 month ago

/test e2e-aws-ovn-fips-op /test e2e-gcp-op

cheesesashimi commented 3 weeks ago

/test e2e-aws-ovn-fips-op

cheesesashimi commented 3 weeks ago

/jira refresh

openshift-ci-robot commented 3 weeks ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-29444, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to [this](https://github.com/openshift/machine-config-operator/pull/4415#issuecomment-2186896275): >/jira refresh Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
cheesesashimi commented 3 weeks ago

/jira refresh

openshift-ci-robot commented 3 weeks ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-29444, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug * bug is open, matching expected state (open) * bug target version (4.17.0) matches configured target version for branch (4.17.0) * bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

In response to [this](https://github.com/openshift/machine-config-operator/pull/4415#issuecomment-2186898033): >/jira refresh Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
cheesesashimi commented 3 weeks ago

@sergiordlr This change was confined to our e2e test suite, so I'm not sure if there's any verification that needs to be done here. I'll let you make that call.

openshift-ci[bot] commented 3 weeks ago

@cheesesashimi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-azure-ovn-upgrade-out-of-change 0bffba8a7d40079458eba330b0a1db776869dc13 link false /test e2e-azure-ovn-upgrade-out-of-change

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
sergiordlr commented 2 weeks ago

@cheesesashimi Yes, I agree. No need for extra qe verification.

Test case is passing in https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/4415/pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn-fips-op/1805267532370677760/artifacts/e2e-aws-ovn-fips-op/test/build-log.txt

--- PASS: TestFirstBootHasSSHKeys (396.10s)

/label qe-approved

openshift-ci-robot commented 2 weeks ago

@cheesesashimi: This pull request references Jira Issue OCPBUGS-29444, which is valid.

3 validation(s) were run on this bug * bug is open, matching expected state (open) * bug target version (4.17.0) matches configured target version for branch (4.17.0) * bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

In response to [this](https://github.com/openshift/machine-config-operator/pull/4415): >**- What I did** > >I rewrote the `TestFirstBootHasSSHKeys` e2e test in pure Go to ensure that it uses the official Machine Go client instead of shelling out to the `oc` command. In the process of doing so, I observed a couple of things: > >- The original version of the `TestFirstBootHasSSHKeys` e2e test was running `$ oc scale --replicas=2` to start, and then running `$ oc scale --replicas=1` to scale back down again after cleanup. Doing this would leave the worker pool with a single node. >- Additionally, I observed a few issues which can best be described as weird shell-scripting gremlins caused by parsing the output of `oc` with typical shell commands such as `head` or `grep`. Writing shell scripts to parse output isn't a problem, but I suspect that output formatting for `oc` may have changed slightly between the time the test was written and now. > >I suspect that both of these was causing problems for the `TestFirstBootHasSSHKeys` when running against a FIPS-enabled cluster for some reason, although the root cause is still elusive. > >**- How to verify it** > >Run the optional `e2e-aws-ovn-fips-op` CI job and the required `e2e-gcp-op` jobs against this PR. In both jobs, the `TestFirstBootHasSSHKeys` e2e test should pass. > >**- Description for the changelog** >Rewrote TestFirstBootHasSSHKeys e2e test in pure Go Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.