openshift / origin

Conformance test suite for OpenShift
http://www.openshift.org
Apache License 2.0

ETCD-664: Add e2e test for scaling when CPMS is disabled #29086

Closed jubittajohn closed 1 month ago

jubittajohn commented 2 months ago

The following test covers a basic vertical scaling scenario when the CPMS (ControlPlaneMachineSet) is disabled and validates that the scale-down does not happen before the scale-up event.

  1. If the CPMS is active, disable it first by deleting the CPMS custom resource.
  2. Delete a master machine.
  3. Create a new master machine with the same index as the deleted machine.
  4. The scale-up happens before the scale-down.
  5. Validate that the scale-up happens first by verifying that the etcd cluster has 4 voting members.
  6. Validate the scale-down by confirming the member removal and the resulting change in cluster membership.
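
The ordering check in steps 4 to 6 can be illustrated with a small, self-contained sketch. This is not the code added by this PR, which runs against a live cluster; it only models the invariant on a recorded sequence of voting-member counts. The function name `scale_up_precedes_scale_down`, the sampled counts, and the baseline of 3 voting members are assumptions made for illustration (the description only states that 4 voting members are expected after the scale-up).

```python
from typing import Sequence


def scale_up_precedes_scale_down(
    voting_counts: Sequence[int], baseline: int = 3
) -> bool:
    """Check a series of observed etcd voting-member counts.

    Returns True only if the count never drops below `baseline`,
    reaches `baseline + 1` at some point (the scale-up), and ends
    back at `baseline` (the scale-down). `baseline=3` is an assumption.
    """
    seen_scale_up = False
    for count in voting_counts:
        if count < baseline:
            # A member was removed before its replacement joined.
            return False
        if count == baseline + 1:
            seen_scale_up = True
    # The scale-up must have been observed and the cluster must have
    # settled back at the baseline size.
    return seen_scale_up and len(voting_counts) > 0 and voting_counts[-1] == baseline


# Hypothetical samples: a healthy replacement, then a scale-down-first case.
print(scale_up_precedes_scale_down([3, 3, 4, 4, 3]))
print(scale_up_precedes_scale_down([3, 2, 3]))
```

The first call models the expected order (3, then 4, then back to 3 voting members); the second models a member being removed before its replacement joins, which is the case the test is meant to rule out.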
openshift-ci-robot commented 2 months ago

@jubittajohn: This pull request references ETCD-664 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to [this](https://github.com/openshift/origin/pull/29086):

>The following test covers a basic vertical scaling scenario when CPMS is disabled and validates that the scale-down does not happen before the scale-up event.
>
>1. If the CPMS is active, first disable it by deleting the CPMS custom resource
>2. Delete the machine
>3. Create a new master machine that belongs to the same index as the deleted machine
>4. Scale-up happens first before the scale-down
>5. Validate scales-up happens first by verifying 4 voting members in etcd cluster
>6. Then scale-down is validated by confirming the member removal and changes in the cluster membership

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Forigin). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
jubittajohn commented 2 months ago

/test e2e-aws-ovn-etcd-scaling

jubittajohn commented 2 months ago

/retest

openshift-trt-bot commented 2 months ago

Job Failure Risk Analysis for sha: 2c2a98187ab5bc49543ce2f659a0817b0ccd3590

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade Low
[sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers for ns/openshift-marketplace
This test has passed 79.04% of 229 runs on release 4.18 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:single Upgrade:micro] in the last week.
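
A pass-rate line like the one above can be turned back into an approximate pass count, which makes the figure easier to compare across sample sizes. A minimal sketch, assuming the percentage is simply passes divided by runs, rounded to two decimals (the report does not state how it is computed); `implied_passes` is a hypothetical helper, not part of any OpenShift tooling.

```python
def implied_passes(pass_percent: float, runs: int) -> int:
    """Estimate how many runs passed from a reported pass percentage."""
    return round(pass_percent / 100 * runs)


# 79.04% of 229 runs corresponds to 181 passing runs (181 / 229 = 79.04%).
print(implied_passes(79.04, 229))  # 181
```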
openshift-trt-bot commented 2 months ago

Job Failure Risk Analysis for sha: dd15b95cc03c5ca5ac0bdbfe50a929a326e8d362

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade IncompleteTests
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node Medium
[sig-node] static pods should start after being created
This test has passed 97.92% of 48 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node'] in the last 14 days.

Open Bugs
Static pod controller pods sometimes fail to start
jubittajohn commented 2 months ago

/test e2e-aws-ovn-etcd-scaling

jubittajohn commented 2 months ago

/retest

openshift-trt-bot commented 2 months ago

Job Failure Risk Analysis for sha: efdd1fec5bcb3d57687f12117739778574ed8336

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-upgrade High
[sig-api-machinery] disruption/kube-api connection/new should be available throughout the test
This test has passed 99.88% of 802 runs on jobs ['periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade'] in the last 14 days.
---
[sig-api-machinery] disruption/openshift-api connection/new should be available throughout the test
This test has passed 99.88% of 802 runs on jobs ['periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade'] in the last 14 days.
---
[sig-api-machinery] disruption/oauth-api connection/new should be available throughout the test
This test has passed 99.75% of 802 runs on jobs ['periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade'] in the last 14 days.
jubittajohn commented 2 months ago

/test e2e-aws-ovn-etcd-scaling

openshift-trt-bot commented 2 months ago

Job Failure Risk Analysis for sha: 2734309c3d80a5ffff781326aadc830762d6832f

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade Medium
[bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above info
This test has passed 84.31% of 153 runs on release 4.18 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:single Upgrade:micro] in the last week.

Open Bugs
alert/KubeAPIErrorBudgetBurn should not be at or above info
jubittajohn commented 2 months ago

/test e2e-aws-ovn-etcd-scaling

openshift-trt-bot commented 2 months ago

Job Failure Risk Analysis for sha: 441b43113c912b46b9cafaefc6e9078d651987dc

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-ipsec-serial High
[sig-architecture] platform pods should not exit more than once with a non-zero exit code
This test has passed 100.00% of 20 runs on jobs ['periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-serial' 'periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-serial'] in the last 14 days.

Open Bugs
Excessive Restarts on ingress operator
jubittajohn commented 2 months ago

/test e2e-aws-ovn-etcd-scaling

jubittajohn commented 2 months ago

/test e2e-aws-ovn-fips

jubittajohn commented 2 months ago

/retest-required

hasbro17 commented 1 month ago

/test e2e-aws-ovn-etcd-scaling
/test e2e-gcp-ovn-etcd-scaling
/test e2e-azure-ovn-etcd-scaling
/test e2e-vsphere-ovn-etcd-scaling

/hold

Checking that this new test doesn't trip anything on the other platforms.

hasbro17 commented 1 month ago

/lgtm
/approve

@jubittajohn you can remove the hold once the test clears for the above presubmits.

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hasbro17, jubittajohn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[test/extended/etcd/OWNERS](https://github.com/openshift/origin/blob/master/test/extended/etcd/OWNERS)~~ [hasbro17]
- ~~[test/extended/util/annotate/generated/OWNERS](https://github.com/openshift/origin/blob/master/test/extended/util/annotate/generated/OWNERS)~~ [hasbro17]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment
jubittajohn commented 1 month ago

/retest

jubittajohn commented 1 month ago

/hold cancel

openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD 89321cb9fabeff3f17c4cb328044f58c8708a397 and 2 for PR HEAD d761544c17c7c3fdc258d155dc671a60a11d2ced in total

openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD eb784803501f1877cfdfe49749ac6c4cdaa7c6cd and 1 for PR HEAD d761544c17c7c3fdc258d155dc671a60a11d2ced in total

openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD fd6fe36319c39b51ab0f02ecb8e2777c0e1bb210 and 2 for PR HEAD d761544c17c7c3fdc258d155dc671a60a11d2ced in total

openshift-trt-bot commented 1 month ago

Job Failure Risk Analysis for sha: d761544c17c7c3fdc258d155dc671a60a11d2ced

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-azure-ovn-etcd-scaling Low
[bz-kube-storage-version-migrator] clusteroperator/kube-storage-version-migrator should not change condition/Available
This test has passed 50.00% of 4 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.18-e2e-azure-ovn-etcd-scaling' 'periodic-ci-openshift-release-master-nightly-4.17-e2e-azure-ovn-etcd-scaling'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-etcd-scaling Low
[bz-kube-storage-version-migrator] clusteroperator/kube-storage-version-migrator should not change condition/Available
This test has passed 25.00% of 4 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-etcd-scaling' 'periodic-ci-openshift-release-master-nightly-4.17-e2e-aws-ovn-etcd-scaling'] in the last 14 days.
openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD fd6fe36319c39b51ab0f02ecb8e2777c0e1bb210 and 2 for PR HEAD d761544c17c7c3fdc258d155dc671a60a11d2ced in total

openshift-ci[bot] commented 1 month ago

@jubittajohn: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-etcd-scaling | d761544c17c7c3fdc258d155dc671a60a11d2ced | link | false | `/test e2e-aws-ovn-etcd-scaling` |
| ci/prow/e2e-azure-ovn-etcd-scaling | d761544c17c7c3fdc258d155dc671a60a11d2ced | link | false | `/test e2e-azure-ovn-etcd-scaling` |
| ci/prow/e2e-aws-ovn-ipsec-serial | d761544c17c7c3fdc258d155dc671a60a11d2ced | link | false | `/test e2e-aws-ovn-ipsec-serial` |
| ci/prow/e2e-aws-ovn-kube-apiserver-rollout | d761544c17c7c3fdc258d155dc671a60a11d2ced | link | false | `/test e2e-aws-ovn-kube-apiserver-rollout` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD fd6fe36319c39b51ab0f02ecb8e2777c0e1bb210 and 2 for PR HEAD d761544c17c7c3fdc258d155dc671a60a11d2ced in total

openshift-ci-robot commented 1 month ago

/retest-required

Remaining retests: 0 against base HEAD 5693bd4ace9e04b55a385acef12ca613b463541c and 2 for PR HEAD d761544c17c7c3fdc258d155dc671a60a11d2ced in total

openshift-bot commented 1 month ago

[ART PR BUILD NOTIFIER]

Distgit: openshift-enterprise-tests

This PR has been included in build openshift-enterprise-tests-container-v4.18.0-202410101441.p0.gba980ef.assembly.stream.el9. All builds following this will include this PR.