openshift / cluster-etcd-operator

Operator to manage the lifecycle of the etcd members of an OpenShift cluster
Apache License 2.0

OCPBUGS-42083: go back to HA mode in bootstrap #1342

Closed. tjungblu closed this 19 hours ago

tjungblu commented 2 days ago

This reverts the quorum guard to HA mode so that we can't violate quorum during bootstrap.

We still keep this behavior in DelayedScaling, which is used by the assisted installer.
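
For readers unfamiliar with the quorum constraint mentioned above, here is a minimal sketch of the arithmetic involved. It is not the operator's code: `quorum`, `faultTolerance`, and `canDisruptOne` are hypothetical helper names, and the only assumption is the standard majority rule for etcd, where a cluster of n voting members needs floor(n/2) + 1 of them healthy.

```go
package main

import "fmt"

// quorum returns the majority size for a cluster with the given number of
// voting members: floor(members/2) + 1.
func quorum(members int) int {
	return members/2 + 1
}

// faultTolerance returns how many members can be lost while a majority
// of the configured membership is still available.
func faultTolerance(members int) int {
	if members <= 0 {
		return 0
	}
	return members - quorum(members)
}

// canDisruptOne is a hypothetical guard check (illustration only): taking one
// healthy member offline is safe only if the remaining healthy members still
// form a majority of the configured membership.
func canDisruptOne(members, healthy int) bool {
	return healthy-1 >= quorum(members)
}

func main() {
	for _, n := range []int{1, 2, 3, 5} {
		fmt.Printf("members=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
	// With three members and all three healthy, one disruption is tolerable;
	// with only two healthy, a further disruption would lose the majority.
	fmt.Println(canDisruptOne(3, 3), canDisruptOne(3, 2))
}
```

Under this rule a one- or two-member cluster tolerates no failures, which is why a guard that assumes a full HA membership is stricter while the cluster is still scaling up during bootstrap.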

openshift-ci-robot commented 2 days ago

@tjungblu: This pull request references Jira Issue OCPBUGS-42083, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

tjungblu commented 2 days ago

/payload ?

openshift-ci[bot] commented 2 days ago

@tjungblu: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

tjungblu commented 2 days ago

/payload-aggregate periodic-ci-openshift-release-master-ci-4.17-e2e-azure-ovn-upgrade 10

openshift-ci[bot] commented 2 days ago

@tjungblu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ca89d490-7a44-11ef-98f5-e2e0e34764c7-0

openshift-ci[bot] commented 2 days ago

@tjungblu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-etcd-scaling | 99178370a6d0ce6d645d75656cdbe8115708341f | link | true | `/test e2e-aws-ovn-etcd-scaling` |
| ci/prow/e2e-aws-ovn-serial | 99178370a6d0ce6d645d75656cdbe8115708341f | link | true | `/test e2e-aws-ovn-serial` |
| ci/prow/e2e-metal-ovn-ha-cert-rotation-shutdown | 99178370a6d0ce6d645d75656cdbe8115708341f | link | false | `/test e2e-metal-ovn-ha-cert-rotation-shutdown` |
| ci/prow/e2e-operator | 99178370a6d0ce6d645d75656cdbe8115708341f | link | true | `/test e2e-operator` |
| ci/prow/unit | 99178370a6d0ce6d645d75656cdbe8115708341f | link | true | `/test unit` |
| ci/prow/e2e-aws-etcd-recovery | 99178370a6d0ce6d645d75656cdbe8115708341f | link | false | `/test e2e-aws-etcd-recovery` |
| ci/prow/e2e-agnostic-ovn | 99178370a6d0ce6d645d75656cdbe8115708341f | link | true | `/test e2e-agnostic-ovn` |
| ci/prow/e2e-aws-etcd-certrotation | 99178370a6d0ce6d645d75656cdbe8115708341f | link | false | `/test e2e-aws-etcd-certrotation` |
| ci/prow/e2e-operator-fips | 99178370a6d0ce6d645d75656cdbe8115708341f | link | false | `/test e2e-operator-fips` |

tjungblu commented 2 days ago

/payload-aggregate periodic-ci-openshift-release-master-ci-4.17-e2e-azure-ovn-upgrade 10

openshift-ci[bot] commented 2 days ago

@tjungblu: An error was encountered. No known errors were detected, please see the full error message for details.

Full error message: could not check if the user tjungblu is trusted for pull request openshift/cluster-etcd-operator#1342: error checking tjungblu for trust: error in IsMember(openshift): Get "http://ghproxy/orgs/openshift/members/tjungblu": dial tcp 172.30.229.2:80: i/o timeout

Please contact an administrator to resolve this issue.

tjungblu commented 2 days ago

/payload-aggregate periodic-ci-openshift-release-master-ci-4.17-e2e-azure-ovn-upgrade 10

openshift-ci[bot] commented 2 days ago

@tjungblu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/9b4eccd0-7a80-11ef-8997-1dd843ccb024-0

tjungblu commented 1 day ago

/payload-aggregate periodic-ci-openshift-release-master-ci-4.17-e2e-azure-ovn-upgrade 10

openshift-ci[bot] commented 1 day ago

@tjungblu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/28ce8970-7b02-11ef-84c3-96a8d480a8ab-0

hasbro17 commented 1 day ago

/lgtm

I'm guessing this will slow down the bootstrap scale up time. We should probably keep an eye on this in case this trips up any other tests that may be affected by that.

/hold

To verify the aggregate upgrade runs. Unhold whenever that seems good.

openshift-ci[bot] commented 1 day ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hasbro17, tjungblu

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/openshift/cluster-etcd-operator/blob/master/OWNERS)~~ [hasbro17,tjungblu]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

tjungblu commented 1 day ago

/test ?

openshift-ci[bot] commented 1 day ago

@tjungblu: The following commands are available to trigger required jobs:

The following commands are available to trigger optional jobs:

Use /test all to run the following jobs that were automatically triggered:

tjungblu commented 1 day ago

/test e2e-aws-ovn-serial
/test e2e-aws-ovn-single-node
/test e2e-metal-assisted
/test e2e-metal-ipi-ovn-ipv6

/retest

tjungblu commented 1 day ago

/test e2e-metal-assisted

tjungblu commented 1 day ago

/payload-aggregate periodic-ci-openshift-release-master-ci-4.17-e2e-azure-ovn-upgrade 10

openshift-ci[bot] commented 1 day ago

@tjungblu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c5bebc40-7b26-11ef-97b1-308333031afa-0

tjungblu commented 19 hours ago

/close

seems we don't need this for the time being

openshift-ci[bot] commented 19 hours ago

@tjungblu: Closed this PR.

openshift-ci-robot commented 19 hours ago

@tjungblu: This pull request references Jira Issue OCPBUGS-42083. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.
