Closed jubittajohn closed 2 months ago
@jubittajohn: This pull request references ETCD-612 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jubittajohn. Once this PR has been reviewed and has the lgtm label, please assign tjungblu for approval. For more information see the Kubernetes Code Review Process.
+1
@jubittajohn once the CI here is mostly green, you can run the payload jobs using /payload.
For example, `/payload 4.17 nightly blocking` will run all 4.17 nightly jobs that are a must-have to generate a release ("blocking") with this PR. This tests several clouds and form factors of OpenShift.
You can check those out here as an example for the usual nightly runs: https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4.17.0-0.nightly/release/4.17.0-0.nightly-2024-06-20-005211
Be aware that those tests are somewhat expensive to run, so use them sparingly: when you feel you're close to being done and just want additional assurance that you don't break OpenShift as a whole.
/retest-required
@jubittajohn: This pull request references ETCD-612 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.
/retest
@jubittajohn: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/e2e-gcp-qe-no-capabilities | 2be2710a623444ee2a412544b9cac27417ce5815 | link | false | /test e2e-gcp-qe-no-capabilities |
ci/prow/e2e-aws-etcd-certrotation | 0ec8504fe821e477e5808b67e6ae7c76a1ef5764 | link | false | /test e2e-aws-etcd-certrotation |
ci/prow/e2e-aws-etcd-recovery | 0ec8504fe821e477e5808b67e6ae7c76a1ef5764 | link | false | /test e2e-aws-etcd-recovery |
ci/prow/e2e-aws-ovn-serial | 0ec8504fe821e477e5808b67e6ae7c76a1ef5764 | link | true | /test e2e-aws-ovn-serial |
ci/prow/e2e-aws-ovn-etcd-scaling | 0ec8504fe821e477e5808b67e6ae7c76a1ef5764 | link | true | /test e2e-aws-ovn-etcd-scaling |
ci/prow/e2e-metal-ovn-ha-cert-rotation-shutdown | 0ec8504fe821e477e5808b67e6ae7c76a1ef5764 | link | false | /test e2e-metal-ovn-ha-cert-rotation-shutdown |
ci/prow/e2e-metal-ovn-sno-cert-rotation-shutdown | 0ec8504fe821e477e5808b67e6ae7c76a1ef5764 | link | false | /test e2e-metal-ovn-sno-cert-rotation-shutdown |
While performing rollouts and applying new manifests, the kubelet doesn't go through the eviction API, so the PDB has no effect in this case. Because of this limitation, we can't fully rely on the guard pods to block the static pod rollout.
Instead of checking for quorum in all controllers that could initiate a revision rollout, the functionality of PDB is leveraged to block the static pod rollout.
A dedicated `quorumz` handler is introduced in the existing `readyz` container, which checks for quorum similar to the existing `CheckSafeToScaleCluster` functionality. This marks all etcd guard pods as NOT_READY when quorum is not safe, ensuring the PDB is violated and blocking additional pod scheduling.

Removed the existing quorum checks in the different controllers.