openshift / router

Ingress controller for OpenShift
Apache License 2.0

OCPBUGS-33900: DO NOT MERGE: validate haproxy26-2.6.13-3.rhaos4.14.el8.x86_64.rpm #599

Closed by frobware 5 months ago

frobware commented 6 months ago

/hold

openshift-ci[bot] commented 6 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: (none listed)
Once this PR has been reviewed and has the lgtm label, please ask for approval from frobware. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/openshift/router/blob/release-4.14/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

frobware commented 6 months ago

Patch to be applied to https://pkgs.devel.redhat.com/cgit/rpms/haproxy/log/?h=rhaos-4.14-rhel-8

From e91eb34dd5b1b9ac45c41da4062d1253564c59d9 Mon Sep 17 00:00:00 2001
From: Andrew McDermott <amcdermo@redhat.com>
Date: Fri, 17 May 2024 17:03:28 +0100
Subject: OCPBUGS-32369: HAProxy consuming high cpu usage

- Resolve https://issues.redhat.com/browse/OCPBUGS-32369 (HAProxy consuming high cpu usage)
- Carry fix for https://github.com/haproxy/haproxy/issues/2537
- Fix for issue 2537 picked from https://git.haproxy.org/?p=haproxy.git;a=commit;h=4a9e3e102e192b9efd17e3241a6cc659afb7e7dc
---
 ...y-only-tid-0-must-not-sleep-if-got-s.patch | 44 +++++++++++++++++++
 haproxy.spec                                  | 11 ++++-
 2 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644 0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch

diff --git a/0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch b/0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch
new file mode 100644
index 0000000..50f54cc
--- /dev/null
+++ b/0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch
@@ -0,0 +1,44 @@
+From d256f8f3157af29875a915e87e171ec81194e88c Mon Sep 17 00:00:00 2001
+From: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>
+Date: Mon, 6 May 2024 14:24:41 +0200
+Subject: BUG/MINOR: haproxy: only tid 0 must not sleep if got signal
+
+This patch fixes the commit eea152ee68
+("BUG/MINOR: signals/poller: ensure wakeup from signals").
+
+There is some probability that run_poll_loop() becomes inifinite, if
+TH_FL_SLEEPING is withdrawn from all threads in the second signal_queue_len
+check, when a signal has received just after the first one.
+
+In such particular case, the 'wake' variable, which is used to terminate
+thread's poll loop is never reset to 0. So, we never enter to the "stopping"
+part of the run_poll_loop() and threads, except the one with id 0 (tid 0
+handles signals), will continue to call _do_poll() eternally and will never
+sleep, as its TH_FL_SLEEPING flag was unset.
+
+This flag needs to be removed only for the tid 0, as it was done in the first
+signal_queue_len check.
+
+This fixes an issue #2537 "infinite loop when shutting down".
+
+This fix must be backported in every stable version.
+---
+ src/haproxy.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/src/haproxy.c b/src/haproxy.c
+index a6349d420..6ed532527 100644
+--- a/src/haproxy.c
++++ b/src/haproxy.c
+@@ -2868,7 +2868,7 @@ void run_poll_loop()
+           if (thread_has_tasks()) {
+               activity[tid].wake_tasks++;
+               _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
+-          } else if (signal_queue_len) {
++          } else if (signal_queue_len && tid == 0) {
+               /* this check is required to avoid
+                * a race with wakeup on signals using wake_threads() */
+               _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
+-- 
+2.42.0
+
diff --git a/haproxy.spec b/haproxy.spec
index 4067e3a..019976c 100644
--- a/haproxy.spec
+++ b/haproxy.spec
@@ -10,7 +10,7 @@

 Name:           haproxy
 Version:        2.6.13
-Release:        2.rhaos4.14%{?dist}
+Release:        3.rhaos4.14%{?dist}
 Summary:        Do not ship, install or use this, use %{real_name} subpackage instead

 License:        GPLv2+
@@ -24,6 +24,9 @@ Patch0:         0001-BUG-MINOR-fd-always-remove-late-updates-when-freeing.patch
 # https://issues.redhat.com/browse/OCPBUGS-20325 (CVE-2023-40225)
 Patch1:         0001-BUG-MAJOR-http-reject-any-empty-content-length-heade.patch

+# https://issues.redhat.com/browse/OCPBUGS-32369
+Patch2:         0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch
+
 BuildRequires:  openssl-devel
 BuildRequires:  pcre-devel
 BuildRequires:  zlib-devel
@@ -56,6 +59,7 @@ availability environments. Indeed, it can:
 %setup -q
 %patch0 -p1
 %patch1 -p1
+%patch2 -p1

 %build
 regparm_opts=
@@ -97,6 +101,11 @@ fi
 %{_sbindir}/%{name}

 %changelog
+* Fri May 17 2024 Andrew McDermott <amcdermo@redhat.com> - 2.6.13-3.rhaos4.14
+- Resolve https://issues.redhat.com/browse/OCPBUGS-32369 (HAProxy consuming high cpu usage)
+- Carry fix for https://github.com/haproxy/haproxy/issues/2537
+- Fix for issue 2537 picked from https://git.haproxy.org/?p=haproxy.git;a=commit;h=4a9e3e102e192b9efd17e3241a6cc659afb7e7dc
+
 * Wed Oct 11 2023 Andrew McDermott <amcdermo@redhat.com> - 2.6.13-2.rhaos4.14
 - Resolve https://issues.redhat.com/browse/OCPBUGS-20325 (CVE-2023-40225)

-- 
2.42.0
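
For anyone reviewing the one-line change in `run_poll_loop()`, here is a rough, simplified model of the branch the patch touches. It is not HAProxy code: `clears_sleeping_bit`, `awake_threads` and the `patched` flag are hypothetical names introduced only for illustration, and the model ignores the `wake` variable and the poller itself. It only shows which idle threads drop their sleeping bit while a signal is queued, before and after adding `tid == 0` to the condition.

```c
#include <stdbool.h>

/* Simplified model of the if/else-if branch changed by the patch.
 * Returns true when the thread clears its "sleeping" bit, i.e. it will
 * keep polling instead of going to sleep in the poller.
 * 'patched' selects the new condition (signal_queue_len && tid == 0). */
bool clears_sleeping_bit(bool has_tasks, int signal_queue_len,
                         int tid, bool patched)
{
    if (has_tasks)
        return true;                          /* work pending: stay awake */
    if (patched)
        return signal_queue_len != 0 && tid == 0; /* only the signal-handling thread */
    return signal_queue_len != 0;             /* old behaviour: every thread */
}

/* Count how many of 'nthreads' idle threads (no tasks) stay awake
 * while 'signal_queue_len' signals are queued. */
int awake_threads(int nthreads, int signal_queue_len, bool patched)
{
    int awake = 0;

    for (int tid = 0; tid < nthreads; tid++) {
        if (clears_sleeping_bit(false, signal_queue_len, tid, patched))
            awake++;
    }
    return awake;
}
```

In this model, with 4 idle threads and one queued signal, `awake_threads(4, 1, false)` returns 4 while `awake_threads(4, 1, true)` returns 1. That matches the commit message: only tid 0, which handles signals, needs to stay awake.
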
frobware commented 6 months ago

/hold

openshift-ci-robot commented 6 months ago

@frobware: This pull request references Jira Issue OCPBUGS-33900, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/router/pull/599):

> /hold

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Frouter). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

frobware commented 6 months ago

/jira refresh

openshift-ci-robot commented 6 months ago

@frobware: This pull request references Jira Issue OCPBUGS-33900, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to [this](https://github.com/openshift/router/pull/599#issuecomment-2118002967):

> /jira refresh

gcs278 commented 6 months ago

Patch LGTM:

$ git checkout rhaos-4.14-rhel-8
$ git pull
$ git am --abort 
$ git am ~/patches/OCPBUGS-32369-HAProxy-consuming-high-cpu-usage-4.14.patch
Applying: OCPBUGS-32369: HAProxy consuming high cpu usage
.git/rebase-apply/patch:46: space before tab in indent.
            if (thread_has_tasks()) {
.git/rebase-apply/patch:47: space before tab in indent.
                activity[tid].wake_tasks++;
.git/rebase-apply/patch:48: space before tab in indent.
                _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
.git/rebase-apply/patch:51: space before tab in indent.
                /* this check is required to avoid
.git/rebase-apply/patch:52: space before tab in indent.
                 * a race with wakeup on signals using wake_threads() */
warning: squelched 3 whitespace errors
warning: 8 lines add whitespace errors.
$ git log -n 1
commit 95500f8f173b4c80f8eb77e26ebadeb6008fb0f1 (HEAD -> rhaos-4.14-rhel-8)
Author: Andrew McDermott <amcdermo@redhat.com>
Date:   Fri May 17 17:03:28 2024 +0100

    OCPBUGS-32369: HAProxy consuming high cpu usage

    - Resolve https://issues.redhat.com/browse/OCPBUGS-32369 (HAProxy consuming high cpu usage)
    - Carry fix for https://github.com/haproxy/haproxy/issues/2537
    - Fix for issue 2537 picked from https://git.haproxy.org/?p=haproxy.git;a=commit;h=4a9e3e102e192b9efd17e3241a6cc659afb7e7dc
$ git diff HEAD~ HEAD
[...]
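
A note on the `space before tab in indent` warnings above: they come from git's `space-before-tab` whitespace check, which flags a space immediately followed by a tab in a line's leading indentation. A plausible cause here (an inference, not something the transcript states) is that the context lines of the embedded HAProxy patch start with the single-space diff marker followed by tab-indented C code. Below is a minimal sketch of that check in C; `space_before_tab_in_indent` is a hypothetical helper, not git's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if, within the leading run of spaces and tabs,
 * a space is immediately followed by a tab. */
bool space_before_tab_in_indent(const char *line)
{
    for (size_t i = 0; line[i] == ' ' || line[i] == '\t'; i++) {
        /* line[i + 1] is at worst the terminating NUL, so this read is safe */
        if (line[i] == ' ' && line[i + 1] == '\t')
            return true;
    }
    return false;
}
```

A context line such as `" \t\tif (thread_has_tasks()) {"` is flagged by this sketch, while the same line without the leading space is not.
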
openshift-ci[bot] commented 6 months ago

@frobware: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | c2f82798f130a8e34843480ddcd7feea97e7bae1 | link | false | `/test e2e-metal-ipi-ovn-ipv6` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).

frobware commented 5 months ago

Closing as change has now been pushed.

Change pushed: https://pkgs.devel.redhat.com/cgit/rpms/haproxy/commit/?h=rhaos-4.14-rhel-8&id=e91eb34dd5b1b9ac45c41da4062d1253564c59d9

brew build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=61503482

openshift-ci-robot commented 5 months ago

@frobware: This pull request references Jira Issue OCPBUGS-33900. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.

In response to [this](https://github.com/openshift/router/pull/599):

> /hold