openshift / router

Ingress controller for OpenShift
Apache License 2.0

OCPBUGS-33883: DO NOT MERGE: validate haproxy26-2.6.13-3.rhaos4.15.el8.x86_64.rpm #597

Closed: frobware closed this pull request 3 months ago

frobware commented 4 months ago

/hold

openshift-ci[bot] commented 4 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please ask for approval from frobware. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/openshift/router/blob/release-4.15/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

frobware commented 4 months ago

Patch for branch rhaos-4.15-rhel-8 in https://pkgs.devel.redhat.com/cgit/rpms/haproxy/

From dc4ed2c3b68c4c30952110a13dd5c7041fe8942d Mon Sep 17 00:00:00 2001
From: Andrew McDermott <amcdermo@redhat.com>
Date: Thu, 16 May 2024 14:21:43 +0100
Subject: OCPBUGS-32369: HAProxy consuming high cpu usage

- Resolve https://issues.redhat.com/browse/OCPBUGS-32369 (HAProxy consuming high cpu usage)
- Carry fix for https://github.com/haproxy/haproxy/issues/2537
- Fix for issue 2537 picked from https://git.haproxy.org/?p=haproxy.git;a=commit;h=4a9e3e102e192b9efd17e3241a6cc659afb7e7dc
---
 ...y-only-tid-0-must-not-sleep-if-got-s.patch | 44 +++++++++++++++++++
 haproxy.spec                                  | 11 ++++-
 2 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644 0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch

diff --git a/0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch b/0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch
new file mode 100644
index 0000000..50f54cc
--- /dev/null
+++ b/0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch
@@ -0,0 +1,44 @@
+From d256f8f3157af29875a915e87e171ec81194e88c Mon Sep 17 00:00:00 2001
+From: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>
+Date: Mon, 6 May 2024 14:24:41 +0200
+Subject: BUG/MINOR: haproxy: only tid 0 must not sleep if got signal
+
+This patch fixes the commit eea152ee68
+("BUG/MINOR: signals/poller: ensure wakeup from signals").
+
+There is some probability that run_poll_loop() becomes inifinite, if
+TH_FL_SLEEPING is withdrawn from all threads in the second signal_queue_len
+check, when a signal has received just after the first one.
+
+In such particular case, the 'wake' variable, which is used to terminate
+thread's poll loop is never reset to 0. So, we never enter to the "stopping"
+part of the run_poll_loop() and threads, except the one with id 0 (tid 0
+handles signals), will continue to call _do_poll() eternally and will never
+sleep, as its TH_FL_SLEEPING flag was unset.
+
+This flag needs to be removed only for the tid 0, as it was done in the first
+signal_queue_len check.
+
+This fixes an issue #2537 "infinite loop when shutting down".
+
+This fix must be backported in every stable version.
+---
+ src/haproxy.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/src/haproxy.c b/src/haproxy.c
+index a6349d420..6ed532527 100644
+--- a/src/haproxy.c
++++ b/src/haproxy.c
+@@ -2868,7 +2868,7 @@ void run_poll_loop()
+           if (thread_has_tasks()) {
+               activity[tid].wake_tasks++;
+               _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
+-          } else if (signal_queue_len) {
++          } else if (signal_queue_len && tid == 0) {
+               /* this check is required to avoid
+                * a race with wakeup on signals using wake_threads() */
+               _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
+-- 
+2.42.0
+
diff --git a/haproxy.spec b/haproxy.spec
index 707fa0a..597852e 100644
--- a/haproxy.spec
+++ b/haproxy.spec
@@ -10,7 +10,7 @@

 Name:           haproxy
 Version:        2.6.13
-Release:        2.rhaos4.15%{?dist}
+Release:        3.rhaos4.15%{?dist}
 Summary:        Do not ship, install or use this, use %{real_name} subpackage instead

 License:        GPLv2+
@@ -24,6 +24,9 @@ Patch0:         0001-BUG-MINOR-fd-always-remove-late-updates-when-freeing.patch
 # https://issues.redhat.com/browse/OCPBUGS-20325 (CVE-2023-40225)
 Patch1:         0001-BUG-MAJOR-http-reject-any-empty-content-length-heade.patch

+# https://issues.redhat.com/browse/OCPBUGS-32369
+Patch2:         0001-BUG-MINOR-haproxy-only-tid-0-must-not-sleep-if-got-s.patch
+
 BuildRequires:  openssl-devel
 BuildRequires:  pcre-devel
 BuildRequires:  zlib-devel
@@ -56,6 +59,7 @@ availability environments. Indeed, it can:
 %setup -q
 %patch0 -p1
 %patch1 -p1
+%patch2 -p1

 %build
 regparm_opts=
@@ -97,6 +101,11 @@ fi
 %{_sbindir}/%{name}

 %changelog
+* Thu May 16 2024 Andrew McDermott <amcdermo@redhat.com> - 2.6.13-3.rhaos4.15
+- Resolve https://issues.redhat.com/browse/OCPBUGS-32369 (HAProxy consuming high cpu usage)
+- Carry fix for https://github.com/haproxy/haproxy/issues/2537
+- Fix for issue 2537 picked from https://git.haproxy.org/?p=haproxy.git;a=commit;h=4a9e3e102e192b9efd17e3241a6cc659afb7e7dc
+
 * Wed Oct 11 2023 Andrew McDermott <amcdermo@redhat.com> - 2.6.13-2.rhaos4.15
 - Resolve https://issues.redhat.com/browse/OCPBUGS-20325 (CVE-2023-40225)

-- 
2.44.0
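
The one-line change to `run_poll_loop()` in the patch above can be illustrated with a small model. This is only a sketch under simplifying assumptions: the function names `stays_awake_before_fix`, `stays_awake_after_fix`, and `count_awake` are hypothetical and do not exist in HAProxy, and the real code clears a bit in `sleeping_thread_mask` atomically rather than returning a boolean. It only shows which threads skip sleeping when a signal is queued, before and after the added `tid == 0` condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of the branch patched in run_poll_loop().
 * Each predicate answers: "must this thread clear its sleeping bit and
 * keep polling instead of sleeping?" The names are illustrative only. */

/* Before the fix: any thread that sees a queued signal stays awake. */
bool stays_awake_before_fix(bool has_tasks, int signal_queue_len, int tid)
{
    (void)tid; /* the thread id was not consulted */
    return has_tasks || signal_queue_len != 0;
}

/* After the fix: only tid 0, which handles signals, stays awake. */
bool stays_awake_after_fix(bool has_tasks, int signal_queue_len, int tid)
{
    return has_tasks || (signal_queue_len != 0 && tid == 0);
}

/* Count how many of nthreads task-less threads would stay awake
 * for a given signal queue length under the given predicate. */
int count_awake(int nthreads, int signal_queue_len,
                bool (*stays_awake)(bool, int, int))
{
    assert(nthreads >= 0);

    int awake = 0;
    for (int tid = 0; tid < nthreads; tid++) {
        if (stays_awake(false, signal_queue_len, tid))
            awake++;
    }
    return awake;
}
```

In this model, with four idle threads and one queued signal, the pre-fix predicate keeps all four threads awake, while the post-fix predicate keeps only thread 0 awake. That matches the behaviour the upstream commit message describes: threads other than tid 0 no longer lose their sleeping state because of a pending signal.
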
openshift-ci[bot] commented 4 months ago

@frobware: all tests passed!


Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
openshift-ci-robot commented 4 months ago

@frobware: This pull request references Jira Issue OCPBUGS-33883, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/router/pull/597):

> /hold

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Frouter). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

gcs278 commented 4 months ago

Patch LGTM.

$ git checkout rhaos-4.15-rhel-8
$ git pull
$ git am ~/patches/OCPBUGS-32369-HAProxy-consuming-high-cpu-usage-4.15.patch
Applying: OCPBUGS-32369: HAProxy consuming high cpu usage
.git/rebase-apply/patch:46: space before tab in indent.
            if (thread_has_tasks()) {
.git/rebase-apply/patch:47: space before tab in indent.
                activity[tid].wake_tasks++;
.git/rebase-apply/patch:48: space before tab in indent.
                _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);
.git/rebase-apply/patch:51: space before tab in indent.
                /* this check is required to avoid
.git/rebase-apply/patch:52: space before tab in indent.
                 * a race with wakeup on signals using wake_threads() */
warning: squelched 3 whitespace errors
warning: 8 lines add whitespace errors.
$ git log -n 1
commit 8be6989d93e82f91ae25058b367d1ef61aa08229 (HEAD -> rhaos-4.15-rhel-8)
Author: Andrew McDermott <amcdermo@redhat.com>
Date:   Thu May 16 14:21:43 2024 +0100

    OCPBUGS-32369: HAProxy consuming high cpu usage

    - Resolve https://issues.redhat.com/browse/OCPBUGS-32369 (HAProxy consuming high cpu usage)
    - Carry fix for https://github.com/haproxy/haproxy/issues/2537
    - Fix for issue 2537 picked from https://git.haproxy.org/?p=haproxy.git;a=commit;h=4a9e3e102e192b9efd17e3241a6cc659afb7e7dc
$ git diff HEAD~ HEAD
[...]
frobware commented 4 months ago

/jira refresh

openshift-ci-robot commented 4 months ago

@frobware: This pull request references Jira Issue OCPBUGS-33883, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to [this](https://github.com/openshift/router/pull/597#issuecomment-2118007487):

> /jira refresh

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Frouter). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

openshift-ci-robot commented 3 months ago

@frobware: This pull request references Jira Issue OCPBUGS-33883. The bug has been updated to no longer refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/router/pull/597):

> /hold

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Frouter). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.