Open frobware opened 3 days ago
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: frobware. Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Kubernetes Code Review Process.
@frobware: This pull request references Jira Issue OCPBUGS-43745, which is invalid:
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
The bug has been updated to refer to the pull request using the external bug tracker.
/jira refresh
@frobware: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/integration | 84689bf6752251547541a87d3cfb891f9c6add29 | link | true | /test integration |
Introduce a new knob, `IdleConnectionTerminationPolicy`, in the IngressController configuration to control how idle connections are handled during router reloads.
Context
In OCPBUGS-32044, the `idle-close-on-response` option was unconditionally added to the HAProxy configuration to address issues with incoming HTTP requests failing during router reloads. This issue primarily affected Apache HTTPClient versions prior to 5.0, which do not gracefully handle connection resets. Adding the option ensured that idle connections were left open to handle one final request before being closed.
Historically, HAProxy 2.2 maintained idle connections during router reloads by default, allowing requests on those connections to complete even when routing configuration changes were applied. Starting with HAProxy 2.4, the default behaviour changed to close idle connections immediately during soft reloads.
Customers upgrading their OpenShift clusters experienced a behaviour change because of the jump from HAProxy 2.2 to 2.6, which altered the default handling of idle connections during router reloads. To accommodate existing clients that depend on the HAProxy 2.2 behaviour, idle-close-on-response was added unconditionally, restoring the previous OpenShift status quo.
However, unconditionally enabling `idle-close-on-response` has now led to issues (OCPBUGS-43745) with Route backend switching. When a Route switches its service backend, requests on persistent connections could continue being routed to the previously active backend due to HAProxy handling these connections in the old process. This behaviour occurs until the connection is closed, either by a new request, the expiration of the client keep-alive, or the expiration of the HAProxy `timeout http-keep-alive 300s`. While this behaviour is desirable in some cases (e.g., for clients sensitive to connection resets), it can lead to temporary inconsistencies and unexpected routing behaviour during backend switching.
This PR addresses these regressions by making the behaviour configurable through a new knob.
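As a rough illustration of the idea, the sketch below shows one way such a knob could translate into HAProxy configuration. It is not the code in this PR: the Go type and constant names, the helper functions `haproxyIdleOption` and `defaultPolicy`, and the version check are all assumptions made for the example. Only the knob name, the two policy values (Immediate and Deferred), the HAProxy option `idle-close-on-response`, and the per-release defaults come from the description.

```go
package main

import "fmt"

// IdleConnectionTerminationPolicy models the knob named in this PR.
// The Go type and constant names are assumptions for illustration.
type IdleConnectionTerminationPolicy string

const (
	IdleConnectionTerminationPolicyImmediate IdleConnectionTerminationPolicy = "Immediate"
	IdleConnectionTerminationPolicyDeferred  IdleConnectionTerminationPolicy = "Deferred"
)

// haproxyIdleOption is a hypothetical helper. It returns the HAProxy
// directive to emit for a policy, or "" when no directive is needed.
func haproxyIdleOption(p IdleConnectionTerminationPolicy) string {
	if p == IdleConnectionTerminationPolicyDeferred {
		// Deferred keeps idle connections open for one final request.
		return "option idle-close-on-response"
	}
	// Immediate relies on the HAProxy 2.4+ default: close idle
	// connections as soon as the reload happens.
	return ""
}

// defaultPolicy is a hypothetical helper that picks the default for an
// OpenShift 4.x minor version, following the defaults stated in the PR.
func defaultPolicy(minor int) IdleConnectionTerminationPolicy {
	if minor >= 19 {
		return IdleConnectionTerminationPolicyImmediate
	}
	return IdleConnectionTerminationPolicyDeferred
}

func main() {
	for _, minor := range []int{14, 18, 19} {
		p := defaultPolicy(minor)
		fmt.Printf("4.%d: policy=%s haproxy=%q\n", minor, p, haproxyIdleOption(p))
	}
}
```

Keeping the policy-to-directive mapping in a single function makes the behavioural difference easy to test in isolation, whatever form the real template logic takes.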
Changes
Add `IdleConnectionTerminationPolicy` to the IngressController configuration.
Behavioural Differences
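Immediate (New Default in OpenShift 4.19+): idle connections are closed immediately during router reloads, matching the HAProxy default since 2.4.
Deferred (Default for backports to 4.14–4.18): idle connections are kept open during router reloads to serve one final request, or until the client keep-alive or the HAProxy `timeout http-keep-alive` (300 seconds in OpenShift) expires.
References: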