Closed DumitruNiculai closed 6 months ago
Could you share the output of the "show fd" and "show sess all" CLI commands, please? Are these sockets on the frontend or the backend side?
Another point: you're running with a large pair of timeout values of 2 hours. Do these sockets continue to accumulate past the two hours, or do they remain stable? It is possible that these are "just" the result of some clients disappearing from the network after having sent only a FIN after their request.
Do you have any firewall anywhere in the chain, e.g. on the other side of these CLOSE_WAIT sockets? What could happen is that a client closes with a shutdown, the shutdown is passed to the other side and triggers a lower timeout on a firewall, which quickly closes the connection while, for various reasons, the server doesn't receive the close. You could quickly end up with a FIN_WAIT1 on one side and a CLOSE_WAIT on the other side, waiting for the first timeout to trigger.
"option abortonclose" could be used to terminate half-closed connections, though this might or might not be what you want for TCP communications. Alternatively, you may also set "timeout client-fin" and "timeout server-fin" to much lower values to shorten the timeouts once a FIN has been transmitted, in order to better deal with vanishing machines.
Good day Willy.
The issue has been resolved.
As you suggested, we set these parameters, which helped us get rid of the stuck CLOSE_WAIT connections: "timeout client-fin 5s" and "timeout server-fin 5s".
We are running High Availability PostgreSQL clusters in our Production system, based on Patroni, with HAProxy as the proxy server for user/application connections. We don't have any firewalls on the server side. However, our many applications run in Kubernetes pods that sometimes fail and leave connections stuck in CLOSE_WAIT.
Thank you Willy for your help and support
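For readers landing on this thread, the fix described above might look roughly like the following haproxy.cfg fragment. This is only a sketch: the reporter's configuration was never posted, so the `defaults` placement, `mode tcp`, and the 2h values for `timeout client` and `timeout server` are assumptions inferred from the discussion (PostgreSQL traffic, "timeout values of 2 hours"). Only the two `-fin` lines are taken directly from the reporter's comment.

```
defaults
    # Assumption: plain TCP proxying for PostgreSQL connections.
    mode tcp

    # Assumption: the long timeouts referred to as "2 hours" in the thread.
    timeout client  2h
    timeout server  2h

    # Settings the reporter confirmed: once a FIN has been seen on one side,
    # wait at most 5s for the connection to finish closing instead of 2h.
    timeout client-fin 5s
    timeout server-fin 5s

    # Alternative mentioned in the thread; may or may not suit TCP traffic.
    # option abortonclose
```

With the `-fin` timeouts in place, the long idle timeouts still apply to healthy sessions, while half-closed connections left behind by vanished clients are expected to be released after a few seconds.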
Detailed Description of the Problem
We observe a steady growth in the number of sockets in CLOSE_WAIT state from the moment HAProxy 2.6.16 starts. It continues until the total number of sockets reaches maxconn, at which point the problem becomes worse: new legitimate connections are no longer accepted because the CLOSE_WAIT ones fill all the slots.
haproxy -version
HAProxy version 2.6.16-c6a7346 2023/12/13 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2027.
Known bugs: http://www.haproxy.org/bugs/bugs-2.6.16.html
Running on: Linux 4.18.0-477.27.1.el8_8.x86_64 #1 SMP Thu Aug 31 10:29:22 EDT 2023 x86_64
netstat -a -n -o -l -p | grep CLOSE_WAIT | grep 1037/haproxy | wc -l
436
Expected Behavior
No accumulation of CLOSE_WAIT sockets.
Steps to Reproduce the Behavior
Do you have any idea what may have caused this?
No response
Do you have an idea how to solve the issue?
No response
What is your configuration?
Output of haproxy -vv
Last Outputs and Backtraces
No response
Additional Information
No response