haproxy / haproxy

HAProxy Load Balancer's development branch (mirror of git.haproxy.org)
https://git.haproxy.org/

Performance regression on 2.4 branch #1463

Closed. arouene closed this issue 2 years ago.

arouene commented 2 years ago

Detailed Description of the Problem

We recently migrated from HAProxy 2.3.16 to HAProxy 2.4.8 on our infrastructure.

We immediately noticed a degradation in response latency, with HAProxy adding up to 30 seconds of latency.

(The yellow curve is the %Ta time from the HAProxy logs) Screenshot from 2021-11-25 15-29-26

This degradation comes with an increase in TCP TIME_WAIT connections.

(On the left side is HAProxy 2.3.16, on the right side of the graph is HAProxy 2.4.8) Screenshot from 2021-11-25 15-57-13

It also comes with a decrease in established TCP connections.

(On the left side is HAProxy 2.3.16, on the right side of the graph is HAProxy 2.4.8) Screenshot from 2021-11-25 15-57-33

We tested versions 2.0.13, 2.3.1 and 2.3.16 without any problem. Then, without any modification to the configuration file or the OS, we tested versions 2.4.0 and 2.4.8, which both exhibited the problem described above.

Expected Behavior

No more latency than with the HAProxy 2.3 branch.

Steps to Reproduce the Behavior

Migration steps from 2.3 to 2.4

  1. HAProxy 2.3.16 is running
  2. Compiling HAProxy 2.4.8 (with make TARGET=linux-glibc MALLOC=libc USE_OPENSSL=1 USE_PCR2_JIT=1 USE_LUA=1 USE_SYSTEMD=1)
  3. Stopping the current HAProxy (with systemctl stop haproxy)
  4. Deploying the new HAProxy (make install)
  5. Starting the new HAProxy (systemctl start haproxy); the same sequence is condensed into shell form below
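
A condensed shell form of the migration (a sketch; the make flags are copied verbatim from step 2, and the service name matches the systemd unit shown further below):

    # build the new version with the reported flags, from the HAProxy source tree
    make TARGET=linux-glibc MALLOC=libc USE_OPENSSL=1 USE_PCR2_JIT=1 USE_LUA=1 USE_SYSTEMD=1
    # stop the running instance, install the new binary, and start it again
    systemctl stop haproxy
    make install
    systemctl start haproxy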

Do you have any idea what may have caused this?

The 2.4 changelog shows that TIME_WAIT handling has been reworked and improved; maybe this is a side effect, as we observe an increase of TIME_WAIT connections on our HAProxy 2.4.

Do you have an idea how to solve the issue?

No response

What is your configuration?

global
    log     /dev/log len 2048 format raw local0
    maxconn 10000
    user    haproxy
    group   haproxy
    stats socket ipv4@172.16.4.73:9999 level admin
    stats timeout 30s
    ssl-default-bind-options no-tlsv11 no-tlsv10 no-sslv3 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:DHE-DSS-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:DHE-DSS-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-CCM8:ECDHE-ECDSA-AES256-CCM:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES256-CCM8:DHE-RSA-AES256-CCM:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
    lua-load /usr/local/etc/haproxy/lua/robotstxt.lua

defaults
    mode                    http
    log                     global
    option                  tcp-check
    tcp-check               connect
    log-format              {\"appname\":\"haproxy\",\"backend_queue\":%bq,\"back_name\":\"%b\",\"back_server\":\"%s\",\"ipAddress\":\"%ci\",\"BytesSent\":%B,\"bytesRcvd\":%U,\"timeTaken\":%Ta,\"serverPort\":%fp,\"status\":\"%ST\",\"servername\":\"%[capture.req.hdr(0)]\",\"uri\":\"%HP\",\"query\":\"%HQ\",\"protocol\":\"%HV\",\"method\":\"%HM\",\"server_ip\":\"%bi\",\"frontend_ip\":\"%fi\",\"backend_ip\":\"%si\",\"ssl_version\":\"%sslv\",\"ssl_cipher_used\":\"%sslc\",\"authorization\":\"%[capture.req.hdr(1)]\",\"X-Correlation-ID\":\"%ID\",\"userAgent\":\"%[capture.req.hdr(2)]\"}

    timeout connect         5000
    timeout client          50000
    timeout server          2400000
    errorfile               503 /usr/local/etc/haproxy/errors/504.http
    errorfile               504 /usr/local/etc/haproxy/errors/504.http

frontend http-in
    bind *:80
    bind 172.16.4.71:443 ssl crt /etc/pki/tls/private/domain.pem alpn h2,http/1.1
    bind 172.16.4.73:443 ssl crt /etc/pki/tls/private/domain.pem alpn h2,http/1.1
    maxconn 10000
    option forwardfor except 127.0.0.1
    option accept-invalid-http-request
    stick-table type ip size 500k expire 60s store conn_cnt,conn_cur,conn_rate(60s),http_req_rate(60s),http_err_rate(60s)
    stick on cookie(authToken) table backend_default
    stick on hdr(Authorization) table bk_api
    acl DDOS_protection_get path_beg -i /getFile.ashx
    acl DDOS_protection_post path_beg -i /login
    acl METHOD_POST method POST
    acl domainLocal hdr_reg(host) ^.*\.domain\.local$
    acl has_api_key req.hdr(Authorization) -m found
    acl has_cookie_authToken hdr_sub(cookie) authToken
    acl http      ssl_fc,not
    acl ip_locales src 172.16.0.0/20 10.0.0.0/24
    acl is_app_mobile_user_agent capture.req.hdr(2) -m sub app-mobile
    acl method_options method OPTIONS
    acl network_allowed_ddos src 172.16.0.1/16
    acl static_content path_end  -i .jpg .jpeg .gif .png .bmp .svg .ttf .eot .css .js .htm .html .ico .woff .map
    tcp-request content track-sc0 src if { path_beg -i /getFile.ashx }
    tcp-request content track-sc0 src if !has_cookie_authToken !has_api_key  !network_allowed_ddos
    tcp-request content track-sc0 cookie(authToken) table backend_default if !static_content !network_allowed_ddos
    tcp-request content track-sc0 hdr(Authorization) table bk_api if !static_content  !network_allowed_ddos_paris  !network_allowed_ddos
    tcp-request inspect-delay 5s
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    http-request del-header X-Forwarded-For
    http-request add-header X-Cluster-Name c4 if static_path OR { req.hdr(Host),map(/usr/local/etc/haproxy/backend_cluster.map) -m found }
    http-request capture req.hdr(Host) id 0
    http-request capture req.hdr(Authorization) id 1
    http-request capture req.hdr(User-Agent) id 2
    http-request redirect scheme https if http !domainLocal !forceHTTPS
    http-response set-header X-Content-Type-Options nosniff
    http-response set-header X-Robots-Tag "noindex, nofollow, noarchive, nosnippet"
    http-response set-header Strict-Transport-Security max-age=31536000;\ includeSubdomains;\ preload
    http-response set-header Cache-Control no-cache,\ no-store,\ must-revalidate
    http-response set-header Content-Security-Policy "default-src 'self' *.domain.net"
    http-response set-header Content-Security-Policy upgrade-insecure-requests
    http-response set-header Permissions-Policy "microphone=()"
    http-response set-header Referrer-Policy no-referrer
    http-response del-header Server
    http-response del-header X-AspNet-Version
    http-response del-header X-AspNetMvc-Version
    http-response del-header X-Powered-By
    unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
    unique-id-header X-Correlation-ID
    capture request header Host len 64
    capture request header Authorization len 64
    capture request header User-Agent len 256
    declare capture request len 4000
    use_backend bk_app-c4-App.Banking if { path_beg -i /app-banking }
    default_backend backend_default

backend backend_default
    balance roundrobin
    fullconn 2200
    stick-table type string len 64 size 500k expire 60s store conn_cnt,conn_cur,conn_rate(60s),http_err_cnt,http_req_rate(60s),http_err_rate(60s)

backend bk_app-c4-App.Banking
    balance roundrobin
    fullconn 10000
    server prod-c4-iis10 172.16.4.19:80 maxconn 2500 check weight 100 agent-check agent-inter 2s agent-addr 172.16.4.19 agent-port 4243
    server prod-c4-iis11 172.16.4.20:80 maxconn 2500 check weight 100 agent-check agent-inter 2s agent-addr 172.16.4.20 agent-port 4243

[more backends very similar to this one]

listen stats
    bind *:8181
    stats enable
    stats uri /
    stats realm Haproxy\ Statistics
    stats auth user:pass

Output of haproxy -vv

HAProxy version 2.4.0-6cbbecf 2021/05/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.0.html
Running on: Linux 4.18.0-348.el8.x86_64 #1 SMP Tue Oct 19 15:14:17 UTC 2021 x86_64
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
  OPTIONS = USE_OPENSSL=1 USE_LUA=1 USE_SYSTEMD=1
  DEBUG   = 

Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 -CLOSEFROM -ZLIB +SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS -OT -QUIC -PROMEX -MEMORY_PROFILING

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=2).
Built with OpenSSL version : OpenSSL 1.1.1k  FIPS 25 Mar 2021
Running on OpenSSL version : OpenSSL 1.1.1k  FIPS 25 Mar 2021
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.4
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built without PCRE or PCRE2 support (using libc's regex instead)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 8.5.0 20210514 (Red Hat 8.5.0-3)

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP       side=FE|BE     mux=H2       flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
            fcgi : mode=HTTP       side=BE        mux=FCGI     flags=HTX|HOL_RISK|NO_UPG
       <default> : mode=HTTP       side=FE|BE     mux=H1       flags=HTX
              h1 : mode=HTTP       side=FE|BE     mux=H1       flags=HTX|NO_UPG
       <default> : mode=TCP        side=FE|BE     mux=PASS     flags=
            none : mode=TCP        side=FE|BE     mux=PASS     flags=NO_UPG

Available services : none

Available filters :
    [SPOE] spoe
    [CACHE] cache
    [FCGI] fcgi-app
    [COMP] compression
    [TRACE] trace

Last Outputs and Backtraces

No response

Additional Information

OS: Centos8-Stream

Systemd service:

[Unit]
Description=HAProxy loadbalancer
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStartPre=/usr/local/sbin/haproxy -f /usr/local/etc/haproxy/haproxy.cfg -c -q
ExecStart=/usr/local/sbin/haproxy -Ws -- /usr/local/etc/haproxy/haproxy.cfg
ExecReload=/usr/local/sbin/haproxy -f /usr/local/etc/haproxy/haproxy.cfg -c -q
ExecReload=/bin/kill -USR2 $MAINPID
TimeoutStartSec=30
TimeoutStopSec=30
Restart=always
User=haproxy
Group=haproxy
ProtectHome=yes
ProtectSystem=yes
NoNewPrivileges=yes
AmbientCapabilities=CAP_NET_BIND_SERVICE
LimitNOFILE=35000

[Install]
WantedBy=multi-user.target
lukastribus commented 2 years ago

2.4.9 was released yesterday and contains multiple fixes regarding connection teardown; you may want to give it a try.

arouene commented 2 years ago

Following your advice, we just tried version 2.4.9, with no more luck. The problem is still there...

capflam commented 2 years ago

Do you know if it concerns H1 or H2 connections, or both? Client or server connections?

arouene commented 2 years ago

So we restricted connections to the HTTP/2 protocol only, and then to HTTP/1 only; while it is better with H1 connections, it is still not perfect.

image
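
One way to restrict the frontend to a single protocol (a sketch, not necessarily how it was done here) is to trim the alpn list on the bind lines:

    bind 172.16.4.71:443 ssl crt /etc/pki/tls/private/domain.pem alpn http/1.1     # HTTP/1.1 only
    bind 172.16.4.71:443 ssl crt /etc/pki/tls/private/domain.pem alpn h2,http/1.1  # HTTP/2 allowed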

The huge number of FIN-WAIT-2 sockets we have is between HAProxy and its backends.

ss -tn | grep FIN-WAIT
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53466       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53783       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.14:61827       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.16:59372       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53585       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53457       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53643       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.14:61905       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53421       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53553       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.11:60750       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.11:60683       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.16:59299       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.15:62992       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.11:60672       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53764       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53587       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.11:60649       
...

The backends are IIS 10 servers; HAProxy uses H1 connections with them.
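
One way to check which side actually accumulates these sockets (a sketch, assuming a reasonably recent iproute2 ss): count FIN-WAIT-2 sockets separately for the client-facing side (local port 443) and the server-facing side (remote port 80).

    ss -tn state fin-wait-2 '( sport = :443 )' | tail -n +2 | wc -l   # frontend side (local :443)
    ss -tn state fin-wait-2 '( dport = :80 )'  | tail -n +2 | wc -l   # backend side (remote :80)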

arouene commented 2 years ago

(The yellow curve is the %Ta time from the HAProxy logs)

To diagnose a little further, we extracted the %TR time from the logs (the time to receive the full request, starting from the first byte). Surprisingly, that is where the time is spent (according to HAProxy): some requests take seconds before the full request is received from the client... For now, I am trying to capture those requests with tcpdump.
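
As a sketch (the interface, the output path, and the use of the frontend VIP from the configuration above are assumptions), a capture of the client-facing traffic can then be matched against the slow %TR values in the logs:

    tcpdump -i any -s 0 -w /tmp/haproxy-front.pcap host 172.16.4.71 and port 443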

capflam commented 2 years ago

Thanks, that could indeed be really helpful. It's surprising, because 2.4 is not so different from 2.3 regarding request parsing...

arouene commented 2 years ago

OK, so I may have something, but I don't know how to interpret it.

Here is a request (a view from Datadog, which ingests the HAProxy logs) image It says the total time of the request (Ta) is 2.7s, the client request (TR) takes 1.3s, and the backend responds in 1.4s.

Here is a trace of that request from tcpdump (172.16.3.71 is a virtual IP facing the clients, 172.16.3.73 is the server IP used to contact the backends) Screenshot from 2021-11-30 12-15-13 The request happens in an already-open TCP session; we can see the 1.3s gap between the last ACK and the packet carrying the request.

Does that really mean the request takes 1.3s to be sent? Could it be a problem in HAProxy's logging? I mean, the services don't feel any slower when we use them through HAProxy 2.4.

lukastribus commented 2 years ago

The huge number of FIN-WAIT-2 sockets we have is between HAProxy and its backends.

ss -tn | grep FIN-WAIT
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53466       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53783     

Seems like those sockets are from the frontend, not the backend. 172.16.4.71:443 is what you are binding haproxy to in the frontend.

arouene commented 2 years ago

172.16.4.71:443 is what you are binding haproxy to in the frontend.

Yes, you are right, but 172.16.4.10 is a private IP of one of our backends.

I don't know why the backends sometimes use the virtual IP, but now that you mention it, the FIN-WAIT-2 sockets are always on port 443 of the virtual IP.

arouene commented 2 years ago

The huge number of FIN-WAIT-2 sockets we have is between HAProxy and its backends.

ss -tn | grep FIN-WAIT
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53466       
FIN-WAIT-2 0      0               172.16.4.71:443             172.16.4.10:53783     

Seems like those sockets are from the frontend, not the backend. 172.16.4.71:443 is what you are binding haproxy to in the frontend.

So I got some information from my team: sometimes our backends go through HAProxy to contact other backends, so you are right, in this case they are the clients!

capflam commented 2 years ago

OK, so I may have something, but I don't know how to interpret it.

Here is a request (a view from Datadog, which ingests the HAProxy logs) image It says the total time of the request (Ta) is 2.7s, the client request (TR) takes 1.3s, and the backend responds in 1.4s.

Here is a trace of that request from tcpdump (172.16.3.71 is a virtual IP facing the clients, 172.16.3.73 is the server IP used to contact the backends) Screenshot from 2021-11-30 12-15-13 The request happens in an already-open TCP session; we can see the 1.3s gap between the last ACK and the packet carrying the request.

Does that really mean the request takes 1.3s to be sent? Could it be a problem in HAProxy's logging? I mean, the services don't feel any slower when we use them through HAProxy 2.4.

Here, the timing reported by HAProxy seems correct. The question is why it happens on 2.4.9 and not on 2.3.16. It may be just a logging issue, but I'm unable to reproduce this behavior. I need to check the code.

If you see a FIN-WAIT-2 on HAProxy, it means HAProxy is waiting for the connection close from the remote endpoint: HAProxy has closed the connection and this was acknowledged, but the peer has not yet closed the connection on its side. Here, the peer is the client application running on your backends. This may come from recent changes around shutdowns, but 2.3.16 is very similar to 2.4.9 on this point.

Regarding the difference in the number of established connections, it may be interesting to compare the connection reuse rate between 2.3.16 and 2.4.9. That may explain the difference.
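
A sketch of how this could be checked (assuming socat is available, using the admin stats socket declared in the global section, and assuming the per-backend connect/reuse counters exposed by show stat in recent releases):

    echo "show stat" | socat stdio tcp-connect:172.16.4.73:9999 | awk -F',' '
    NR == 1 { sub(/^# /, ""); for (i = 1; i <= NF; i++) col[$i] = i; next }   # map header names to column numbers
    $col["svname"] == "BACKEND" {                                             # one summary line per backend
        printf "%-30s connect=%s reuse=%s\n", $col["pxname"], $col["connect"], $col["reuse"]
    }'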

capflam commented 2 years ago

Honestly, I have no explanation for the timing difference you observed between 2.4.9 and 2.3.16.

arouene commented 2 years ago

Thanks anyway for investigating! I have no explanation either... I will keep one or two HAProxy 2.4 instances and stay on 2.3 for the rest for now. If I get more information I will update this issue, but honestly I don't know what to look for or where to look...

If you have any questions, don't hesitate to ask.

capflam commented 2 years ago

OK, I can explain the %TR difference for HTTP/2 clients. On 2.3 and lower, it is always 0, unless your configuration has an option that waits for the request payload. On those versions, the time to receive and parse the request headers is included in the idle time (%Ti). Since 2.4, for HTTP/2 streams, the idle time is never set because there is no easy way to measure it per stream; %TR is thus the time to receive the request, counted from the moment the stream is created.

I have no explanation for HTTP/1 clients, but I don't know whether there is a difference there. Looking again at your graphs: the first one, about %Ta, mixes HTTP/2 and HTTP/1 clients, so the difference may be explained by the HTTP/2 changes alone. On the last graph, you compare HTTP/1 and HTTP/2 on 2.4 only. I suspect there is no HTTP/1 difference between 2.3 and 2.4.

To be sure, you may graph the idle time (%Ti) and the request time (%TR) for HTTP/2 clients; the sum of both should be more or less the same between 2.3 and 2.4. For HTTP/1 clients, graphing the request time (%TR) is enough to verify whether there is any difference between versions.
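
For example, the existing log-format from the defaults section could be extended with the two timers (a sketch; the JSON field names are arbitrary and the other fields from the original log-format can be kept):

    log-format {\"appname\":\"haproxy\",\"timeTaken\":%Ta,\"idleTime\":%Ti,\"requestTime\":%TR}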

capflam commented 2 years ago

I'm closing this issue with the "works as designed" label because it seems there is no bug here. Feel free to reopen it if I'm wrong.