haproxy / haproxy

HAProxy Load Balancer's development branch (mirror of git.haproxy.org)
https://git.haproxy.org/

3.0.2 segfault #2617

Open idl0r opened 6 days ago

idl0r commented 6 days ago

Detailed Description of the Problem

3.0.2 crashed for the first time since I deployed it on all our LBs. It had been running for about a week.

Expected Behavior

No crash

Steps to Reproduce the Behavior

N/A

Do you have any idea what may have caused this?

No response

Do you have an idea how to solve the issue?

No response

What is your configuration?

Can share specific parts on request

Output of haproxy -vv

HAProxy version 3.0.2-a45a8e6 2024/06/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2029.
Known bugs: http://www.haproxy.org/bugs/bugs-3.0.2.html
Running on: Linux 5.10.0-30-amd64 #1 SMP Debian 5.10.218-1 (2024-06-01) x86_64
Build options :
  TARGET  = linux-glibc
  CC      = cc
  CFLAGS  = -O2 -g -fwrapv
  OPTIONS = USE_LIBCRYPT=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB= USE_SLZ=1 USE_NS= USE_SYSTEMD=1 USE_PROMEX=1 USE_PCRE= USE_PCRE_JIT= USE_PCRE2=1 USE_PCRE2_JIT=
  DEBUG   = 

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER -NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=48).
Built with OpenSSL version : OpenSSL 1.1.1w  11 Sep 2023
Running on OpenSSL version : OpenSSL 1.1.1w  11 Sep 2023
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with the Prometheus exporter as a service
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.36 2020-12-04
PCRE2 library supports JIT : no (USE_PCRE2_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 10.2.1 20210110

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG

Available services : prometheus-exporter
Available filters :
    [BWLIM] bwlim-in
    [BWLIM] bwlim-out
    [CACHE] cache
    [COMP] compression
    [FCGI] fcgi-app
    [SPOE] spoe
    [TRACE] trace

Last Outputs and Backtraces

Jun 24 17:38:15.049 n095191 haproxy[34512]: [NOTICE]   (34512) : haproxy version is 3.0.2-a45a8e6
Jun 24 17:38:15.049 n095191 haproxy[34512]: [NOTICE]   (34512) : path to executable is /usr/sbin/haproxy
Jun 24 17:38:15.049 n095191 haproxy[34512]: [ALERT]    (34512) : Current worker (3306046) exited with code 139 (Segmentation fault)
Jun 24 17:38:15.049 n095191 haproxy[34512]: [WARNING]  (34512) : A worker process unexpectedly died and this can only be explained by a bug in haproxy or its dependencies.
Jun 24 17:38:15.049 n095191 haproxy[34512]: Please check that you are running an up to date and maintained version of haproxy and open a bug report.
Jun 24 17:38:15.049 n095191 haproxy[34512]: HAProxy version 3.0.2-a45a8e6 2024/06/14 - https://haproxy.org/
Jun 24 17:38:15.049 n095191 haproxy[34512]: Status: long-term supported branch - will stop receiving fixes around Q2 2029.
Jun 24 17:38:15.049 n095191 haproxy[34512]: Known bugs: http://www.haproxy.org/bugs/bugs-3.0.2.html
Jun 24 17:38:15.049 n095191 haproxy[34512]: Running on: Linux 5.10.0-30-amd64 #1 SMP Debian 5.10.218-1 (2024-06-01) x86_64
Jun 24 17:38:15.049 n095191 haproxy[34512]: [ALERT]    (34512) : exit-on-failure: killing every processes with SIGTERM
Jun 24 17:38:15.077 n095191 haproxy[34512]: [WARNING]  (34512) : Former worker (2111438) exited with code 143 (Terminated)
Jun 24 17:38:15.079 n095191 haproxy[34512]: [WARNING]  (34512) : Former worker (3178501) exited with code 143 (Terminated)
Jun 24 17:38:15.079 n095191 haproxy[34512]: [WARNING]  (34512) : All workers exited. Exiting... (139)

https://gist.github.com/idl0r/9c38bdd501a2e1cbdbbef75ba85dc057
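For readers decoding the exit codes in the log above: they appear to follow the usual shell convention of 128 plus the signal number, so 139 corresponds to signal 11 (SIGSEGV) and 143 to signal 15 (SIGTERM), which matches the "Segmentation fault" and "Terminated" labels printed by the master. A minimal Python sketch of that arithmetic, where `describe_exit_code` is a hypothetical helper name and not part of HAProxy:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode a shell-style exit status, assuming 128 + N means 'killed by signal N'."""
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    # Exit codes taken from the log lines above.
    for code in (139, 143):
        print(code, describe_exit_code(code))
```

Under that assumption, only worker 3306046 actually crashed; the two former workers were terminated by the master as a consequence of exit-on-failure.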

Additional Information

I couldn't find anything useful in the logs other than what I posted. For example, there is nothing suspicious about worker 3306046, which is kind of weird. The last log entries for that particular worker show it doing some SPOE for our Coraza setup.

I can share the coredump on request.

capflam commented 5 days ago

Hard to say if it is related, but because you are using stick-tables, you may want to take a look at #2611. It is a double-free issue. I proposed a patch (https://github.com/haproxy/haproxy/issues/2611#issuecomment-2188111763). I will merge it, but in the meantime you can apply it on top of 3.0.2 to check whether it also fixes your issue.

idl0r commented 5 days ago

I wasn't sure if it's really that one, but I was thinking about it too. I applied the mentioned patch and am currently rolling out the patched version. But since it took about a week for this one crash to occur, it might take several days again.