Open ahuston-0 opened 11 months ago
If you still encounter the issue please file an issue on HAProxy directly (https://github.com/haproxy/haproxy/issues)
"Thread X is about to kill the process" means the haproxy watchdog noticed that one thread has become unresponsive and, to prevent further issues, decided to abort the process.
Since you're deploying using the :latest tag, it's very likely that you are hitting a bug or limitation which only happens on haproxy 2.9 (which was just released, see https://github.com/docker-library/haproxy/commit/81e9df259751f1ec391f5c45905c198145e1cc0f) and didn't show up with the previous version (2.8).
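To rule out a regression in a new release, one option is to pin the image tag rather than track :latest. A minimal sketch of what that could look like in docker-compose.yml; the service name `haproxy` is an assumption (the actual compose file is not shown in this thread), and 2.8 is simply the previous version mentioned above:

```yaml
services:
  haproxy:                 # assumed service name, not taken from the original compose file
    # Pin a specific release so a new major version is not pulled in silently.
    image: haproxy:2.8     # previous version named in this comment; adjust as needed
```

If the crash disappears with the pinned tag, that points at a change in the newer release rather than at the configuration.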
For reference, I've managed to get a copy of my haproxy config running with the nixos haproxy service, so the issue is isolated to docker.
I think I've seen the same on haproxy lts but let me file a bug with upstream. Thanks
Any news about this issue? Do you still encounter the crash with the :latest or :2.9.2 tag, where some bugs related to high CPU usage were addressed?
Thanks
Hi, I've upgraded to the latest version and will get back in a few hours. I did discover, though, that the original issue that was causing these logs is some memory/FD bug. If I don't add a nofile limit to the container, it consumes about 200GB of memory and then crashes (this is on a server with ~256GB of memory available).
This is what I added:

    ulimits:
      nofile:
        soft: 1024
        hard: 4096
Does it instantly ramp up to 200GB of used memory or is it slowly getting there, which would suggest a leak somewhere?
In the first case, maybe this is not related to haproxy itself but to a docker engine update (which removed or increased an existing limit), see https://github.com/haproxy/haproxy/issues/2043. If that's the case, you could also mitigate using the maxconn or fd-hard-limit global parameters in the haproxy config file.
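For illustration, the two mitigations mentioned above could look roughly like this in the global section of haproxy.cfg. This is a sketch, not a tuned configuration: both numbers are placeholders and need to be sized for the actual workload.

```haproxy
global
    # Option A: cap the number of concurrent connections explicitly,
    # so haproxy does not size itself from a very large nofile limit.
    maxconn 60000          # placeholder value

    # Option B (alternative): cap the number of file descriptors haproxy
    # will use, regardless of the limit inherited from the container.
    # fd-hard-limit 50000  # placeholder value
```

As I understand it, either setting keeps haproxy from deriving its connection and FD tables from the container's inherited nofile limit, which is where the memory goes when that limit is very high.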
It was happening over the span of like a minute or two. I can probably time it and get back. I'll check out those two settings and see if they help.
Regarding the CPU utilization issue, I did the upgrade last night and 12 hours in we're at 0.6% CPU so I think that one might have worked.
It looks like this ended up working. I removed the ulimits settings in docker-compose.yml and replaced them with maxconn 60000 in haproxy.cfg, and now there's no more high memory utilization or crashing. As for the other issue, upgrading to 2.9.2 seems to have fixed the high CPU utilization.
Great news, thanks for sharing your positive results with us.
Can we close this as solved?
Not sure if I should file this here or with the upstream, but my haproxy setup was working fine for months until a few days ago. I'm not sure what changed, but now it crashes with "Thread 2 is about to kill the process" and then no traffic goes through.
Environment Details
OS: NixOS 23.11
Docker 24.0.5
Storage is OverlayFS2 on ZFS 2.2.2
HAProxy config
Docker Compose file
Crash logs
haproxy-1 | [NOTICE] (1) : New worker (8) forked
haproxy-1 | [NOTICE] (1) : Loading success.
haproxy-1 | Thread 2 is about to kill the process.
haproxy-1 | Thread 1 : id=0x7fc626c57100 act=1 glob=0 wq=0 rq=1 tl=0 tlsz=0 rqsz=1
haproxy-1 | 1/1 stuck=0 prof=0 harmless=0 isolated=0
haproxy-1 | cpu_ns: poll=0 now=6035627479 diff=6035627479
haproxy-1 | curr_task=0
haproxy-1 | *>Thread 2 : id=0x7fc626c4b700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
haproxy-1 | 1/2 stuck=1 prof=0 harmless=0 isolated=0
haproxy-1 | cpu_ns: poll=97660 now=2001614429 diff=2001516769
haproxy-1 | curr_task=0
haproxy-1 | call trace(10):
haproxy-1 | | 0x564a2f23d661 [eb ba 66 66 2e 0f 1f 84]: ha_thread_dump+0x91/0x93
haproxy-1 | | 0x564a2f23d8a1 [64 48 8b 53 10 64 48 8b]: ha_panic+0x111/0x3f4
haproxy-1 | | 0x7fc626f86140 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13140
haproxy-1 | | 0x564a2f284282 [48 83 7a 28 00 74 36 48]: fd_reregister_all+0x32/0xb1
haproxy-1 | | 0x564a2f0b5709 [b8 01 00 00 00 5b 5d 41]: main+0x4ec9
haproxy-1 | | 0x564a2f21345e [85 c0 0f 84 5a 03 00 00]: main+0x162c1e
haproxy-1 | | 0x7fc626f7aea7 [64 48 89 04 25 30 06 00]: libpthread:+0x7ea7
haproxy-1 | | 0x7fc626e9aa2f [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a