JoeDog / siege

Siege is an http load tester and benchmarking utility
GNU General Public License v3.0

[Issue/HELP] Siege Hang after 'Lifting the server siege...' print out #225

Closed stkeke closed 1 year ago

stkeke commented 1 year ago

I may not have provided enough data for troubleshooting. If you need more information, please let me know and I can reproduce the problem and collect it.

Description

We are using the latest siege to benchmark our Nginx/WordPress/MariaDB container group on Ubuntu 22.04 Linux with 132 threads. Siege frequently hangs after printing 'Lifting the server siege...' when run with the command /usr/local/bin/siege -c200 -t90S -i -b -f/tmp/siege-script-urls-list-1690350042.txt

  1. Is 200 too large a value for the -c option?
  2. For the GDB backtrace of the hung siege, see the [Hang and Killed] section; for detailed debug information, see the [GDB breakpoint and backtrace] section.

Siege Success and Hang

Script

for i in {1..10}; do 
    date; 
    timeout -s 9 120 siege -t90S -c200 -i -b -f/tmp/siege-script-urls-list-1690350042.txt; 
    date; 
done
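
The loop above can also count hangs automatically. This is a minimal sketch, assuming GNU coreutils `timeout`: when the deadline expires and the signal is 9 (SIGKILL), the reported exit status is 137 (128 + 9). The function names `classify_status` and `run_batch` are hypothetical helpers and are not part of siege.

```shell
# classify_status STATUS: map an exit status from `timeout -s 9` to a label.
# 137 = 128 + 9 (SIGKILL), i.e. the command was still running at the deadline.
classify_status() {
    case "$1" in
        0)   echo "ok" ;;
        137) echo "hang" ;;
        *)   echo "error($1)" ;;
    esac
}

# run_batch RUNS LIMIT CMD...: run CMD RUNS times, each under a LIMIT-second
# deadline, print one label per run and a final hang count.
run_batch() {
    runs=$1
    limit=$2
    shift 2
    i=1
    hangs=0
    while [ "$i" -le "$runs" ]; do
        timeout -s 9 "$limit" "$@"
        result=$(classify_status "$?")
        echo "run $i: $result"
        if [ "$result" = "hang" ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "hangs: $hangs/$runs"
}
```

For the benchmark in this report it could be invoked as `run_batch 10 120 siege -t90S -c200 -i -b -f/tmp/siege-script-urls-list-1690350042.txt`.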

Hang and Killed

Wed Jul 26 10:42:06 UTC 2023
SIEGE 4.1.7-b5
Preparing 200 concurrent users for battle.
The server is now under siege...
Lifting the server siege...Killed
Wed Jul 26 10:44:06 UTC 2023

If siege is not killed, two threads are left running (observed in another benchmark run with -c 120).


It looks like a few getaddrinfo() calls in some threads never returned, so siege could not exit cleanly. GDB bt output (see below for more information):

Thread 3 (LWP 485478 "siege"):
#0  0x00007f776004e18c in __lll_lock_wait_private () from target:/lib64/libc.so.6
#1  0x00007f7760174b89 in __check_pf () from target:/lib64/libc.so.6
#2  0x00007f7760143324 in getaddrinfo () from target:/lib64/libc.so.6
#3  0x0000000000416841 in new_socket (C=C@entry=0x7f771c000b60, hostparam=<optimized out>, portparam=portparam@entry=8080) at sock.c:156
#4  0x0000000000406946 in __init_connection (this=this@entry=0xc6a860, U=U@entry=0xc67560) at browser.c:931
#5  0x00000000004070f0 in __http (U=0xc67560, this=0xc6a860) at browser.c:478
#6  __request (this=this@entry=0xc6a860, U=U@entry=0xc67560) at browser.c:414
#7  0x00000000004085cd in start (this=0xc6a860) at browser.c:296
#8  0x000000000040b215 in crew_thread (crew=0xcae9b0) at crew.c:141
#9  0x00007f7760f911ca in start_thread () from target:/lib64/libpthread.so.0
#10 0x00007f7760063e73 in clone () from target:/lib64/libc.so.6
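
The leftover threads can also be listed without attaching gdb. This is a minimal sketch, assuming a Linux /proc filesystem; `list_threads` is a hypothetical helper name and the output format is chosen here for illustration only.

```shell
# list_threads PID: print "TID STATE NAME" for every thread of process PID,
# read from /proc/PID/task (Linux only).
list_threads() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -r "$task/status" ] || continue
        tid=${task##*/}
        # The State: line looks like "State:  S (sleeping)"; keep the letter.
        state=$(awk '/^State:/ {print $2}' "$task/status")
        name=$(cat "$task/comm")
        echo "$tid $state $name"
    done
}
```

Running `list_threads "$(pgrep -x siege)"` against a hung siege should show the remaining thread IDs, which correspond to the LWP numbers in the gdb output.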

Success

Wed Jul 26 10:40:35 UTC 2023
SIEGE 4.1.7-b5
Preparing 200 concurrent users for battle.
The server is now under siege...
Lifting the server siege...
Transactions: 498627 hits
Availability: 100.00 %
Elapsed time: 90.91 secs
Data transferred: 2740.13 MB
Response time: 36.35 ms
Transaction rate: 5484.84 trans/sec
Throughput: 30.14 MB/sec
Concurrency: 199.39
Successful transactions: 355959
Failed transactions: 0
Longest transaction: 1150.00 ms
Shortest transaction: 0.00 ms
Wed Jul 26 10:42:06 UTC 2023

Siege Debug Build

SIEGE_REPO=https://github.com/JoeDog/siege.git
# Export CFLAGS so that ./configure and make actually see the debug flags.
export CFLAGS='-g -O0'
git clone ${SIEGE_REPO} && \
    cd siege && \
    sed -i "s|limit = 255|limit = 1024|" ./doc/siegerc.in && \
    sed -i "s|parser = true|parser = false|" ./doc/siegerc.in && \
    sed -i "s|verbose = true|verbose = false|" ./doc/siegerc.in && \
    sed -i "s|logging = true|logging = false|" ./doc/siegerc.in && \
    utils/bootstrap && \
    ./configure --with-ssl=/usr/bin/openssl && \
    make -j && \
    make install

Siege Version

[root@a864832e0b79 siege]# siege -v
SIEGE 4.1.7-b5

GDB breakpoint and backtrace

I compiled siege with debug options; here is my gdb session. If siege is not killed, two threads are left running (observed in another benchmark run with -c 120).

(gdb) thread apply all bt
Thread 3 (LWP 485478 "siege"):
#0  0x00007f776004e18c in __lll_lock_wait_private () from target:/lib64/libc.so.6
#1  0x00007f7760174b89 in __check_pf () from target:/lib64/libc.so.6
#2  0x00007f7760143324 in getaddrinfo () from target:/lib64/libc.so.6
#3  0x0000000000416841 in new_socket (C=C@entry=0x7f771c000b60, hostparam=<optimized out>, portparam=portparam@entry=8080) at sock.c:156
#4  0x0000000000406946 in __init_connection (this=this@entry=0xc6a860, U=U@entry=0xc67560) at browser.c:931
#5  0x00000000004070f0 in __http (U=0xc67560, this=0xc6a860) at browser.c:478
#6  __request (this=this@entry=0xc6a860, U=U@entry=0xc67560) at browser.c:414
#7  0x00000000004085cd in start (this=0xc6a860) at browser.c:296
#8  0x000000000040b215 in crew_thread (crew=0xcae9b0) at crew.c:141
#9  0x00007f7760f911ca in start_thread () from target:/lib64/libpthread.so.0
#10 0x00007f7760063e73 in clone () from target:/lib64/libc.so.6

Thread 2 (LWP 485456 "siege"):
#0  0x00007f776004e18c in __lll_lock_wait_private () from target:/lib64/libc.so.6
#1  0x00007f7760174b89 in __check_pf () from target:/lib64/libc.so.6
#2  0x00007f7760143324 in getaddrinfo () from target:/lib64/libc.so.6
#3  0x0000000000416841 in new_socket (C=C@entry=0x7f7588000b60, hostparam=<optimized out>, portparam=portparam@entry=8080) at sock.c:156
#4  0x0000000000406946 in __init_connection (this=this@entry=0xca9da0, U=U@entry=0xc67cb0) at browser.c:931
#5  0x00000000004070f0 in __http (U=0xc67cb0, this=0xca9da0) at browser.c:478
#6  __request (this=this@entry=0xca9da0, U=U@entry=0xc67cb0) at browser.c:414
#7  0x00000000004085cd in start (this=0xca9da0) at browser.c:296
#8  0x000000000040b215 in crew_thread (crew=0xcae9b0) at crew.c:141
#9  0x00007f7760f911ca in start_thread () from target:/lib64/libpthread.so.0
#10 0x00007f7760063e73 in clone () from target:/lib64/libc.so.6

Thread 1 (LWP 485364 "siege"):
#0  0x00007f7760f926cd in __pthread_timedjoin_ex () from target:/lib64/libpthread.so.0
#1  0x000000000040b7b9 in crew_join (crew=crew@entry=0xcae9b0, finish=finish@entry=boolean_true, payload=payload@entry=0x7fff794695b0) at crew.c:280
#2  0x0000000000403965 in main (argc=<optimized out>, argv=<optimized out>) at main.c:507

stkeke commented 1 year ago

This issue is similar to https://github.com/JoeDog/siege/issues/4 and is fixed on the glibc side by the patch at https://sourceware.org/pipermail/libc-alpha/2023-April/147654.html
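
Since the linked fix was posted to the glibc development list (libc-alpha), it may help to record which C library a test machine or container runs. This is a minimal sketch, assuming `getconf` is available; `libc_version` is a hypothetical helper name. Distributions may backport fixes, so the version number alone is only a hint.

```shell
# libc_version: print the glibc version reported by getconf, or a fallback
# message when the system is not glibc-based or getconf is unavailable.
libc_version() {
    if ver=$(getconf GNU_LIBC_VERSION 2>/dev/null); then
        echo "$ver"
    else
        echo "unknown (not glibc, or getconf unavailable)"
    fi
}

libc_version
```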

stkeke commented 1 year ago

Closing this issue.