m-lab / ndt

Network Diagnostic Tool

Web100srv segmentation fault with multiple test support and port range #76

Closed maximumG closed 7 years ago

maximumG commented 7 years ago

I am running the web100srv binary with support for multiple concurrent tests and a defined range of TCP ports for the NDTP tests:

web100srv --log /var/log/web100srv.log --multiple --mrange=5001:5100

When both --multiple and --mrange are set, a child process crashes with a segmentation fault during a test. This happens with both the WebSocket-based client (ndt-sonar) and the web100clt utility.

The only way to make concurrent tests work is to omit the predefined TCP port range and let the system choose a TCP port for each connection. The server log from a failing run is below:
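To make the two port-selection modes concrete, here is a minimal C sketch of the difference between scanning a fixed range (as --mrange=5001:5100 requests) and letting the kernel assign a port. It is not taken from the NDT sources: the helper names listen_on_port, listen_in_range, and bound_port are hypothetical, and the sketch binds to the loopback address only, for illustration.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of NDT): open a TCP listener on `port`.
 * port == 0 asks the kernel to pick a free ephemeral port.
 * Returns the socket descriptor, or -1 on failure. */
int listen_on_port(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* loopback only, for the sketch */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: try each port in [lo, hi] and return the first
 * listener that binds, or -1 if every port is taken (or lo > hi). */
int listen_in_range(uint16_t lo, uint16_t hi) {
    for (uint32_t p = lo; p <= hi; p++) { /* 32-bit counter avoids wraparound at 65535 */
        int fd = listen_on_port((uint16_t)p);
        if (fd >= 0) return fd;
    }
    return -1;
}

/* Report the local port a listener ended up on, or 0 on error. */
uint16_t bound_port(int fd) {
    assert(fd >= 0);
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```

With these helpers, listen_in_range(5001, 5100) corresponds to the configured range, while listen_on_port(0) followed by bound_port() corresponds to the mode in which the system chooses the port. Two concurrent listeners obtained from the same range land on different ports, because the second bind on an occupied port fails and the scan moves on.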

[2017-02-09T11:28:08.168722Z pid=20931 loglevel=1        web100srv.c:2454] ANL/Internet2 NDT ver 4.0.0.1
[2017-02-09T11:28:08.168792Z pid=20931 loglevel=1        web100srv.c:2455]  Variables file = /usr/local/ndt/web100_variables
[2017-02-09T11:28:08.168812Z pid=20931 loglevel=1        web100srv.c:2456]  log file = /var/log/web100srv.log
[2017-02-09T11:28:08.168817Z pid=20931 loglevel=1        web100srv.c:2464]  Debug level set to 5
[2017-02-09T11:28:08.168849Z pid=20931 loglevel=3        web100srv.c:2466]  Extended tests options:
[2017-02-09T11:28:08.168856Z pid=20931 loglevel=3        web100srv.c:2469]       * upload: duration = 10000, streams = 1, throughput snapshots: enabled = false, delay = 5000, offset = 1000
[2017-02-09T11:28:08.168862Z pid=20931 loglevel=3        web100srv.c:2472]       * download: duration = 10000, streams = 1, throughput snapshots: enabled = false, delay = 5000, offset = 1000
[2017-02-09T11:28:08.168981Z pid=20931 loglevel=5          network.c:196 ] Send buffer initialized to 87380, 
[2017-02-09T11:28:08.168994Z pid=20931 loglevel=5          network.c:200 ] Receive buffer initialized to 87380
[2017-02-09T11:28:08.169003Z pid=20931 loglevel=1        web100srv.c:2518] server ready on port 3001 (family 0)
[2017-02-09T11:28:08.169121Z pid=20931 loglevel=1      web100-util.c:135 ] web100_init() read 69 variables from file
[2017-02-09T11:28:12.424620Z pid=20931 loglevel=5        web100srv.c:1715] Parent process spawned child = 20932
[2017-02-09T11:28:12.424665Z pid=20931 loglevel=5        web100srv.c:1717] Parent thinks pipe() returned fd0=6, fd1=7
[2017-02-09T11:28:12.424817Z pid=20931 loglevel=5        web100srv.c:1715] Parent process spawned child = 20933
[2017-02-09T11:28:12.424843Z pid=20931 loglevel=5        web100srv.c:1717] Parent thinks pipe() returned fd0=6, fd1=8
[2017-02-09T11:28:12.427528Z pid=20932 loglevel=4        web100srv.c:1690] New connection received from 0x1c24090 [maxime.e-tera.com] sockfd=8.
[2017-02-09T11:28:12.427633Z pid=20932 loglevel=5          logging.c:561 ] Protocol logging is not enabled
[2017-02-09T11:28:12.427640Z pid=20932 loglevel=4        web100srv.c:1705] Child thinks pipe() returned fd0=6, fd1=7 for pid=0
[2017-02-09T11:28:12.430472Z pid=20932 loglevel=1      testoptions.c:340 ] Client version: 3.5.5-

[2017-02-09T11:28:12.430495Z pid=20932 loglevel=1      testoptions.c:355 ] Client connect received from :IP 192.168.4.223 to some server on socket 8
[2017-02-09T11:28:12.431275Z pid=20932 loglevel=5          logging.c:718 ] Protocol logging is not enabled
[2017-02-09T11:28:12.433855Z pid=20932 loglevel=3        web100srv.c:1536] Valid test sequence requested, run test for client=20932
[2017-02-09T11:28:12.433889Z pid=20932 loglevel=4        web100srv.c:738 ] Child process 20932 started
[2017-02-09T11:28:12.434864Z pid=20932 loglevel=5          logging.c:718 ] Protocol logging is not enabled
[2017-02-09T11:28:12.434884Z pid=20932 loglevel=3        web100srv.c:758 ] run_test() routine, asking for test_suite = 2 4 32
[2017-02-09T11:28:12.434893Z pid=20932 loglevel=5          logging.c:718 ] Protocol logging is not enabled
[2017-02-09T11:28:12.434897Z pid=20932 loglevel=1        web100srv.c:762 ] Starting test suite:
[2017-02-09T11:28:12.434902Z pid=20932 loglevel=1        web100srv.c:770 ]  > C2S throughput test
[2017-02-09T11:28:12.434907Z pid=20932 loglevel=1        web100srv.c:776 ]  > S2C throughput test
[2017-02-09T11:28:12.434911Z pid=20932 loglevel=1        web100srv.c:782 ]  > META test
[2017-02-09T11:28:12.434946Z pid=20932 loglevel=1     test_c2s_srv.c:313 ]  <-- 20932 - C2S throughput test -->
[2017-02-09T11:28:12.434980Z pid=20932 loglevel=5          logging.c:508 ] Protocol logging is not enabled
[1486636092 SIGNAL_HANDLER pid=20932 loglevel=1 signal=11 (Segmentation fault)] Received a signal
[1486636092 SIGNAL_HANDLER pid=20931 loglevel=1 signal=17 (Child exited)] Received a signal
[1486636092 SIGNAL_HANDLER pid=20931 loglevel=5 signal=17 (Child exited)] Signal 17 (SIGCHLD) received - completed tests
pboothe commented 7 years ago

Please put this bug report at the parent repo - https://github.com/ndt-project/ndt/issues

maximumG commented 7 years ago

Okay, I will open the issue on the parent NDT repository. Just out of curiosity, why file it there when your repository is more up to date than the official one?

Wouldn't it be possible to compile your version of NDT on a production server?

pboothe commented 7 years ago

M-Lab uses the --disable_extended_tests command-line option, so this bug doesn't affect our deployment.

maximumG commented 7 years ago

Just posted the issue on the parent repository.