DD1984 / sockperf

Automatically exported from code.google.com/p/sockperf

segmentation fault when running multithread process #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Host1: sockperf sr -f <feed> -F s --threads-num <>
2. Host2: sockperf pp -f <feed> -F s

The multithreaded server process crashes with a segmentation fault.
This happens with both TCP and UDP.

What is the expected output? What do you see instead?
Segmentation fault on server side

What version of the product are you using? On what operating system?
2.5.27, RH6

Please provide any additional information below.

Output:

sockperf:  == version #2.5.27 ==                                     
sockperf: No VMA version info                                        
sockperf: [SERVER] listen on:                                        
[ 0] IP = 1.1.1.1         PORT = 12347 # UDP                         
sockperf: Warmup stage (sending a few dummy packets)...              
sockperf: [tid 19668] using select() to block on socket(s)           
sockperf: [SERVER] listen on:                                        
[ 0] IP = 1.1.1.1         PORT = 12349 # UDP                         
sockperf: Warmup stage (sending a few dummy packets)...              
sockperf: [tid 19670] using select() to block on socket(s)           
sockperf: [SERVER] listen on:                                        
[ 0] IP = 1.1.1.1         PORT = 12348 # UDP                         
sockperf: Warmup stage (sending a few dummy packets)...              
sockperf: [tid 19669] using select() to block on socket(s)           
sockperf: [SERVER] listen on:                                        
[ 0] IP = 1.1.1.1         PORT = 12346 # UDP                         
sockperf: Warmup stage (sending a few dummy packets)...              
sockperf: [tid 19667] using select() to block on socket(s)           
sockperf: [SERVER] listen on:                                        
[ 0] IP = 1.1.1.1         PORT = 12345 # UDP                         
sockperf: Warmup stage (sending a few dummy packets)...              
sockperf: [tid 19666] using select() to block on socket(s)           
^Csockperf: Got signal 2 - exiting                                   
sockperf: Total 53040 messages received and handled                  
Segmentation fault (core dumped) 

(gdb) bt
#0  0x00000033af20c670 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000587622 in server_select_per_thread() ()              
#2  0x000000000059a371 in main (

Original issue reported on code.google.com by men...@gmail.com on 29 Mar 2011 at 9:06

GoogleCodeExporter commented 9 years ago
Igor, please note that Meny used a feed file with 5 sockets and ran the server 
with 5 threads.
After you fix it, please check whether this also fixes Alex Seydin's issue and 
let us know.

Original comment by avne...@gmail.com on 29 Mar 2011 at 9:54

GoogleCodeExporter commented 9 years ago

Original comment by igor.ivanov@itseez.com on 31 Mar 2011 at 7:27

GoogleCodeExporter commented 9 years ago
I cannot reproduce the issue you described with sockperf 2.5.25 on the 
following system:
- Linux version 2.6.18-164.el5 (mockbuild@builder10.centos.org) (gcc version 
4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Sep 3 03:28:30 EDT 2009
- (GNU libc) 2.5
Steps to reproduce:
- launch: $ sockperf sr -f udp_10.txt -F s --threads-num=10
- launch: $ sockperf pp -f udp_10.txt -F s
- press ^C on the server after the client completes

Judging by the backtrace (thanks for that), I suppose the fault could be 
related to the issue described at 
http://sourceware.org/bugzilla/show_bug.cgi?id=4509

Meny,
could you send more detailed information about the system you used, including 
the libc version?
In addition,
please check the latest version of sockperf on your system, since it should 
include additional validation of the TID before the pthread_kill() call.
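
To illustrate what validating a TID before pthread_kill() can look like, here is a minimal sketch. It is not sockperf's actual code: `WorkerSlot`, `start_worker`, `signal_worker`, `join_worker` and `idle_worker` are hypothetical names. It relies on the POSIX rule that a joinable thread's ID stays valid until the thread is joined, and that using the ID after that is undefined behavior. The sketch assumes the bookkeeping is touched only by the controlling thread.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <pthread.h>

// Hypothetical bookkeeping record; sockperf's real structures may differ.
struct WorkerSlot {
    pthread_t tid;
    bool joinable;  // true only between a successful pthread_create() and pthread_join()
};

// Trivial worker used only to exercise the helpers below.
void *idle_worker(void *) { return nullptr; }

// Start a worker and record whether its ID is now valid.
int start_worker(WorkerSlot &slot, void *(*fn)(void *), void *arg) {
    int rc = pthread_create(&slot.tid, nullptr, fn, arg);
    slot.joinable = (rc == 0);
    return rc;
}

// Signal a worker only if its ID is still valid; report ESRCH otherwise.
int signal_worker(const WorkerSlot &slot, int signo) {
    if (!slot.joinable) return ESRCH;
    return pthread_kill(slot.tid, signo);
}

// Join a worker at most once and invalidate the slot afterwards.
int join_worker(WorkerSlot &slot) {
    if (!slot.joinable) return ESRCH;
    int rc = pthread_join(slot.tid, nullptr);
    if (rc == 0) slot.joinable = false;
    return rc;
}
```

With this kind of guard, a shutdown path that runs after some threads were never started, or were already joined, skips pthread_kill() for them instead of passing a stale ID to the library.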

Original comment by igor.ivanov@itseez.com on 31 Mar 2011 at 1:51

GoogleCodeExporter commented 9 years ago
Meny,
Have you observed the issue on different systems? 
Could you please put a core dump of the failing case on bgate in /tmp and send 
information about the system and libc you used?
Thanks

Original comment by igor.ivanov@itseez.com on 5 Apr 2011 at 6:42

GoogleCodeExporter commented 9 years ago
Igor, I ran it on another OS with the same result.
Try ending the server process (^C) after the client finishes.

Original comment by men...@gmail.com on 5 Apr 2011 at 10:57

GoogleCodeExporter commented 9 years ago
Meny,
I believe I am doing the same (see comment #3). I have experimented with 
different combinations, such as 1000 records/10 threads, 10/10, 1000/100, etc.
A core dump might help to find the root of the issue. Please also send 
information about the systems you tried (the issue may be system-dependent).
Thanks

Original comment by igor.ivanov@itseez.com on 5 Apr 2011 at 11:49

GoogleCodeExporter commented 9 years ago
Here is the bt, hope it helps.

(gdb) thread 1
[Switching to thread 1 (Thread 32481)]#0  0x00000033af20c670 in pthread_kill () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000033af20c670 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005873bf in server_select_per_thread() ()
#2  0x000000000059a8f1 in main ()
(gdb) thread 2
[Switching to thread 2 (Thread 32482)]#0  0x00000033aeeda093 in select () from /lib64/libc.so.6
(gdb) bt
#0  0x00000033aeeda093 in select () from /lib64/libc.so.6
#1  0x000000000058c2e4 in Server<IoSelect, SwitchOff, SwitchOff>::doLoop() ()
#2  0x0000000000593932 in void server_handler<IoSelect, SwitchOff, SwitchOff>(int, int, int) ()
#3  0x00000000005878f1 in server_handler_for_multi_threaded(void*) ()
#4  0x00000033af2077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x00000033aeee153d in clone () from /lib64/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 32483)]#0  0x00000033aeeda093 in select () from /lib64/libc.so.6
(gdb) bt
#0  0x00000033aeeda093 in select () from /lib64/libc.so.6
#1  0x000000000058c2e4 in Server<IoSelect, SwitchOff, SwitchOff>::doLoop() ()
#2  0x0000000000593932 in void server_handler<IoSelect, SwitchOff, SwitchOff>(int, int, int) ()
#3  0x00000000005878f1 in server_handler_for_multi_threaded(void*) ()
#4  0x00000033af2077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x00000033aeee153d in clone () from /lib64/libc.so.6
(gdb) thread 4
[Switching to thread 4 (Thread 32488)]#0  0x00000033aeeda093 in select () from /lib64/libc.so.6
(gdb) bt
#0  0x00000033aeeda093 in select () from /lib64/libc.so.6
#1  0x000000000058c2e4 in Server<IoSelect, SwitchOff, SwitchOff>::doLoop() ()
#2  0x0000000000593932 in void server_handler<IoSelect, SwitchOff, SwitchOff>(int, int, int) ()
#3  0x00000000005878f1 in server_handler_for_multi_threaded(void*) ()
#4  0x00000033af2077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x00000033aeee153d in clone () from /lib64/libc.so.6

Original comment by men...@gmail.com on 6 Apr 2011 at 8:25

GoogleCodeExporter commented 9 years ago
Checked sockperf r2.5.27, r2.5.43

Fedora14:
Linux fedoravm 2.6.35.6-48.fc14.i686 #1 SMP Fri Oct 22 15:34:36 UTC 2010 i686 
i686 i386 GNU/Linux
(GNU Libc) 2.13

Ubuntu9.04:
Linux ubuntu 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:57:59 UTC 2009 i686 
GNU/Linux
(GNU libc) 2.9

Original comment by igor.ivanov@itseez.com on 6 Apr 2011 at 1:09

GoogleCodeExporter commented 9 years ago
Meny,
Thank you for the bt you sent, but it does not make the issue clearer.
Could you send the output of 'uname -a' and 'ldd --version' from the hosts you 
have used?
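
The same details can also be collected from inside a program, which may be handy for attaching them to a report automatically. The sketch below is only an illustration and not part of sockperf: `kernel_info` and `libc_info` are hypothetical helper names. `uname()` is POSIX, while `gnu_get_libc_version()` is specific to glibc, so it is guarded by `__GLIBC__`.

```cpp
#include <cassert>
#include <string>
#include <sys/utsname.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>
#endif

// Roughly the fields `uname -a` reports: kernel name, host, release, version, machine.
std::string kernel_info() {
    struct utsname u;
    if (uname(&u) != 0) return "unknown";
    return std::string(u.sysname) + " " + u.nodename + " " + u.release + " " +
           u.version + " " + u.machine;
}

// C library version string; only glibc exposes gnu_get_libc_version().
std::string libc_info() {
#ifdef __GLIBC__
    return std::string("GNU libc ") + gnu_get_libc_version();
#else
    return "non-glibc C library";
#endif
}
```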

Original comment by igor.ivanov@itseez.com on 6 Apr 2011 at 1:12

GoogleCodeExporter commented 9 years ago
*uname -a
Linux ronaldo3 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64 
x86_64 x86_64 GNU/Linux

*ldd --version
ldd (GNU libc) 2.11.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Original comment by men...@gmail.com on 7 Apr 2011 at 5:30

GoogleCodeExporter commented 9 years ago
I modified the thread termination procedure for the multithreaded server mode 
in r46.
Meny,
could you verify whether r46 fixes this issue?
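
The thread does not show what r46 actually changed. Purely as an illustration of one common way to terminate select()-based workers without pthread_kill(), here is a sketch under stated assumptions: `g_stop`, `WorkerArgs`, `select_worker` and `stop_workers` are hypothetical names, and the 10 ms timeout is an arbitrary choice. Each worker rechecks a shared flag whenever select() returns, and the controlling thread sets the flag and then joins every worker.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical shutdown flag shared by all workers.
static std::atomic<bool> g_stop{false};

struct WorkerArgs {
    int fd;        // descriptor the worker waits on
    long wakeups;  // number of times select() returned
};

void *select_worker(void *p) {
    WorkerArgs *args = static_cast<WorkerArgs *>(p);
    while (!g_stop.load()) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(args->fd, &rfds);
        timeval tv;
        tv.tv_sec = 0;
        tv.tv_usec = 10 * 1000;  // 10 ms, so the flag is rechecked regularly
        int n = select(args->fd + 1, &rfds, nullptr, nullptr, &tv);
        ++args->wakeups;
        if (n > 0) {
            char buf[64];
            ssize_t got = read(args->fd, buf, sizeof buf);  // stand-in for message handling
            (void)got;
        }
    }
    return nullptr;
}

// Ask every worker to stop, then wait for each one; no signals are sent.
int stop_workers(pthread_t *tids, int count) {
    g_stop.store(true);
    int joined = 0;
    for (int i = 0; i < count; ++i) {
        if (pthread_join(tids[i], nullptr) == 0) ++joined;
    }
    return joined;
}
```

The trade-off is shutdown latency: a worker may take up to one timeout period to notice the flag, in exchange for never signalling a thread ID that might no longer be valid.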

Original comment by igor.ivanov@itseez.com on 8 Apr 2011 at 1:19

GoogleCodeExporter commented 9 years ago
Checked it. No segmentation fault.
Verified with r46.

Original comment by men...@gmail.com on 17 Apr 2011 at 9:16

GoogleCodeExporter commented 9 years ago
r46

Original comment by igor.ivanov@itseez.com on 18 Apr 2011 at 7:14