Segmentation fault in tcmalloc

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. This problem occurs randomly.

What is the expected output? What do you see instead?
The process crashes with a core dump.

What version of the product are you using? On what operating system?

gperftools 2.1
kernel 2.6.32_1-10-6-0, Red Hat

Please provide any additional information below.

The stack trace:
(gdb) bt
#0  0x000000302af2e2ed in raise () from /lib64/tls/libc.so.6
#1  0x000000302af2fa3e in abort () from /lib64/tls/libc.so.6
#2  0x000000000079115f in ccdb::OnSignal (signal_num=11) at 
unitserver/unit_server.cpp:1423
#3  <signal handler called>
#4  GetStackTrace (result=0x7298e560, max_depth=31, skip_count=0) at 
src/stacktrace_x86-inl.h:325
#5  0x0000000000b53e92 in DoSampledAllocation (size=48) at src/tcmalloc.cc:966
#6  0x0000000000b55be0 in (anonymous namespace)::do_malloc_no_errno 
(size=Variable "size" is not available.
) at src/tcmalloc.cc:1083
#7  0x0000000000b71d5f in tc_new (size=48) at src/tcmalloc.cc:1423
#8  0x0000000000a1a223 in std::_Rb_tree<unsigned long long, std::pair<unsigned 
long long const, baidu::hulu::saber::AsyncContext*>, 
std::_Select1st<std::pair<unsigned long long const, 
baidu::hulu::saber::AsyncContext*> >, std::less<unsigned long long>, 
std::allocator<std::pair<unsigned long long const, 
baidu::hulu::saber::AsyncContext*> > >::_M_insert (
    this=0x1ac2d08, __x=0x0, __p=0x7f8f0a01d320, __v=@0x7298ee10) at /usr/lib/gcc/x86_64-redhat-linux/3.4.5/../../../../include/c++/3.4.5/ext/new_allocator.h:81
#9  0x0000000000a1a306 in std::_Rb_tree<unsigned long long, std::pair<unsigned 
long long const, baidu::hulu::saber::AsyncContext*>, 
std::_Select1st<std::pair<unsigned long long const, 
baidu::hulu::saber::AsyncContext*> >, std::less<unsigned long long>, 
std::allocator<std::pair<unsigned long long const, 
baidu::hulu::saber::AsyncContext*> > >::insert_unique (
    this=0x1ac2d08, __v=@0x7298ee10) at /usr/lib/gcc/x86_64-redhat-linux/3.4.5/../../../../include/c++/3.4.5/bits/stl_pair.h:85
#10 0x0000000000a1948c in baidu::hulu::saber::ExecMan::DelayExec (this=0x1ac2d00, action=Variable "action" is not available.
) at /usr/lib/gcc/x86_64-redhat-linux/3.4.5/../../../../include/c++/3.4.5/bits/stl_map.h:360

(gdb) f 4
#4  GetStackTrace (result=0x7298e560, max_depth=31, skip_count=0) at 
src/stacktrace_x86-inl.h:325
325     src/stacktrace_x86-inl.h: No such file or directory.
        in src/stacktrace_x86-inl.h
(gdb) disassemble 
Dump of assembler code for function GetStackTrace(void**, int, int):
   0x0000000000b605d0 <+0>:     push   %rbp
   0x0000000000b605d1 <+1>:     mov    %edx,%r8d
   0x0000000000b605d4 <+4>:     mov    %rsp,%rbp
   0x0000000000b605d7 <+7>:     mov    %rbp,%rcx
   0x0000000000b605da <+10>:    xor    %r10d,%r10d
   0x0000000000b605dd <+13>:    test   %rcx,%rcx
   0x0000000000b605e0 <+16>:    setne  %dl
   0x0000000000b605e3 <+19>:    cmp    %esi,%r10d
   0x0000000000b605e6 <+22>:    setl   %al
   0x0000000000b605e9 <+25>:    test   %dl,%al
   0x0000000000b605eb <+27>:    je     0xb6063b <GetStackTrace(void**, int, int)+107>
   0x0000000000b605ed <+29>:    data32 xchg %ax,%ax
=> 0x0000000000b605f0 <+32>:    mov    0x8(%rcx),%r9
   0x0000000000b605f4 <+36>:    test   %r9,%r9

(gdb) p sp  
$4 = (void **) 0x72990770
(gdb) p *sp  
$6 = (void *) 0x0
(gdb) p $rcx
$5 = 1922631536

In the source code, file stacktrace_x86-inl.h, line 325:
  if (*(sp+1) == reinterpret_cast<void *>(0)) {

I can print *sp, which means I can access this address, so why does it cause the 
segmentation fault?

Original issue reported on code.google.com by baimus...@gmail.com on 8 Apr 2014 at 5:05

GoogleCodeExporter commented 9 years ago

The first thing to ask in such a case is: does your program have any memory 
allocation or thread race issues? Did you run the Valgrind tools Memcheck and 
Helgrind? If they report any errors, you need to fix them first.

Original comment by yuriv...@gmail.com on 10 Apr 2014 at 9:28

GoogleCodeExporter commented 9 years ago

The crash happened while trying to capture a backtrace. It could be caused by 
memory corruption in the program (as pointed out by user yurivict), but it could 
also be caused by imperfect backtrace-capturing code. Note that on amd64 the 
frame-pointer-based unwinder needs all code (including system libraries) to be 
built with -fno-omit-frame-pointer.

That you can access *sp doesn't mean that *(sp+1) is accessible. You can see 
that the instruction that caused the segfault, mov 0x8(%rcx),%r9, looks exactly 
like an instruction that reads *(<some variable> + 1).
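
To illustrate the point, here is a simplified sketch of a frame-pointer walk. It is not the gperftools code: WalkFrames, InFakeStack, g_stack and DemoDepth are hypothetical names, and the "stack" is a small array so the example cannot crash. Each frame record is assumed to be two words, sp[0] holding the caller's frame pointer and sp[1] the return address. A corrupted frame pointer can land on an address where sp[0] is readable but sp[1] is not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fake stack used only for this illustration (hypothetical, not gperftools).
static void* g_stack[6];

// Returns true if one full word at p lies inside the fake stack.
static bool InFakeStack(const void* p) {
  const auto a = reinterpret_cast<std::uintptr_t>(p);
  const auto lo = reinterpret_cast<std::uintptr_t>(&g_stack[0]);
  const auto hi = reinterpret_cast<std::uintptr_t>(&g_stack[0] + 6);
  return a >= lo && a + sizeof(void*) <= hi;
}

// Walks a chain of {caller frame pointer, return address} records.
// A real frame-pointer unwinder has no cheap 'readable' check, which is
// why a bad frame pointer can fault on the sp[1] read.
static int WalkFrames(void** sp, void** result, int max_depth,
                      bool (*readable)(const void*)) {
  assert(result != nullptr && max_depth >= 0);
  int n = 0;
  while (sp != nullptr && n < max_depth) {
    // Both words must be readable, not only sp[0].
    if (!readable(sp) || !readable(sp + 1)) break;
    if (sp[1] == nullptr) break;  // same kind of test as line 325
    result[n++] = sp[1];
    sp = static_cast<void**>(sp[0]);  // follow the saved frame pointer
  }
  return n;
}

// Builds two valid frames followed by a corrupted frame pointer that
// points at the last readable word, so sp[0] is fine but sp[1] is not.
int DemoDepth() {
  g_stack[0] = &g_stack[2];
  g_stack[1] = reinterpret_cast<void*>(std::uintptr_t{0x1001});
  g_stack[2] = &g_stack[5];  // corrupted: record would straddle the end
  g_stack[3] = reinterpret_cast<void*>(std::uintptr_t{0x1002});
  g_stack[4] = nullptr;
  g_stack[5] = nullptr;
  void* pcs[8];
  return WalkFrames(g_stack, pcs, 8, InFakeStack);
}
```

In this sketch DemoDepth() records two return addresses and then stops at the third record, because its second word falls outside the readable range. An unwinder without such a check would attempt the read and could fault there.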

Original comment by alkondratenko on 18 May 2014 at 6:11

GoogleCodeExporter commented 9 years ago

Forgot to note: it would be interesting if you could try libunwind-based stack 
trace capturing.

Original comment by alkondratenko on 18 May 2014 at 6:11

GoogleCodeExporter commented 9 years ago

Ping. Any news? Can you try a non-frame-pointer-based unwinder?

Original comment by alkondratenko on 28 Jun 2014 at 8:23

Segmentation fault in tcmalloc #618