qchbai / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License

Signal Raised in tcmalloc #408

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
It can be reproduced every time my program calls the tc_malloc API in my 
environment.

What is the expected output? What do you see instead?
No signal to be raised from tcmalloc.

What version of the product are you using? On what operating system?
tcmalloc: 1.2
system: 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 
GNU/Linux

Please provide any additional information below.
When my application starts, it receives a SIGSEGV signal, but the application 
doesn't exit; instead it hangs inside the signal handler.
So I attached to the application with "gdb attach -p <process-id>", and 
I got the stack trace below:
Thread 16 (Thread 0x453e6940 (LWP 12058)):
#0  0x00002abbed063376 in sys_futex () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#1  0x00002abbed06353a in base::internal::SpinLockDelay () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#2  0x00002abbed063850 in SpinLock::SlowLock () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#3  0x00002abbed053d1b in SpinLock::Lock () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#4  0x00002abbed058e07 in tcmalloc::CentralFreeList::RemoveRange () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#5  0x00002abbed05e7b5 in tcmalloc::ThreadCache::FetchFromCentralCache () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#6  0x00002abbed05531b in tcmalloc::ThreadCache::Allocate () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#7  0x00002abbed051fcd in (anonymous namespace)::do_malloc () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#8  0x00002abbed0527a3 in (anonymous namespace)::do_malloc_or_cpp_alloc () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#9  0x00002abbed065c91 in tc_malloc () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#10 0x0000003b9fe0cfa1 in _dl_signal_error () from /lib64/ld-linux-x86-64.so.2
#11 0x0000003b9fe0d134 in _dl_signal_cerror () from /lib64/ld-linux-x86-64.so.2
#12 0x0000003b9fe09beb in _dl_lookup_symbol_x () from 
/lib64/ld-linux-x86-64.so.2
#13 0x0000003ba0308d0d in call_dl_lookup () from /lib64/libc.so.6
#14 0x0000003b9fe0ce96 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#15 0x0000003ba030916c in do_sym () from /lib64/libc.so.6
#16 0x0000003ba0a01104 in dlsym_doit () from /lib64/libdl.so.2
#17 0x0000003b9fe0ce96 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#18 0x0000003ba0a0150d in _dlerror_run () from /lib64/libdl.so.2
#19 0x0000003ba0a010ba in dlsym () from /lib64/libdl.so.2
#20 0x00002abbea9a1676 in sskgds_save_text_start_end () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#21 0x00002abbea996c5f in skgdsinit () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#22 0x00002abbeb842c46 in kgdsdsts_extra () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#23 0x00002abbeb844af9 in kgdsdsts () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#24 0x00002abbebd277be in kpedbg_dmp_stack () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#25 0x00002abbebd27944 in kpeDbgCrash () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#26 0x00002abbebd27c9d in kpeDbgSignalHandler () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#27 0x00002abbeba8563e in skgesig_sigactionHandler () from 
/home/appSmutest/smu/lib/libclntsh.so.11.1
#28 <signal handler called>
#29 0x00002abbed058d56 in tcmalloc::CentralFreeList::FetchFromSpans () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#30 0x00002abbed058f01 in tcmalloc::CentralFreeList::RemoveRange () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#31 0x00002abbed05e7b5 in tcmalloc::ThreadCache::FetchFromCentralCache () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#32 0x00002abbed05531b in tcmalloc::ThreadCache::Allocate () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#33 0x00002abbed051fcd in (anonymous namespace)::do_malloc () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#34 0x00002abbed0527a3 in (anonymous namespace)::do_malloc_or_cpp_alloc () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#35 0x00002abbed065c91 in tc_malloc () from 
/home/appSmutest/smu/lib/libtcmalloc_minimal.so.4
#36 0x0000000000425939 in icg_user_new_node () at ../smucomm/smu_logic.c:6595
#37 0x0000000000423b77 in create_user_info (tel=0x453e5fd0 "15820654213", 
locnum=0x453e5fc0 "757", tmp_user_info=0x453e5f90, user_hash=0x28649800, 
    mutex=0x0, flag=1 '\001') at ../smucomm/smu_logic.c:5989
#38 0x00000000004236df in load_user_data_db (mod=0x1bfe8000, 
user_hash=0x28649800, fd=203, mutex=0x0) at ../smucomm/smu_logic.c:5882
#39 0x00000000004231a3 in full_load_data (mod=0x1bfe8000, list=0x28353860) at 
../smucomm/smu_logic.c:5784
#40 0x0000000000421e12 in load_db_file (mod=0x1bfe8000) at 
../smucomm/smu_logic.c:5429
#41 0x0000000000421bdc in oper_logic_loaddb_entry (arg=0x1bfe8000) at 
../smucomm/smu_logic.c:5380
#42 0x0000003ba0e064a7 in start_thread () from /lib64/libpthread.so.0
#43 0x0000003ba02d3c2d in clone () from /lib64/libc.so.6

The problem is much like "Issue 203: Signal Raised in tcmalloc (fetchfromspans 
method)", but my application is written in C, not in C++.

Some details about my tcmalloc build:
1. I compiled it with CXXFLAGS=-DTCMALLOC_LARGE_PAGES;
2. My application is linked with tcmalloc-minimal.

Original issue reported on code.google.com by qinqgq...@gmail.com on 1 Mar 2012 at 2:04

GoogleCodeExporter commented 9 years ago
I'm sorry, my tcmalloc version is 2.0, not 1.2.

Original comment by qinqgq...@gmail.com on 1 Mar 2012 at 2:08

GoogleCodeExporter commented 9 years ago
Can you please run your program from within GDB? I believe you will need to set 
an environment variable to tell tcmalloc it is running under gdb. I don't 
recall it offhand, but grep the src directory for "GDB". That should tell 
you what environment variable to set. Hopefully this way we can see who 
actually raised the SIGSEGV. Also, can you check the backtrace of all threads 
to see if a thread is missing?

-Dave 

Original comment by chapp...@gmail.com on 2 Mar 2012 at 6:59

GoogleCodeExporter commented 9 years ago
Thanks for your reply, Dave. I will give it a try as you suggest. By 
the way, I have checked all threads, and all of them are ok except thread 16.

Original comment by qinqgq...@gmail.com on 5 Mar 2012 at 1:46

GoogleCodeExporter commented 9 years ago
Taking a closer look at the trace above, it looks like a recursive call is being 
made into tcmalloc::CentralFreeList::RemoveRange (frames #30 and #4), which 
takes a lock backed by a futex. If memory serves me correctly, such locks are 
not recursive. I'm not sure there is much we can do about this one, since the 
culprit here is really the 'kpe*' signal handler code.
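
To illustrate why re-entering a non-recursive lock from the same thread hangs, here is a minimal sketch in C. It is not tcmalloc's SpinLock; it is a toy test-and-set lock built on C11 atomic_flag, and the names demo_try_lock, demo_unlock and second_acquire_would_block are made up for this example. It uses a try-lock so that the second acquisition reports failure instead of blocking forever, as a blocking lock would in the trace above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive lock (hypothetical, for illustration only). */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
bool demo_try_lock(void)
{
    return !atomic_flag_test_and_set(&demo_lock);
}

void demo_unlock(void)
{
    atomic_flag_clear(&demo_lock);
}

/*
 * Takes the lock, then tries to take it again from the same thread,
 * the way a signal handler calling malloc would while the interrupted
 * code still holds the allocator lock.
 * Returns true if the second attempt fails, i.e. a blocking lock
 * would wait forever at that point.
 */
bool second_acquire_would_block(void)
{
    bool first = demo_try_lock();
    bool second = demo_try_lock();

    if (first) {
        demo_unlock();
    }
    return first && !second;
}
```

The lock has no notion of an owner, so it cannot tell that the waiter and the holder are the same thread. The holder can only release the lock after the handler returns, and the handler is waiting on the lock, so neither makes progress.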

Also, you may find it easier to use gcore to get a core of the hung process and 
then use gdb to inspect the core.

Original comment by chapp...@gmail.com on 23 Dec 2012 at 3:16

GoogleCodeExporter commented 9 years ago
You're calling dlsym from inside a signal handler. Because dlsym calls malloc 
internally, it is not async-signal-safe. So there's nothing to fix in tcmalloc, 
and you may get exactly the same issue with any other malloc.

Original comment by alkondratenko on 29 Aug 2013 at 4:48