cynthia / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

deadlock when heap profiling #379

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. We have an application with many threads that deadlocks promptly after 
startup

What is the expected output? What do you see instead?
I expect no deadlock :)

What version of the product are you using? On what operating system?

[root@myhost]# uname -a
Linux myhost 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 
x86_64 GNU/Linux

RHEL 5.6
google perf tools 1.8.3
libunwind-0.99-beta

running the application as follows:

env HEAP_PROFILE_ALLOCATION_INTERVAL=100000000 
LD_PRELOAD=/usr/local/lib/libtcmalloc.so HEAPPROFILE=/d2/tmp/google/myapp  myapp

Please provide any additional information below.

I notice that most of the threads in our application end up being stuck in 
sys_futex trying to allocate or free memory. Here are some of those threads 
from a gcore

#0  0x00002b5dfaa8fb3c in sys_futex (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/linux_syscall_support.h:2269
#1  base::internal::SpinLockDelay (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/spinlock_linux-inl.h:86
#2  0x00002b5dfaa8fe8c in SpinLock::SlowLock (this=0x2b5dfad48280) at 
src/base/spinlock.cc:132
#3  0x00002b5dfaa8ccc3 in Lock (ptr=0x1a922c00) at src/base/spinlock.h:75
#4  SpinLockHolder (ptr=0x1a922c00) at src/base/spinlock.h:141
#5  RecordFree (ptr=0x1a922c00) at src/heap-profiler.cc:336
#6  DeleteHook (ptr=0x1a922c00) at src/heap-profiler.cc:354
#7  0x00002b5dfaa858f8 in MallocHook::InvokeDeleteHookSlow (p=0x1a922c00) at 
src/malloc_hook.cc:537
#8  0x00002b5dfaa9317a in InvokeDeleteHook (p=0x1a922c00) at 
src/malloc_hook-inl.h:176
#9  tc_delete (p=0x1a922c00) at src/tcmalloc.cc:1480
#10 0x00002b5e01af872c in __gnu_cxx::new_allocator<char>::deallocate 
(this=0x7fff53c708c7, __p=0x1a922c00 "@")
    at /tmp/buildutils.x86/x86-x86_64-pc-linux-gnu/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:95
#11 0x00002b5e01af7d7f in std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::_Rep::_M_destroy (this=0x1a922c00, __a=...)
    at /tmp/buildutils.x86/x86-x86_64-pc-linux-gnu/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:449
#12 0x00002b5e01af7d30 in std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::_Rep::_M_dispose (this=0x1a922c00, __a=...)
    at /tmp/buildutils.x86/x86-x86_64-pc-linux-gnu/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:237
#13 0x00002b5e01af375b in std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::~basic_string (this=0x7fff53c70970, 
    __in_chrg=<value optimized out>) at /tmp/buildutils.x86/x86-x86_64-pc-linux-gnu/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:525

#0  0x00002b5dfaa8fb3c in sys_futex (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/linux_syscall_support.h:2269
#1  base::internal::SpinLockDelay (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/spinlock_linux-inl.h:86
#2  0x00002b5dfaa8fe8c in SpinLock::SlowLock (this=0x2b5dfad48280) at 
src/base/spinlock.cc:132
#3  0x00002b5dfaa8cd73 in Lock (ptr=0x1ad8ac00, size=24) at 
src/base/spinlock.h:75
#4  SpinLockHolder (ptr=0x1ad8ac00, size=24) at src/base/spinlock.h:141
#5  RecordAlloc (ptr=0x1ad8ac00, size=24) at src/heap-profiler.cc:327
#6  NewHook (ptr=0x1ad8ac00, size=24) at src/heap-profiler.cc:349
#7  0x00002b5dfaa85eae in MallocHook::InvokeNewHookSlow (p=0x1ad8ac00, s=24) at 
src/malloc_hook.cc:533
#8  0x00002b5dfaa9256e in InvokeNewHook (size=24) at src/malloc_hook-inl.h:161
#9  tc_new (size=24) at src/tcmalloc.cc:1469
#10 0x000000000101ebb7 in allocate (signo=17, info=<value optimized out>, 
notused=<value optimized out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/ext/new_allocator.h:89
#11 _M_get_node (signo=17, info=<value optimized out>, notused=<value optimized 
out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:316
#12 _M_create_node<unsigned long> (signo=17, info=<value optimized out>, 
notused=<value optimized out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:478
#13 _M_insert<unsigned long> (signo=17, info=<value optimized out>, 
notused=<value optimized out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:1442
#14 push_back (signo=17, info=<value optimized out>, notused=<value optimized 
out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:926
#15 sig_handler_kevent (signo=17, info=<value optimized out>, notused=<value 
optimized out>) at osiEventEpoll.cpp:1403
#16 <signal handler called>
#17 0x00002b5e020236d0 in sigprocmask () from /lib64/libc.so.6
#18 0x00002b5e02350f24 in put_rs_cache (c=0x5a378940) at dwarf/Gparser.c:525
#19 _ULx86_64_dwarf_find_save_locs (c=0x5a378940) at dwarf/Gparser.c:852
#20 0x00002b5e02351119 in _ULx86_64_dwarf_step (c=0x2) at dwarf/Gstep.c:35
#21 0x00002b5e0235311a in _ULx86_64_step (cursor=0x2) at x86_64/Gstep.c:42
#22 0x00002b5dfaa8f96a in GetStackTrace (result=0x5a379140, max_depth=42, 
skip_count=<value optimized out>) at src/stacktrace_libunwind-inl.h:114
#23 0x00002b5dfaa86176 in MallocHook_GetCallerStackTrace (result=0x5a3792d0, 
max_depth=32, skip_count=<value optimized out>)
    at src/malloc_hook.cc:673
#24 0x00002b5dfaa89f34 in GetCallerStackTrace (this=0x2b5e032f7040, 
ptr=0x1ab6f380, bytes=48, skip_count=-1) at ./src/google/malloc_hook.h:262
#25 HeapProfileTable::RecordAlloc (this=0x2b5e032f7040, ptr=0x1ab6f380, 
bytes=48, skip_count=-1) at src/heap-profile-table.cc:196
#26 0x00002b5dfaa8cd97 in RecordAlloc (ptr=0x1ab6f380, size=48) at 
src/heap-profiler.cc:329
#27 NewHook (ptr=0x1ab6f380, size=48) at src/heap-profiler.cc:349
#28 0x00002b5dfaa85eae in MallocHook::InvokeNewHookSlow (p=0x1ab6f380, s=48) at 
src/malloc_hook.cc:533
#29 0x00002b5dfaa927f1 in InvokeNewHook (size=48) at src/malloc_hook-inl.h:161
#30 tc_malloc (size=48) at src/tcmalloc.cc:1426
#31 0x00002b5dfc6ae9aa in soap_malloc (soap=0x5a3796b0, n=32) at 
stdsoap2.cpp:5539

#0  0x00002b5dfaa8fb3c in sys_futex (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/linux_syscall_support.h:2269
#1  base::internal::SpinLockDelay (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/spinlock_linux-inl.h:86
#2  0x00002b5dfaa8fe8c in SpinLock::SlowLock (this=0x2b5dfad48280) at 
src/base/spinlock.cc:132
#3  0x00002b5dfaa8cd73 in Lock (ptr=0x1ae06040, size=32) at 
src/base/spinlock.h:75
#4  SpinLockHolder (ptr=0x1ae06040, size=32) at src/base/spinlock.h:141
#5  RecordAlloc (ptr=0x1ae06040, size=32) at src/heap-profiler.cc:327
#6  NewHook (ptr=0x1ae06040, size=32) at src/heap-profiler.cc:349
#7  0x00002b5dfaa85eae in MallocHook::InvokeNewHookSlow (p=0x1ae06040, s=32) at 
src/malloc_hook.cc:533
#8  0x00002b5dfaa927f1 in InvokeNewHook (size=32) at src/malloc_hook-inl.h:161
#9  tc_malloc (size=32) at src/tcmalloc.cc:1426
#10 0x00002b5dfc6af636 in soap_link (soap=0x599786b0, p=0x0, t=29, n=-1, 
fdelete=0x2b5dfc6c2850 <soap_fdelete(soap_clist*)>) at stdsoap2.cpp:5766

#0  0x00002b5dfaa8fb3c in sys_futex (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/linux_syscall_support.h:2269
#1  base::internal::SpinLockDelay (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/spinlock_linux-inl.h:86
#2  0x00002b5dfaa8fe8c in SpinLock::SlowLock (this=0x2b5dfad48280) at 
src/base/spinlock.cc:132
#3  0x00002b5dfaa8cd73 in Lock (ptr=0x1ada5d40, size=144) at 
src/base/spinlock.h:75
#4  SpinLockHolder (ptr=0x1ada5d40, size=144) at src/base/spinlock.h:141
#5  RecordAlloc (ptr=0x1ada5d40, size=144) at src/heap-profiler.cc:327
#6  NewHook (ptr=0x1ada5d40, size=144) at src/heap-profiler.cc:349
#7  0x00002b5dfaa85eae in MallocHook::InvokeNewHookSlow (p=0x1ada5d40, s=144) 
at src/malloc_hook.cc:533
#8  0x00002b5dfaa9256e in InvokeNewHook (size=144) at src/malloc_hook-inl.h:161
#9  tc_new (size=144) at src/tcmalloc.cc:1469
#10 0x00002b5dfc21550a in allocate (this=0x58f928b0, k=...)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/ext/new_allocator.h:89
#11 construct_preamble (this=0x58f928b0, k=...) at 
/vobs/fw/contrib/opensource/boost/boost/unordered/detail/util.hpp:319
#12 
boost::unordered_detail::hash_node_constructor<std::allocator<std::pair<boost::a
sio::ip::basic_endpoint<boost::asio::ip::udp> const, unsigned long> >, 
boost::unordered_detail::ungrouped>::construct_pair<boost::asio::ip::basic_endpo
int<boost::asio::ip::udp>, unsigned long> (this=0x58f928b0, 
    k=...) at /vobs/fw/contrib/opensource/boost/boost/unordered/detail/util.hpp:267

#0  0x00002b5dfaa8fb3c in sys_futex (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/linux_syscall_support.h:2269
#1  base::internal::SpinLockDelay (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/spinlock_linux-inl.h:86
#2  0x00002b5dfaa8fe8c in SpinLock::SlowLock (this=0x2b5dfad48280) at 
src/base/spinlock.cc:132
#3  0x00002b5dfaa8cd73 in Lock (ptr=0x1b0b64e0, size=88) at 
src/base/spinlock.h:75
#4  SpinLockHolder (ptr=0x1b0b64e0, size=88) at src/base/spinlock.h:141
#5  RecordAlloc (ptr=0x1b0b64e0, size=88) at src/heap-profiler.cc:327
#6  NewHook (ptr=0x1b0b64e0, size=88) at src/heap-profiler.cc:349
#7  0x00002b5dfaa85eae in MallocHook::InvokeNewHookSlow (p=0x1b0b64e0, s=88) at 
src/malloc_hook.cc:533
#8  0x00002b5dfaa9256e in InvokeNewHook (size=88) at src/malloc_hook-inl.h:161
#9  tc_new (size=88) at src/tcmalloc.cc:1469

#0  0x00002b5dfaa8fb3c in sys_futex (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/linux_syscall_support.h:2269
#1  base::internal::SpinLockDelay (w=0x2b5dfad48280, value=2, loop=<value 
optimized out>) at ./src/base/spinlock_linux-inl.h:86
#2  0x00002b5dfaa8fe8c in SpinLock::SlowLock (this=0x2b5dfad48280) at 
src/base/spinlock.cc:132
#3  0x00002b5dfaa8ccc3 in Lock (ptr=0x18de9680) at src/base/spinlock.h:75
#4  SpinLockHolder (ptr=0x18de9680) at src/base/spinlock.h:141
#5  RecordFree (ptr=0x18de9680) at src/heap-profiler.cc:336
#6  DeleteHook (ptr=0x18de9680) at src/heap-profiler.cc:354
#7  0x00002b5dfaa858f8 in MallocHook::InvokeDeleteHookSlow (p=0x18de9680) at 
src/malloc_hook.cc:537
#8  0x00002b5dfaa9317a in InvokeDeleteHook (p=0x18de9680) at 
src/malloc_hook-inl.h:176
#9  tc_delete (p=0x18de9680) at src/tcmalloc.cc:1480

etc... etc...

I have ran gcore a few times to get multiple dumps and all those threads are 
stuck in that spot.

I am wondering if I am running into the problems stated in the README and 
INSTALL files with regards to x86_64 and this particular version of glibc. I 
think I am, but looking through the previous bug reports I didn't see any of 
the same backtraces as I see here.

I also note that if I run without heap profiling enabled that everything works 
perfectly fine.

Original issue reported on code.google.com by iwor...@gmail.com on 10 Nov 2011 at 2:27

GoogleCodeExporter commented 9 years ago
I suppose it helps if I tell you what version of glibc I have :)

[root@myhost]# yum list | grep libc
glibc.i686                        2.5-65                 installed              
glibc.x86_64                      2.5-65                 installed              
glibc-common.x86_64               2.5-65                 installed              
glibc-devel.x86_64                2.5-65                 installed              
glibc-headers.x86_64              2.5-65                 installed              
libcap.i386                       1.10-26                installed              
libcap.x86_64                     1.10-26                installed              
compat-glibc.i386                 1:2.3.4-2.26           rhel-disc              
compat-glibc.x86_64               1:2.3.4-2.26           rhel-disc              
compat-glibc-headers.x86_64       1:2.3.4-2.26           rhel-disc              
compat-libcom_err.i386            1.0-7                  rhel-disc              
compat-libcom_err.x86_64          1.0-7                  rhel-disc              
glibc-devel.i386                  2.5-65                 rhel-x86_64-server-5   
glibc-utils.x86_64                2.5-65                 rhel-x86_64-server-5   
libc-client.i386                  2004g-2.2.1            rhel-disc              
libc-client.x86_64                2004g-2.2.1            rhel-disc              
libc-client-devel.i386            2004g-2.2.1            rhel-disc              
libc-client-devel.x86_64          2004g-2.2.1            rhel-disc              
libcap-devel.i386                 1.10-26                rhel-disc              
libcap-devel.x86_64               1.10-26                rhel-disc              
libchewing.i386                   0.3.0-8.el5            rhel-disc              
libchewing.x86_64                 0.3.0-8.el5            rhel-disc              
libchewing-devel.i386             0.3.0-8.el5            rhel-disc              
libchewing-devel.x86_64           0.3.0-8.el5            rhel-disc              
libcmpiutil.i386                  0.5.1-1.el5            rhel-x86_64-server-vt-5
libcmpiutil.x86_64                0.5.1-1.el5            rhel-x86_64-server-vt-5
libcmpiutil-devel.i386            0.5.1-1.el5            rhel-x86_64-server-vt-5
libcmpiutil-devel.x86_64          0.5.1-1.el5            rhel-x86_64-server-vt-5
libcollection.i386                0.6.0-10.el5           rhel-x86_64-server-5   
libcollection.x86_64              0.6.0-10.el5           rhel-x86_64-server-5   
libcollection-devel.i386          0.6.0-10.el5           rhel-x86_64-server-5   
libcollection-devel.x86_64        0.6.0-10.el5           rhel-x86_64-server-5   
libcroco.i386                     0.6.1-2.1              rhel-disc              
libcroco.x86_64                   0.6.1-2.1              rhel-disc              
libcroco-devel.i386               0.6.1-2.1              rhel-disc              
libcroco-devel.x86_64             0.6.1-2.1              rhel-disc              
libcxgb3.i386                     1.2.5-2.el5            rhel-disc              
libcxgb3.x86_64                   1.2.5-2.el5            rhel-disc              
libcxgb3-devel.i386               1.0-6.el5_1.1          rhel-x86_64-server-5   
libcxgb3-devel.x86_64             1.0-6.el5_1.1          rhel-x86_64-server-5   
libcxgb3-static.x86_64            1.2.5-2.el5            rhel-disc              
libcxgb4.i386                     1.1.1-2.el5            rhel-x86_64-server-5   
libcxgb4.x86_64                   1.1.1-2.el5            rhel-x86_64-server-5   
libcxgb4-static.x86_64            1.1.1-2.el5            rhel-x86_64-server-5   
[root@myhost]# 

Original comment by iwor...@gmail.com on 10 Nov 2011 at 2:43

GoogleCodeExporter commented 9 years ago
They're all blocked in sys_futex because they're all waiting on the same lock 
-- so that part is expected.  If you allocate/deallocate a lot, it wouldn't be 
100% surprising to occassionally find all-but-one thread waiting in sys_futex, 
while the last thread actually makes some progress.  But here, I'm guessing all 
the threads are stuck in sys_futex.

The problem I see, in the thread backtraces you attached (I take it that's only 
a subset?), is this one:
---

#10 0x000000000101ebb7 in allocate (signo=17, info=<value optimized out>, 
notused=<value optimized out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/ext/new_allocator.h:89
#11 _M_get_node (signo=17, info=<value optimized out>, notused=<value optimized 
out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:316
#12 _M_create_node<unsigned long> (signo=17, info=<value optimized out>, 
notused=<value optimized out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:478
#13 _M_insert<unsigned long> (signo=17, info=<value optimized out>, 
notused=<value optimized out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:1442
#14 push_back (signo=17, info=<value optimized out>, notused=<value optimized 
out>)
    at /vobs/fw-linux-tools/host/x86_64-pc-linux-gnu/bin/../lib/gcc/x86_64-pc-linux-gnu/4.5.1/../../../../x86_64-pc-linux-gnu/include/c++/4.5.1/bits/stl_list.h:926
#15 sig_handler_kevent (signo=17, info=<value optimized out>, notused=<value 
optimized out>) at osiEventEpoll.cpp:1403
#16 <signal handler called>
#17 0x00002b5e020236d0 in sigprocmask () from /lib64/libc.so.6
#18 0x00002b5e02350f24 in put_rs_cache (c=0x5a378940) at dwarf/Gparser.c:525
#19 _ULx86_64_dwarf_find_save_locs (c=0x5a378940) at dwarf/Gparser.c:852
#20 0x00002b5e02351119 in _ULx86_64_dwarf_step (c=0x2) at dwarf/Gstep.c:35
#21 0x00002b5e0235311a in _ULx86_64_step (cursor=0x2) at x86_64/Gstep.c:42
---
This is the libunwind code, doing allocations in a signal handler.  That is not 
allowed in C (or C++), and would explain the deadlock.

I am almost positive this is an instance of issue 66, but won't mark it as so 
until we confirm it.

There are two possible solutions:
1) Try running with a version of preftools configured with 
--enable-frame-pointers rather than libunwind (at least, just to confirm that 
libunwind is the problem).

2) Try using the latest version of libunwind (1.0.1?) -- it has a different set 
of problems than libunwind 0.99, that may not affect you.  The latest svn-trunk 
INSTALL file talks about this a bit more:
   http://google-perftools.googlecode.com/svn/trunk/INSTALL

Original comment by csilv...@gmail.com on 10 Nov 2011 at 5:45

GoogleCodeExporter commented 9 years ago
You are correct that this is a subset. There are 59 threads and I was lazy. I 
can provide them all if you feel it will be useful.

In the meantime, I will try your two suggestions and get back to you within a 
few hours.

Original comment by iwor...@gmail.com on 10 Nov 2011 at 6:03

GoogleCodeExporter commented 9 years ago
I should add that I looked at issue 66 as well. I tried most of the stuff 
suggested in there, but will give the suggestions another try and will itemize 
them here as I go.

Original comment by iwor...@gmail.com on 10 Nov 2011 at 6:06

GoogleCodeExporter commented 9 years ago
Steps taken:

make uninstall libunwind
make uninstall google-perftools
configure google-perftools with ./configure --enable-frame-pointers
make install google-perftools
start myapp

The below is 'sanitized' output to keep my corporate overlords happy. If you 
need any clarification on paths or names please ask. It is possible I may have 
made an error during sanitizing :(

I see that we are apparently dumping to the heap files (and apparently 
allocating and freeing LOTS of memory?). I also see a few complaints about 'no 
version information available'.

Nov 10 13:22:18 myhost myapp_out: /usr/local/bin/myapp: 
/usr/local/lib64/libstdc++.so.6: no version information available (required by 
/usr/local/lib/libtcmalloc.so)
Nov 10 13:22:18 myhost myapp_out: /usr/local/bin/myapp: 
/usr/local/lib64/libstdc++.so.6: no version information available (required by 
/usr/local/lib/libtcmalloc.so)
Nov 10 13:22:22 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0001.heap (95 MB allocated cumulatively, 75 MB currently 
in use)
Nov 10 13:22:22 myhost myapp_out: mkdir: /usr/local/lib64/libstdc++.so.6: no 
version information available (required by usr/local/lib/libtcmalloc.so)
Nov 10 13:22:22 myhost myapp_out: mkdir: /usr/local/lib64/libstdc++.so.6: no 
version information available (required by usr/local/lib/libtcmalloc.so)
Nov 10 13:22:22 myhost myapp_out: Starting tracking the heap
Nov 10 13:22:22 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp_10523.0001.heap (Exiting)
Nov 10 13:22:24 myhost myapp_out: sh: /usr/local/lib64/libstdc++.so.6: no 
version information available (required by /usr/local/lib/libtcmalloc.so)
Nov 10 13:22:24 myhost myapp_out: sh: /usr/local/lib64/libstdc++.so.6: no 
version information available (required by /usr/local/lib/libtcmalloc.so)
Nov 10 13:22:24 myhost myapp_out: Starting tracking the heap
Nov 10 13:22:24 myhost myapp_out: cp: /usr/local/lib64/libstdc++.so.6: no 
version information available (required by /usr/local/lib/libtcmalloc.so)
Nov 10 13:22:24 myhost myapp_out: cp: /usr/local/lib64/libstdc++.so.6: no 
version information available (required by /usr/local/lib/libtcmalloc.so)
Nov 10 13:22:24 myhost myapp_out: Starting tracking the heap
Nov 10 13:22:24 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp_10724.0001.heap (Exiting)
Nov 10 13:22:27 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0002.heap (190 MB allocated cumulatively, 84 MB currently 
in use)
Nov 10 13:22:32 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0003.heap (286 MB allocated cumulatively, 83 MB currently 
in use)
Nov 10 13:22:38 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0004.heap (381 MB allocated cumulatively, 84 MB currently 
in use)
Nov 10 13:22:44 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0005.heap (476 MB allocated cumulatively, 83 MB currently 
in use)
Nov 10 13:22:50 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0006.heap (572 MB allocated cumulatively, 84 MB currently 
in use)
Nov 10 13:22:56 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0007.heap (667 MB allocated cumulatively, 84 MB currently 
in use)
Nov 10 13:23:02 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0008.heap (762 MB allocated cumulatively, 84 MB currently 
in use)
Nov 10 13:23:08 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0009.heap (858 MB allocated cumulatively, 84 MB currently 
in use)
Nov 10 13:23:14 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0010.heap (953 MB allocated cumulatively, 84 MB currently 
in use)
...
Nov 10 13:42:44 myhost myapp_out: Dumping heap profile to 
/d2/tmp/google/myapp.0211.heap (20122 MB allocated cumulatively, 85 MB 
currently in use)

run pprof as follows

[root@myhost ~]$ pprof --dot --base /d2/tmp/google/myapp.0001.heap 
/usr/local/bin/myapp /d2/tmp/google/myapp.0011.heap > ~/myapp.dot
Using local file /usr/local/bin/myapp.
Using local file /d2/tmp/google/myapp.heap.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
... <about 200 or so of these>
Dropping nodes with <= 0.0 MB; edges with <= 0.0 abs(MB)

When I look at the resulting dot file in gvedit it claims that there is only 
about 9.1MB allocated :(

HOWEVER, it appears that my application is now running properly. I guess that 
validates the theory that libunwind is the culprit.

I will now try with the newest version of libunwind after perusing the latest 
INSTALL file. I will also do some googling of the Dwarf Error stuff.

Original comment by iwor...@gmail.com on 10 Nov 2011 at 6:47

GoogleCodeExporter commented 9 years ago
My application deadlocks with libunwind 1.0.1

No luck there :(

Original comment by iwor...@gmail.com on 10 Nov 2011 at 7:10

GoogleCodeExporter commented 9 years ago
I'm sorry libunwind is giving you such trouble.  Unfortunately, I don't know of 
a good solution for that.  Did you add in the function calls in main() to 
pre-cache some libunwind data, like the newsgroup post suggested?   I had hoped 
that would be enough, but I'm not totally surprised it isn't.

} BFD: Dwarf Error: mangled line number section.

Weird, I have no idea what is causing this.  But for now I'm going with the 
theory that this is responsible for the weird dot output...

Original comment by csilv...@gmail.com on 10 Nov 2011 at 7:31

GoogleCodeExporter commented 9 years ago
I suspect the Dwarf Error is the cause of the weird dot output as well.

I am working on building our product now with the changes outlined at 
http://groups.google.com/group/google-perftools/msg/2686d9f24ac4365f

Original comment by iwor...@gmail.com on 10 Nov 2011 at 7:54

GoogleCodeExporter commented 9 years ago
I believe I have now gotten our product built with -fno-omit-frame-pointer

Unfortunately I still get the following when running pprof:

BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
BFD: Dwarf Error: mangled line number section.
...

Further, the emitted dot file claims a very small number of allocations even 
though the heap dumping notifications state more was allocated.

Tomorrow I will try to verify I built everything with -fno-omit-frame-pointer 
properly and continue on with some of the other suggestions outlined here.

Original comment by iwor...@gmail.com on 10 Nov 2011 at 11:14

GoogleCodeExporter commented 9 years ago
I have to put this on hold for a few days. I hope to return to it later this 
week.

Original comment by iwor...@gmail.com on 14 Nov 2011 at 7:25

GoogleCodeExporter commented 9 years ago
Ping -- I gave you a bit more time than a week. :-)  Any more news on this?

Original comment by csilv...@gmail.com on 25 Jan 2012 at 11:54

GoogleCodeExporter commented 9 years ago
hehe. You are correct. I was given more than a week :)

Unfortunately, we haven't made much progress. The project got put on the 
backburner a little bit. I expect to pick it up again when we migrate to a 
newer version of linux.

For the time being I think we can close this. If we have issues with the new 
release we will raise them then.

Original comment by iwor...@gmail.com on 27 Jan 2012 at 4:09

GoogleCodeExporter commented 9 years ago
OK, closing for now.  Feel free to reopen if necessary.

Original comment by csilv...@gmail.com on 31 Jan 2012 at 5:55