qchbai / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

DumpProfileLocked -> sscanf -> realloc deadlock #423

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I get this deadlock with glibc-2.14.90-24.fc16.6.x86_64 and gperftools 2.0:
LD_PRELOAD=libtcmalloc.so.4 HEAPPROFILE=foo.hprof my_program

The offending glibc code is:
#5  0x00000036a0c5e3b7 in _IO_vfscanf_internal (s=<optimized out>, 
format=<optimized out>, argptr=0x7fff3b590168, errp=0x0) at vfscanf.c:1389
1389              ADDW (c);
(gdb) list
1384          if (width != 0 && c == L_('0'))
1385            {
1386              if (width > 0)
1387            --width;
1388    
1389              ADDW (c);
1390              c = inchar ();
1391    
1392              if (width != 0 && TOLOWER (c) == L_('x'))
1393            {

#0  0x00007fea5a57a507 in base::internal::SpinLockDelay(int volatile*, int, 
int) () from lib/libtcmalloc.so.4
#1  0x00007fea5a57a3b2 in SpinLock::SlowLock() () from lib/libtcmalloc.so.4
#2  0x00007fea5a575f2a in RecordAlloc () from lib/libtcmalloc.so.4
#3  0x00007fea5a57115a in MallocHook::InvokeNewHookSlow(void const*, unsigned 
long) () from lib/libtcmalloc.so.4
#4  0x00007fea5a57e69b in tc_realloc () from lib/libtcmalloc.so.4
#5  0x00000036a0c5e3b7 in _IO_vfscanf_internal (s=<optimized out>, 
    format=<optimized out>, argptr=0x7fff7e918838, errp=0x0) at vfscanf.c:1389
#6  0x00000036a0c70bd0 in _IO_vsscanf (
    string=0x7fff7e918a90 "00400000-00504000 r-xp 00000000 00:22 5251826", ' ' <repeats 28 times>, "/home/luto/ama/master2/build/bin/fxntrader", 
    format=0x7fea5a587492 "%lx-%lx %4s %lx %x:%x %ld %n", args=0x7fff7e918838)
    at iovsscanf.c:45
#7  0x00000036a0c6aff7 in __sscanf (s=<optimized out>, format=<optimized out>)
    at sscanf.c:34
#8  0x00007fea5a57b250 in ProcMapsIterator::NextExt(unsigned long*, unsigned 
long*, char**, unsigned long*, long*, char**, unsigned long*, unsigned long*, 
unsigned long*, unsigned long*, unsigned long*) () from lib/libtcmalloc.so.4
#9  0x00007fea5a57b56f in ProcMapsIterator::Next(unsigned long*, unsigned 
long*, char**, unsigned long*, long*, char**) () from lib/libtcmalloc.so.4
#10 0x00007fea5a57b79a in tcmalloc::FillProcSelfMaps(char*, int, bool*) () from 
lib/libtcmalloc.so.4
#11 0x00007fea5a57421d in HeapProfileTable::FillOrderedProfile(char*, int) 
const () from lib/libtcmalloc.so.4
#12 0x00007fea5a575a12 in DoGetHeapProfileLocked () from lib/libtcmalloc.so.4
#13 0x00007fea5a575c00 in DumpProfileLocked(char const*) ()
   from lib/libtcmalloc.so.4
#14 0x00007fea5a575d9c in MaybeDumpProfileLocked () from lib/libtcmalloc.so.4
#15 0x00007fea5a575fa6 in RecordAlloc () from lib/libtcmalloc.so.4
#16 0x00007fea5a57115a in MallocHook::InvokeNewHookSlow(void const*, unsigned 
long) () from lib/libtcmalloc.so.4
#17 0x00007fea5a57f580 in tc_new () from lib/libtcmalloc.so.4

This is silly (why is glibc allocating memory?).  Nonetheless, it prevents the 
profiler from working.

Would this all be more robust if the profile write that was triggered from the 
allocator ran in its own thread?

Original issue reported on code.google.com by aml...@gmail.com on 6 Apr 2012 at 9:27

GoogleCodeExporter commented 9 years ago
Interesting find. I wonder if making the lock recursive would be safe? 

Original comment by chapp...@gmail.com on 3 May 2012 at 3:02

GoogleCodeExporter commented 9 years ago
I see you mention that you are using glibc-2.14.90 but can you please double 
check? In speaking with the gnu libc guys, version 2.15 introduced a bug in 
_IO_vfscanf_internal where alloca is avoided altogether and realloc is called 
instead. The intent was for realloc to only be  called for large buffers. In 
2.16 this is fixed such that alloca is still used for small buffers.

Alternatively, perhaps there is an entry in your 'maps' file (/proc/<pid>/maps) 
that is longer than what is normally encountered thereby forcing 
_IO_vfscanf_internal to fall back on realloc.

Original comment by chapp...@gmail.com on 15 May 2012 at 5:31

GoogleCodeExporter commented 9 years ago
Unless Fedora backported that bug, it's 2.14.  I do have fairly long maps, 
though -- the filenames are 100 characters long or so.  They might have 
exceeded 128 bytes when I hit the deadlock.

Original comment by aml...@gmail.com on 1 Jun 2012 at 9:12

GoogleCodeExporter commented 9 years ago
The longest line in /proc/self/maps is 177 bytes, including the trailing 
newline.

Original comment by aml...@gmail.com on 1 Jun 2012 at 9:16

GoogleCodeExporter commented 9 years ago
I'm hitting this bug on Ubuntu Precise with glibc 2.15-0ubuntu10.

The deadlocked stack trace looks like:

Thread 4 (Thread 0x7fd1d005f700 (LWP 1018)):
#0  0x00007fd212762277 in sys_futex (t=0x7fd1d005c840, v=2, o=128, 
a=0x7fd2129994c0) at ./src/base/linux_syscall_support.h:1789
#1  base::internal::SpinLockDelay (w=0x7fd2129994c0, value=2, loop=<optimized 
out>) at ./src/base/spinlock_linux-inl.h:87
#2  0x00007fd2127611a2 in SpinLock::SlowLock (this=0x7fd2129994c0) at 
src/base/spinlock.cc:132
#3  0x00007fd21275cf4a in Lock (this=0x7fd2129994c0) at src/base/spinlock.h:75
#4  SpinLockHolder (l=0x7fd2129994c0, this=<synthetic pointer>) at 
src/base/spinlock.h:141
#5  RecordAlloc (ptr=0x18e1300, bytes=256, skip_count=0) at 
src/heap-profiler.cc:304
#6  0x00007fd21275817a in MallocHook::InvokeNewHookSlow (p=0x18e1300, s=256) at 
src/malloc_hook.cc:525
#7  0x00007fd2127656cb in InvokeNewHook (s=256, p=0x18e1300) at 
src/malloc_hook-inl.h:161
#8  do_realloc_with_callback (invalid_get_size_fn=<optimized out>, 
invalid_free_fn=<optimized out>, new_size=<optimized out>, old_ptr=<optimized 
out>) at src/tcmalloc.cc:1221
#9  do_realloc (new_size=<optimized out>, old_ptr=<optimized out>) at 
src/tcmalloc.cc:1228
#10 tc_realloc (old_ptr=<optimized out>, new_size=256) at src/tcmalloc.cc:1526
#11 0x00007fd21199c66f in _IO_vfscanf_internal (s=<optimized out>, 
format=<optimized out>, argptr=0x7fd1d005cf68, errp=0x0) at vfscanf.c:1413
#12 0x00007fd2119b5360 in _IO_vsscanf (string=0x7fd1d005d1c0 "00400000-0047a000 
r-xp 00000000 fc:00 9965050", ' ' <repeats 28 times>, 
"/home/nelhage/code/codesearch/codesearch", format=0x7fd21276df30 "%lx-%lx %4s 
%lx %x:%x %ld %n", args=0x7fd1d005cf68) at iovsscanf.c:45
#13 0x00007fd2119a0317 in __sscanf (s=<optimized out>, format=<optimized out>) 
at sscanf.c:34
#14 0x00007fd2127609ab in ProcMapsIterator::NextExt (this=0x7fd1d005e5c0, 
start=0x7fd1d005d190, end=0x7fd1d005d198, flags=0x7fd1d005d1b0, 
offset=0x7fd1d005d1a0, inode=0x7fd1d005d1a8, filename=0x7fd1d005d1b8, 
file_mapping=0x0, file_pages=0x0, anon_mapping=0x0, anon_pages=0x0, dev=0x0) at 
src/base/sysinfo.cc:721
#15 0x00007fd212760cbf in ProcMapsIterator::Next (this=<optimized out>, 
start=<optimized out>, end=<optimized out>, flags=<optimized out>, 
offset=<optimized out>, inode=<optimized out>, filename=0x7fd1d005d1b8) at 
src/base/sysinfo.cc:661
#16 0x00007fd212760eea in tcmalloc::FillProcSelfMaps (buf=0x7fd212c1e033 "", 
size=1048556, wrote_all=0x7fd1d005e6cf) at src/base/sysinfo.cc:952
#17 0x00007fd21275b23d in HeapProfileTable::FillOrderedProfile 
(this=0x7fd212d1e040, buf=0x7fd212c1e020 "\nMAPPED_LIBRARIES:\n", size=1048575) 
at src/heap-profile-table.cc:425
#18 0x00007fd21275ca32 in DoGetHeapProfileLocked (buf=0x7fd212c1e020 
"\nMAPPED_LIBRARIES:\n", buflen=1048576) at src/heap-profiler.cc:191
#19 0x00007fd21275cc20 in DumpProfileLocked (reason=<optimized out>) at 
src/heap-profiler.cc:248
#20 0x00007fd21275cdbc in MaybeDumpProfileLocked () at src/heap-profiler.cc:289
#21 0x00007fd21275cfc6 in MaybeDumpProfileLocked () at src/heap-profiler.cc:262
#22 RecordAlloc (ptr=0x33de000, bytes=67108488, skip_count=0) at 
src/heap-profiler.cc:307
#23 0x00007fd21275817a in MallocHook::InvokeNewHookSlow (p=0x33de000, 
s=67108488) at src/malloc_hook.cc:525
#24 0x00007fd212767570 in InvokeNewHook (s=67108488, p=<optimized out>) at 
src/malloc_hook-inl.h:161
#25 tc_newarray (size=67108488) at src/tcmalloc.cc:1566
[application code]

The longest line in /proc/pid/maps is 167 characters, with trailing newline.

Original comment by nelh...@nelhage.com on 2 Jun 2012 at 4:32

GoogleCodeExporter commented 9 years ago
For what it's worth, the attached patch is a moderately sketchy workaround that 
let me successfully use gperftools against my glibc 2.15. I am not proposing it 
for merge, just as a temporary solution for anyone else struggling to make this 
work. I *believe* it to be safe, at least as long as the application does not 
call GetHeapProfile, but I am not completely sure. It works in my relatively 
minimal testing.

Original comment by nelh...@nelhage.com on 2 Jun 2012 at 5:09

Attachments:

GoogleCodeExporter commented 9 years ago
Why does glibc allocate memory in sscanf, anyway?  On quick inspection, the 
Bionic, FreeBSD, and Apple versions don't.  The glibc code is, IMO, some of the 
least comprehensible code I've ever seen.  I have no idea what's going on.

In any case, Apple's implementation is self-contained.  The attached ugly patch 
adds Apple's sscanf and uses it.

I haven't checked the licenses for compatibility, and this patch probably does 
the wrong thing on non-Linux.

Original comment by aml...@gmail.com on 26 Jun 2012 at 11:42

Attachments:

GoogleCodeExporter commented 9 years ago
There's a bug in that patch: apple_scanf.c needs #include <stdlib.h> to avoid 
incorrect code generation.  Sometimes I hate C.

Better version attached.

Original comment by aml...@gmail.com on 27 Jun 2012 at 12:36

Attachments:

GoogleCodeExporter commented 9 years ago
This bug is also hitting on Fedora 17, which uses glibc-2.15.  Basically this 
bug makes tcmalloc's heap-check or heap-profiler useless on Fedora 17 if one 
doesn't compile a glibc-2.16 or patch gperftools.

IMHO, I don't think you (gperftools' developers) should rely on glibc's 
implementation detail of sscanf not calling to realloc.  It will break whenever 
the file name is long, as mentioned in the OP's comment, or whenever glibc guys 
change their mind.

The approaches that come out of my mind (you guys may have better solutions):
1) Make the lock reentrant, which is a lot easier but introduces performance 
penalty;
2) Removes all calls to glibc that will potentially need dynamic memory, and 
provides your own alternatives, which introduces a lot of engineering work.

If I were a gperftools developer, I would first try approach 1 for a quick 
hack, then approach 2 for long term fix, gperftools should be quite stand alone 
and self contained because it is such a low level library.

Original comment by likan...@gmail.com on 2 Jul 2012 at 11:07

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
i got the same bug as "Comment 5 by nelh...@nelhage.com, Jun 1, 2012" metioned 
on Ubuntu Precise. After applying the patch, heap profiler can work but it 
seems heap checker also have a deadlock issue. Here is the trace:

#0  0x00007fa4a2360510 in __nanosleep_nocancel () at 
../sysdeps/unix/syscall-template.S:82
#1  0x00007fa4a2ab87c9 in base::internal::SpinLockDelay (w=0x7fa4a2cce2a0, 
value=2, loop=11037) at ./src/base/spinlock_linux-inl.h:89
#2  0x00007fa4a2ab8a52 in SpinLock::SlowLock (this=0x7fa4a2cce2a0) at 
src/base/spinlock.cc:132
#3  0x00007fa4a2a9f94c in Lock (this=0x7fa4a2cce2a0) at src/base/spinlock.h:75
#4  SpinLockHolder (l=0x7fa4a2cce2a0, this=0x7fffbd2a23d0) at 
src/base/spinlock.h:141
#5  NewHook (ptr=0x1f48000, size=256) at src/heap-checker.cc:577
#6  0x00007fa4a2aae17a in MallocHook::InvokeNewHookSlow (p=0x1f48000, s=256) at 
src/malloc_hook.cc:525
#7  0x00007fa4a2abb6cb in InvokeNewHook (s=256, p=0x1f48000) at 
src/malloc_hook-inl.h:161
#8  do_realloc_with_callback (invalid_get_size_fn=<optimized out>, 
invalid_free_fn=<optimized out>, new_size=<optimized out>,
    old_ptr=<optimized out>) at src/tcmalloc.cc:1221
#9  do_realloc (new_size=<optimized out>, old_ptr=<optimized out>) at 
src/tcmalloc.cc:1228
#10 tc_realloc (old_ptr=<optimized out>, new_size=256) at src/tcmalloc.cc:1526
#11 0x00007fa4a1fec66f in _IO_vfscanf_internal (s=<optimized out>, 
format=<optimized out>, argptr=0x7fffbd2a2998, errp=0x0)
    at vfscanf.c:1413
#12 0x00007fa4a2005360 in _IO_vsscanf (
    string=0x7fffbd2a2c70 "00400000-00408000 r-xp 00000000 fc:00 24166", ' ' <repeats 30 times>, "/path/to/program", format=0x7fa4a2ac3f30 "%lx-%lx %4s %lx %x:%x %ld %n", args=0x7fffbd2a2998) at iovsscanf.c:45
#13 0x00007fa4a1ff0317 in __sscanf (s=<optimized out>, format=<optimized out>) 
at sscanf.c:34
#14 0x00007fa4a2ab69ab in ProcMapsIterator::NextExt (this=0x7fffbd2a4070, 
start=0x7fffbd2a2c38, end=0x7fffbd2a2c40, flags=0x7fffbd2a2c58,
    offset=0x7fffbd2a2c48, inode=0x7fffbd2a2c50, filename=0x7fffbd2a2c60, file_mapping=0x0, file_pages=0x0, anon_mapping=0x0,
    anon_pages=0x0, dev=0x0) at src/base/sysinfo.cc:721
#15 0x00007fa4a2ab6cbf in ProcMapsIterator::Next (this=<optimized out>, 
start=<optimized out>, end=<optimized out>, flags=<optimized out>,
    offset=<optimized out>, inode=<optimized out>, filename=0x7fffbd2a2c60) at src/base/sysinfo.cc:661
#16 0x00007fa4a2aa22a9 in HeapLeakChecker::UseProcMapsLocked 
(proc_maps_task=HeapLeakChecker::DISABLE_LIBRARY_ALLOCS)
    at src/heap-checker.cc:909
#17 0x00007fa4a2aa29f0 in HeapLeakChecker_InternalInitStart () at 
src/heap-checker.cc:2032
#18 0x00007fa4a2a9a1ec in google_init_module_init_start () at 
src/heap-checker.cc:2122
#19 GoogleInitializer (dtor=0, ctor=<optimized out>, name=0x7fa4a2abfcba 
"init_start", this=0x7fa4a2cce2e0) at src/base/googleinit.h:45
#20 __static_initialization_and_destruction_0 (__initialize_p=<optimized out>, 
__priority=<optimized out>) at src/heap-checker.cc:2122
#21 _GLOBAL__sub_I_heap_checker.cc(void) () at src/heap-checker.cc:2372
#22 0x00007fa4a34a6306 in call_init (l=<optimized out>, argc=5, 
argv=0x7fffbd2a4338, env=0x7fffbd2a4368) at dl-init.c:85
#23 0x00007fa4a34a63df in call_init (env=<optimized out>, argv=<optimized out>, 
argc=<optimized out>, l=<optimized out>) at dl-init.c:52
#24 _dl_init (main_map=0x7fa4a36bb2c8, argc=5, argv=0x7fffbd2a4338, 
env=0x7fffbd2a4368) at dl-init.c:134
#25 0x00007fa4a34986ea in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
Hope this helps.

Original comment by xgw.s...@gmail.com on 16 Jul 2012 at 7:37

GoogleCodeExporter commented 9 years ago
It would be great to get these deadlock issues resolved; this is creating a bit 
of pain in the current Ubuntu development release where we would like to gate 
this package on successful execution of its test suite.

Original comment by ringo.p...@gmail.com on 23 Jul 2012 at 2:17

GoogleCodeExporter commented 9 years ago
Moving priority up on this one. I am going to try and at least get something 
temporary in place to get people unblocked on this ugly issue. I'll keep this 
issue updated on my progress.

Original comment by chapp...@gmail.com on 28 Oct 2012 at 8:02

GoogleCodeExporter commented 9 years ago

Original comment by chapp...@gmail.com on 22 Dec 2012 at 5:54