Gwinel / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Hang while running make check using 2.1 source distrib on RedHat 6.4 64-bit with libunwind-0.99-beta #579

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
* What steps will reproduce the problem?

1. wget libunwind-0.99-beta.tar.gz, configure, build, install to custom dir
2. wget gperftools-2.1.tar.gz, configure, build
3. make check

* What is the expected output? What do you see instead?

Make check runs, then hangs. Last 4 lines of output are:

PASS: heap-profiler_unittest.sh
PASS: simple_compat_test
rm -f heap-checker_unittest.sh
cp -p ./src/tests/heap-checker_unittest.sh heap-checker_unittest.sh

* What version of the product are you using? On what operating system?
2.1, on RedHat 6.4 with GCC 4.4.7.

Please provide any additional information below.

$ ps -fuddevienne
UID        PID  PPID  C STIME TTY          TIME CMD
4038     15244 15241  0 11:12 ?        00:00:00 sshd: ddevienne@pts/1
4038     15245 15244  0 11:12 pts/1    00:00:00 -tcsh
4038     21386 15245  0 12:01 pts/1    00:00:00 make check
4038     21422 21386  0 12:01 pts/1    00:00:00 make check-TESTS
4038     21429 21422  0 12:01 pts/1    00:00:00 /bin/sh -c set +e; 
bases='low_level_alloc_unittest.log atomicops_unittest.log stacktrace_unittest.
4038     21438 21429  0 12:01 pts/1    00:00:00 make test-suite.log 
TEST_LOGS=low_level_alloc_unittest.log atomicops_unittest.log stacktrace_unitt
4038     24150 21438  0 12:02 pts/1    00:00:00 /bin/sh -c 
p='heap-checker_unittest.sh'; \?b='heap-checker_unittest.sh'; \?case $- in *e*) 
set +e;
4038     24154 24150  0 12:02 pts/1    00:00:00 /bin/sh ./test-driver 
--test-name heap-checker_unittest.sh --log-file heap-checker_unittest.sh.log
4038     24155 24154  0 12:02 pts/1    00:00:00 /bin/sh 
./heap-checker_unittest.sh
4038     24192 24155  0 12:02 pts/1    00:00:01 
/oradata1/ddevienne/oss/gperftools-2.1/.libs/lt-heap-checker_unittest
4038     24465 25416  0 13:01 pts/0    00:00:00 ps -fuddevienne
4038     25415 25413  0 Sep30 ?        00:00:01 sshd: ddevienne@pts/0
4038     25416 25415  0 Sep30 pts/0    00:00:00 -tcsh

$ gdb /oradata1/ddevienne/oss/gperftools-2.1/.libs/lt-heap-checker_unittest 
24192
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from 
/oradata1/ddevienne/oss/gperftools-2.1/.libs/lt-heap-checker_unittest...done.
Attaching to program: 
/oradata1/ddevienne/oss/gperftools-2.1/.libs/lt-heap-checker_unittest, process 
24192
Reading symbols from 
/oradata1/ddevienne/oss/gperftools-2.1/.libs/libtcmalloc.so.4...done.
Loaded symbols for /oradata1/ddevienne/oss/gperftools-2.1/.libs/libtcmalloc.so.4
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols 
found)...done.
[New LWP 24222]
[New LWP 24221]
[New LWP 24220]
[New LWP 24219]
[New LWP 24218]
[New LWP 24217]
[New LWP 24216]
[New LWP 24215]
[New LWP 24214]
[New LWP 24213]
[New LWP 24212]
[New LWP 24211]
[New LWP 24210]
[New LWP 24209]
[New LWP 24208]
[New LWP 24207]
[New LWP 24206]
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols 
found)...done.
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /oradata1/ddevienne/oss/lib/libunwind.so.7...done.
Loaded symbols for /oradata1/ddevienne/oss/lib/libunwind.so.7
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols 
found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_compat.so.2...(no debugging symbols 
found)...done.
Loaded symbols for /lib64/libnss_compat.so.2
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libnss_nis.so.2...(no debugging symbols 
found)...done.
Loaded symbols for /lib64/libnss_nis.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols 
found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x0000003e5fc0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6.x86_64 
libgcc-4.4.7-3.el6.x86_64 libstdc++-4.4.7-3.el6.x86_64
(gdb)
(gdb)
(gdb) bt
#0  0x0000003e5fc0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003e5fc09388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003e5fc09257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fe21770dde8 in get_rs_cache (c=0x7fff79791970) at dwarf/Gparser.c:496
#4  _ULx86_64_dwarf_find_save_locs (c=0x7fff79791970) at dwarf/Gparser.c:814
#5  0x00007fe21770e3f9 in _ULx86_64_dwarf_step (c=0x7fff79791970) at 
dwarf/Gstep.c:35
#6  0x00007fe21771059a in _ULx86_64_step (cursor=<value optimized out>) at 
x86_64/Gstep.c:42
#7  0x00007fe21796f551 in GetStackTrace (result=0x7fff79792160, max_depth=42, 
skip_count=<value optimized out>)
    at src/stacktrace_libunwind-inl.h:96
#8  0x00007fe217964aa8 in MallocHook_GetCallerStackTrace 
(result=0x7fff79792428, max_depth=1, skip_count=<value optimized out>)
    at src/malloc_hook.cc:665
#9  0x00007fe21796cf91 in GetCallerStackTrace (start=<value optimized out>, 
size=131072) at ./src/gperftools/malloc_hook.h:262
#10 MemoryRegionMap::RecordRegionAddition (start=<value optimized out>, 
size=131072) at src/memory_region_map.cc:459
#11 0x00007fe217964223 in MallocHook::InvokeMmapHookSlow 
(result=0x7fe20c489000, start=0x0, size=131072, protection=3, flags=34, fd=-1, 
offset=0)
    at src/malloc_hook.cc:549
#12 0x00007fe217973983 in InvokeMmapHook (start=0x0, length=131072, prot=3, 
flags=34, fd=-1, offset=0) at src/malloc_hook-inl.h:219
#13 mmap (start=0x0, length=131072, prot=3, flags=34, fd=-1, offset=0) at 
src/malloc_hook_mmap_linux.h:177
#14 0x00007fe2179674dd in DoAllocWithArena (request=<value optimized out>, 
arena=0x7fe21792b180) at src/base/low_level_alloc.cc:466
#15 0x00007fe217973e2e in LowLevelAlloc::AllocWithArena (request=65560, 
arena=0x7fe21792b180) at src/base/low_level_alloc.cc:511
#16 0x00007fe21795ba93 in HeapLeakChecker::Allocator::Allocate (n=<value 
optimized out>) at src/heap-checker.cc:310
#17 0x00007fe21796a325 in 
New<AddressMap<HeapProfileTable::AllocValue>::Cluster> (this=0x7fe20c4c89b8, 
key=0xf4c000, value=...)
    at src/addressmap-inl.h:258
#18 FindCluster (this=0x7fe20c4c89b8, key=0xf4c000, value=...) at 
src/addressmap-inl.h:225
#19 AddressMap<HeapProfileTable::AllocValue>::Insert (this=0x7fe20c4c89b8, 
key=0xf4c000, value=...) at src/addressmap-inl.h:310
#20 0x00007fe217968cfa in Add (ptr=<value optimized out>, v=<value optimized 
out>, snapshot=0x7fe20c4c8980) at src/heap-profile-table.h:408
#21 HeapProfileTable::AddToSnapshot (ptr=<value optimized out>, v=<value 
optimized out>, snapshot=0x7fe20c4c8980)
    at src/heap-profile-table.cc:564
#22 0x00007fe217968d98 in Iterate<HeapProfileTable::Snapshot*> (this=<value 
optimized out>) at src/addressmap-inl.h:414
#23 HeapProfileTable::TakeSnapshot (this=<value optimized out>) at 
src/heap-profile-table.cc:552
#24 0x00007fe2179590d6 in HeapLeakChecker::Create (this=0x7fff79792980, 
name=0x409641 "trivial", make_start_snapshot=true)
    at src/heap-checker.cc:1593
#25 0x00000000004072f3 in TestHeapLeakChecker (argc=<value optimized out>, 
argv=<value optimized out>) at src/tests/heap-checker_unittest.cc:579
#26 main (argc=<value optimized out>, argv=<value optimized out>) at 
src/tests/heap-checker_unittest.cc:1456
(gdb)

Red Hat Enterprise Linux Workstation release 6.4 (Santiago)
Linux ... 2.6.32-358.el6.x86_64 #1 SMP Tue Jan 29 11:47:41 EST 2013 x86_64 
x86_64 x86_64 GNU/Linux

Note sure what else to provide. Quite likely a user error.

Original issue reported on code.google.com by ddevie...@gmail.com on 1 Oct 2013 at 11:15

GoogleCodeExporter commented 9 years ago
I will need backtraces of all threads.

Also, please note that, RHEL6 issue was discovered in 2.1 and is fixed already 
on master.

Original comment by alkondratenko on 1 Oct 2013 at 5:25

GoogleCodeExporter commented 9 years ago
Here's the full backtrace for all threads. Let me know if you need anything 
else. Thanks, --DD

Original comment by ddevie...@gmail.com on 2 Oct 2013 at 1:03

Attachments:

GoogleCodeExporter commented 9 years ago
I need to think about your case a bit more (and can't do in now in normal work 
time). But there's great chance that treatment proposed in 
https://groups.google.com/forum/#!msg/google-perftools/PQR4zQ6CQYs/53RAJe-leW8J 
will help you.

Original comment by alkondratenko on 2 Oct 2013 at 8:18

GoogleCodeExporter commented 9 years ago
Indeed.

Thread 1 took memory region map (as part of initialization of heap checker) and 
then while holding this lock it did allocation via low level alloc that called 
mmap, that our mmap recorder intercepted and called libunwind to capture 
backtrace. Libunwind tries to grab it's internal mutex.

Meanwhile thread 10 was recording allocation and invoked libunwind to capture 
backtrace. That took libunwind lock. But libunwind internally did mmap which we 
tried to record and as part of that take memory region map.

One of my commits above fixes exactly that lack of ordering between memory 
region map lock and libunwind lock.

I think I'll merge it today.

Thanks for raising it and being quick with providing all necessary evidence.

Original comment by alkondratenko on 12 Oct 2013 at 10:20

GoogleCodeExporter commented 9 years ago
Merged 
https://code.google.com/p/gperftools/source/detail?r=db0d5730ee059d72b895fbead52
37f9cb5bbf98a

Thanks

Original comment by alkondratenko on 12 Oct 2013 at 11:20