mzhaom / gperftools

Fast, multi-threaded malloc() and nifty performance analysis tools
https://code.google.com/p/gperftools/
BSD 3-Clause "New" or "Revised" License

heapleakchecker test never finishes (inf. loop or deadlock) #116

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Build perftools 1.1 with latest libunwind-0.99-beta
2. ./configure; make; make check

What is the expected output? What do you see instead?

===============================================================

HeapLeak checker deadlock (infinite loop?) info:

Testing ./heap-checker_unittest with HEAPCHECK=local ... 
<runs forever, memory keeps growing out of bounds>

Child stack:
(gdb) bt
#0  0x0000003bd910ba81 in nanosleep () from /lib64/libpthread.so.0
#1  0x00002aaaaaadec7f in SpinLock::SlowLock (this=0x2aaaaac0fc40)
    at src/base/spinlock.cc:99
#2  0x00002aaaaaac6636 in DeleteHook (ptr=0x798080) at src/base/spinlock.h:72
#3  0x00002aaaaaae06dc in free (ptr=0x798080) at src/malloc_hook-inl.h:107
#4  0x00002aaaab0aa6cd in ?? () from /lib64/libnss_ldap.so.2
#5  0x00002aaaab0aa738 in ?? () from /lib64/libnss_ldap.so.2
#6  0x00002aaaab0a78cc in ?? () from /lib64/libnss_ldap.so.2
#7  0x00002aaaab0a0696 in ?? () from /lib64/libnss_ldap.so.2
#8  0x00002aaaab093a58 in ?? () from /lib64/libnss_ldap.so.2
#9  0x00002aaaab093c31 in ?? () from /lib64/libnss_ldap.so.2
#10 0x0000003bd8892797 in fork () from /lib64/libc.so.6
#11 0x00002aaaaaad879d in HeapProfileTable::Snapshot::ReportLeaks (
    this=0x2aaaab5e4640, checker_name=<value optimized out>, 
    filename=0x2aaaab5ecf40 "/tmp/lt-heap-checker_unittest.22298.trick-end.heap")
    at src/heap-profile-table.cc:549
#12 0x00002aaaaaacbddd in HeapLeakChecker::DoNoLeaks (this=0x7ffffffce950, 
    check_type=<value optimized out>, fullness=<value optimized out>, 
    report_mode=<value optimized out>) at src/heap-checker.cc:1712
#13 0x000000000040bcd8 in HeapLeakChecker::BriefNoLeaks (this=0x7ffffffcd880)
    at ./src/google/heap-checker.h:148
#14 0x000000000040387b in RunSilent (check=<value optimized out>, 
    func=0x40bcc0 <HeapLeakChecker::BriefNoLeaks()>)
    at src/tests/heap-checker_unittest.cc:396
#15 0x0000000000405854 in VerifyLeaks (check=0x7ffffffce950, 
    type=<value optimized out>, leaked_bytes=1600, leaked_objects=2)
    at src/tests/heap-checker_unittest.cc:410
#16 0x0000000000405e7a in TestLeakButTotalsMatch ()
    at src/tests/heap-checker_unittest.cc:608
#17 0x000000000040a9a5 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at src/tests/heap-checker_unittest.cc:1381
(gdb) q

Parent is just doing a read() from the child socket.

Note that a number of threads are also each burning a small amount of CPU
(about 2% x 10+ threads = roughly 25% of 1 CPU).

What version of the product are you using? On what operating system?

perftools 1.1, libunwind-0.99-beta, linux x86-64

Please provide any additional information below.

Original issue reported on code.google.com by mrab...@gmail.com on 29 Mar 2009 at 7:15

GoogleCodeExporter commented 9 years ago
The problem seems to be that execve can do malloc operations, so when you fork,
both the parent process and the child process (which is trying to run execve)
can fight for the lock used by the heap-checker.  The fix is to turn off
heap-checking in the child.  I will have that be a part of the next perftools
release; it should fix this problem.

Original comment by csilv...@gmail.com on 30 Mar 2009 at 11:21

GoogleCodeExporter commented 9 years ago
In my setup (with my libc), you have to turn off the various hooks in the parent
_before_ the fork(); it's not sufficient to turn them off in the child after the
fork() but before the exec().

The fork() itself causes various allocations via the nss_ldap lib (most likely
through getpwuid() or similar). 

Original comment by mrab...@gmail.com on 31 Mar 2009 at 6:18

GoogleCodeExporter commented 9 years ago
Ugh, that's too bad.  Thanks for the heads up; I'll make the relevant changes.

Original comment by csilv...@gmail.com on 31 Mar 2009 at 6:20

GoogleCodeExporter commented 9 years ago
This is fixed in perftools 1.2, just released.

Original comment by csilv...@gmail.com on 18 Apr 2009 at 12:15

GoogleCodeExporter commented 9 years ago
I still get a freeze with 1.2 with the heap leak checker (freezes on the
HEAPCHECK=normal test).  The attached patch fixes the problem for me in 1.2.

Original comment by mrab...@gmail.com on 11 May 2009 at 11:40

GoogleCodeExporter commented 9 years ago

Original comment by csilv...@gmail.com on 12 May 2009 at 4:14

GoogleCodeExporter commented 9 years ago
Doh! I see the problem: when I cancel the hooks (in heap-checker.cc), I
neglected to cancel the Delete hook.  In fact, I didn't cancel the New hook
correctly either.  I'm a bit surprised folks don't see a crash there.  I think
it's because I only assert() correctness, and most folks don't run with
assertions on (maybe?)

I'll fix this for the next release -- hopefully for real, this time!

(btw, the patch you have isn't safe: users can trigger a leak-check at any
time, and if they trigger one while other threads are running, then disabling
the hooks at symbolize time, like you do, will mess up the data-collecting on
all other threads.  For that reason, I ended up with a fix that only disables
the hooks for the end-of-program (atexit) leak-check.  My mistake was not
disabling all the hooks.)

Original comment by csilv...@gmail.com on 12 May 2009 at 4:27

GoogleCodeExporter commented 9 years ago
Can you try the following patch, and see if it fixes the problem for you?  If
so, I'll make it part of the next release.

--- /tmp/tmp.20891.15   2009-05-12 16:08:41.000000000 -0700
+++ /home/csilvers/opensource/google-perftools/src/heap-checker.cc      2009-05-12 16:06:21.713200000 -0700
@@ -1724,6 +1724,10 @@
       // typically only want to report once in a program's run, at the
       // very end.
       CancelInitialMallocHooks();
+      if (MallocHook::GetNewHook() == NewHook)                                 
+        MallocHook::SetNewHook(NULL);                                          
+      if (MallocHook::GetDeleteHook() == DeleteHook)                           
+        MallocHook::SetDeleteHook(NULL);                                       
       have_disabled_hooks_for_symbolize = true;
       leaks->ReportLeaks(name_, pprof_file, true);  // true = should_symbolize
     } else {

Original comment by csilv...@gmail.com on 12 May 2009 at 11:09

GoogleCodeExporter commented 9 years ago
This patch fixes my problem.  All unittests and my test programs pass.

Original comment by mrab...@gmail.com on 13 May 2009 at 12:12

GoogleCodeExporter commented 9 years ago
This should be fixed in perftools 1.3, just released.

Original comment by csilv...@gmail.com on 10 Jun 2009 at 2:00