luftreich / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License

tcmalloc v0.98 (and v1.2) unittest failed on linux #64

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
  1. Just "make check"

What is the expected output? What do you see instead?
  ======================================
  2 of 22 tests failed
  Please report to opensource@google.com
  ======================================

What version of the product are you using? On what operating system?
  google-perftools-0.98.tar.gz
  Linux 2.6.16.46-0.12-bigsmp #1 SMP
  gcc (GCC) 4.1.2 20070115 (prerelease) (SUSE Linux)

Please provide any additional information below.
  see attachments.

Original issue reported on code.google.com by lxu4...@gmail.com on 25 Jun 2008 at 1:38

Attachments:

GoogleCodeExporter commented 9 years ago
  I'm not on x86-64 Linux. I also tried "./configure --enable-frame-pointer &&
make && make check", but the result is unchanged.

Original comment by lxu4...@gmail.com on 25 Jun 2008 at 1:49

GoogleCodeExporter commented 9 years ago
Hmm, unfortunately the core file isn't so useful without the binaries that you're
using.  I don't know what might be causing the segfaults in the heap-checker.  I
think you'll need to do all the legwork on your side, I'm afraid.  Try running gdb on
the binary (and the core file), to get some idea where the code is crashing.  You may
find it easier to do './configure CXXFLAGS=-g' to turn off optimization.  If you can
find out where it's crashing -- get a backtrace, print some relevant values, etc --
it may become obvious what's going wrong and how to fix it.  Otherwise, I'm afraid
there's not much we can do here.

Original comment by csilv...@gmail.com on 25 Jun 2008 at 1:54

GoogleCodeExporter commented 9 years ago
Core was generated by 
`/home/lxu/google-perftools-0.98/.libs/lt-heap-checker_unittest'.
Program terminated with signal 11, Segmentation fault.

#0  0xb7dd11d7 in ListAllProcessThreads (parameter=0x0, callback=0xb7dc60e4
<HeapLeakChecker::IgnoreLiveThreadsLocked(void*, int, int*, char*)>)
    at src/base/linuxthreads.c:650
#1  0xb7dc6709 in HeapLeakChecker::IgnoreAllLiveObjectsLocked
(self_stack_top=0xbfa5f964) at src/heap-checker.cc:1118
#2  0xb7dc6989 in HeapLeakChecker::DumpProfileLocked (this=0x80c33e0,
profile_type=HeapLeakChecker::START_PROFILE, self_stack_top=0xbfa5f964,
    alloc_bytes=0x80c33e8, alloc_objects=0x80c33ec) at src/heap-checker.cc:1439
#3  0xb7dc7a52 in HeapLeakChecker::Create (this=0x80c33e0, name=0xb7dd9591 
"_main_")
at src/heap-checker.cc:1477
#4  0xb7dc7bc8 in HeapLeakChecker (this=0x80c33e0) at src/heap-checker.cc:1499
#5  0xb7dc848c in HeapLeakChecker::InternalInitStart () at 
src/heap-checker.cc:2082
#6  0xb7dc858f in google_init_module_init_start () at src/heap-checker.cc:2099
#7  0x0805117f in GoogleInitializer (this=0xb7dfa650, name=0xb7dd7b79 
"init_start",
f=0xb7dc8578 <google_init_module_init_start>)
    at ./src/base/googleinit.h:40
#8  0xb7dc1b8b in __static_initialization_and_destruction_0 (__initialize_p=1,
__priority=65535) at src/heap-checker.cc:2099
#9  0xb7dc1bc9 in global constructors keyed to
_ZN62FLAG__namespace_do_not_use_directly_use_DECLARE_string_instead16FLAGS_heap_
checkE ()
    at src/heap-checker.cc:2483
#10 0xb7dd5375 in __do_global_ctors_aux () from
/home/lxu/google-perftools-0.98/.libs/libtcmalloc.so.0
#11 0xb7daa7ad in _init () from 
/home/lxu/google-perftools-0.98/.libs/libtcmalloc.so.0
#12 0xb7efd7d3 in call_init () from /lib/ld-linux.so.2
#13 0xb7efd8e3 in _dl_init_internal () from /lib/ld-linux.so.2
#14 0xb7ef087f in _dl_start_user () from /lib/ld-linux.so.2

Original comment by lxu4...@gmail.com on 25 Jun 2008 at 2:36

GoogleCodeExporter commented 9 years ago
It looks like line 650 is assigning to errno.  Does that match what
you have in your version of the source code?

That's tough.  It looks like what's happening is this code is running
before libc has had a chance to set up errno (perhaps before it's had
a chance to set up threads?)  That's a little confusing, but it's the
only explanation that makes sense for why it would crash on that line
in particular.  Maybe you can go through in gdb and make sure it's
actually crashing on the errno assignment.  Disassembling the code at
that point may be helpful as well, to make sure it's errno that's not
set properly.

Maybe it's some bug in gcc (you say you're using a prerelease version of gcc 4.1.2?)
If you can, try a different gcc and see if you still get a crash.  I don't think the
leak-checker is doing anything wrong, though.

Original comment by csilv...@gmail.com on 25 Jun 2008 at 7:34
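
To make the errno hypothesis above a bit more concrete: on glibc-style C libraries, `errno` is not a plain global but a macro that resolves to a per-thread location, so an assignment to it depends on libc's own setup having run. The following is only a minimal sketch of that idea plus the usual save/restore pattern; `ErrnoSlot` and `RunPreservingErrno` are hypothetical names, not perftools code.

```cpp
#include <cerrno>

// Returns the address the `errno` macro resolves to in the calling thread.
// Assumption: a glibc-style libc, where this goes through a function call
// rather than naming a plain global variable.
int* ErrnoSlot() { return &errno; }

// Runs `fn`, then restores whatever errno value the caller had before,
// so library-internal calls do not leak their errno changes.
template <typename Fn>
void RunPreservingErrno(Fn fn) {
  const int saved_errno = errno;
  fn();
  errno = saved_errno;
}
```

If the crash really is on the errno store, stepping to that instruction in gdb and checking the address being written would confirm or rule this out.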

GoogleCodeExporter commented 9 years ago
I ran the unit tests on "Red Hat Enterprise Linux AS release 3 (Taroon)".
There is only 1 failure, and no core file.

env :
  Linux dev-x 2.4.21-4.ELsmp #1 SMP
  gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-20)
  glibc-2.3.2-95.3

result :
  Testing ./heap-checker_unittest with HEAPCHECK= ...
  FAIL
  Test was taking unexpectedly long time to run and so we aborted it.
  Try the test case manually or raise the timeout from 120
  to distinguish test slowness from a real problem.
  Output from failed run:
  ---
  ---
  FAIL: heap-checker-death_unittest.sh

Original comment by lxu4...@gmail.com on 25 Jun 2008 at 8:34

GoogleCodeExporter commented 9 years ago
Yes, I'm not so worried about that error.  Sometimes the unittest does, indeed, take
a long time.  You could try doing as it suggests (run the test case manually) to make
sure everything is actually ok.  I've not been able to test very much on RHEL, and I
know they do some weird things like putting fences around their stacks, so it's
conceivable there's a problem somewhere.  But I don't think this error message, in
particular, is particularly indicative of that.

Original comment by csilv...@gmail.com on 25 Jun 2008 at 8:40

GoogleCodeExporter commented 9 years ago
Have you had a chance to look into this any more?  Are you still seeing the crashes
you've described?  I'd like to help resolve the problems here, but I'm not sure how
much I can do with the information you have so far; I can't reproduce these problems
on the machines I have access to.  But if you have any more discoveries you've made,
I'm happy to look at this some more!

Original comment by csilv...@gmail.com on 15 Jul 2008 at 6:04

GoogleCodeExporter commented 9 years ago
It's been over half a year with no feedback from the original poster, so I'm closing
this bug.  Hopefully the issue was magically fixed since v0.98, but if anyone sees
this problem again, and has more data to help track it down, feel free to reopen
the bug.

Original comment by csilv...@gmail.com on 6 Mar 2009 at 5:58

GoogleCodeExporter commented 9 years ago
I continue to encounter this failure (with 1.2).  While the OP encountered it on
RHEL, I see this on RH9.

Testing ./heap-checker_unittest with HEAPCHECK=strict ... OK
PASS
PASS: heap-checker_unittest.sh
Testing ./heap-checker_unittest with HEAPCHECK= ... FAIL
Test was taking unexpectedly long time to run and so we aborted it.
Try the test case manually or raise the timeout from 6000
to distinguish test slowness from a real problem.
Output from failed run:

Note that I changed the timeout from 120 to 6000 and the test still fails.  While
the test was in progress, I ran strace on the lt-heap-checker_unittest process
and noticed this:

$ strace -p 2086
nanosleep({0, 2000001}, NULL)           = 0
nanosleep({0, 2000001}, NULL)           = 0
nanosleep({0, 2000001}, NULL)           = 0
nanosleep({0, 2000001}, NULL)           = 0
nanosleep({0, 2000001}, NULL)           = 0
nanosleep({0, 2000001}, NULL)           = 0
nanosleep({0, 2000001}, NULL)           = 0
...
...
I'll post some more information again later.

Original comment by app...@gmail.com on 11 May 2009 at 12:06

GoogleCodeExporter commented 9 years ago
Thanks for the report!  This strace indicates that thread is waiting on a lock.  I'm
not sure why though.  My guess is it's because some library routine that we call
during heap-checking allocates memory, which results in a recursive call to the
heap-checker code.

A good next step would be to connect to this process via gdb, if you can.  Then you
can use 'bt' to see where the thread is.  Use 'info threads' to list all threads,
'thread i' to switch to each thread, and 'bt' to see what each thread is doing.

I actually test on an RH9 machine, and the heap-checker passes fine there, so I'm not
exactly sure what might be going on!

Original comment by csilv...@gmail.com on 11 May 2009 at 3:21
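
The recursion guess above can be sketched in isolation. This is a minimal illustration, not the perftools implementation: `TinySpinLock`, `g_lock`, and `OnAllocation` are hypothetical stand-ins for a non-recursive lock and an allocation hook that needs it. A try-lock is used so the sketch reports the nested acquisition instead of hanging the way a blocking lock would.

```cpp
#include <atomic>

// Hypothetical non-recursive lock: a single word, 0 = free, 1 = held.
class TinySpinLock {
 public:
  // Attempts to take the lock once; returns true on success.
  bool TryLock() {
    int expected = 0;
    return word_.compare_exchange_strong(expected, 1,
                                         std::memory_order_acquire);
  }
  void Unlock() { word_.store(0, std::memory_order_release); }

 private:
  std::atomic<int> word_{0};
};

static TinySpinLock g_lock;        // hypothetical lock guarding checker state
static int g_nested_failures = 0;  // nested acquisitions that could not succeed

// Hypothetical allocation hook that must hold g_lock to do its bookkeeping.
void OnAllocation(int depth) {
  if (!g_lock.TryLock()) {
    // A blocking Lock() would wait forever here: the owner is this thread.
    ++g_nested_failures;
    return;
  }
  if (depth == 0) {
    // Simulates library code that allocates while the lock is held,
    // re-entering the hook on the same thread.
    OnAllocation(depth + 1);
  }
  g_lock.Unlock();
}
```

With a single thread, as in the report, nothing else can ever release the lock, which would match a process that only ever shows nanosleep() in strace.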

GoogleCodeExporter commented 9 years ago
OK, some additional information:

This GDB was configured as "i686-pc-linux-gnu".
(gdb) attach 4571
Attaching to process 4571
Reading symbols from
/u/giridhar/junk/google-perftools-1.2/.libs/lt-heap-checker_unittest...done.
Using host libthread_db library "/lib/libthread_db.so.1".
Reading symbols from 
/u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0...done.
Loaded symbols for /u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0
Reading symbols from /usr/lib/libstdc++.so.5...done.
Loaded symbols for /usr/lib/libstdc++.so.5
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 4571)]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0x4016c225 in nanosleep () from /lib/libpthread.so.0
(gdb) bt
#0  0x4016c225 in nanosleep () from /lib/libpthread.so.0
#1  0x400404d9 in SpinLock::SlowLock (this=0x40050eec) at 
src/base/spinlock.cc:99
#2  0x4002a0b3 in NewHook (ptr=0x80dd000, size=704) at spinlock.h:72
#3  0x40041cc7 in operator new (size=704) at src/tcmalloc.cc:1046
#4  0x400fb831 in std::__default_alloc_template<true, 0>::_S_chunk_alloc () from
/usr/lib/libstdc++.so.5
#5  0x400fb73d in std::__default_alloc_template<true, 0>::_S_refill () from
/usr/lib/libstdc++.so.5
#6  0x400fb2ac in std::__default_alloc_template<true, 0>::allocate () from
/usr/lib/libstdc++.so.5
#7  0x401012a8 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.5
#8  0x400fdacd in std::string::_M_mutate () from /usr/lib/libstdc++.so.5
#9  0x40030d3b in std::string::_M_replace_safe<char const*> (this=0x805574c,
__k1=0x40044a1a "", __k2=0x40044a1a "") at basic_string.tcc:533
#10 0x4002f2da in HeapLeakChecker::TurnItselfOffLocked () at basic_string.h:242
#11 0x4002e8d5 in HeapLeakChecker::InternalInitStart () at 
src/heap-checker.cc:1835
#12 0x4002e9ea in google_init_module_init_start () at src/heap-checker.cc:1989
#13 0x4002fa1b in __static_initialization_and_destruction_0 (__initialize_p=0,
__priority=65535) at googleinit.h:40
#14 0x4002ffd6 in global constructors keyed to
_ZN62FLAG__namespace_do_not_use_directly_use_DECLARE_string_instead16FLAGS_heap_
checkE ()
    at addressmap-inl.h:205
#15 0x40041a01 in __do_global_ctors_aux () at malloc_hook-inl.h:179
#16 0x40024331 in _init () from
/u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0
#17 0x4000abe7 in _dl_init_internal () from /lib/ld-linux.so.2
#18 0x40000abd in _dl_start_user () from /lib/ld-linux.so.2
(gdb) thread apply all where

Thread 1 (Thread 16384 (LWP 4571)):
#0  0x4016c225 in nanosleep () from /lib/libpthread.so.0
#1  0x400404d9 in SpinLock::SlowLock (this=0x40050eec) at 
src/base/spinlock.cc:99
#2  0x4002a0b3 in NewHook (ptr=0x80dd000, size=704) at spinlock.h:72
#3  0x40041cc7 in operator new (size=704) at src/tcmalloc.cc:1046
#4  0x400fb831 in std::__default_alloc_template<true, 0>::_S_chunk_alloc () from
/usr/lib/libstdc++.so.5
#5  0x400fb73d in std::__default_alloc_template<true, 0>::_S_refill () from
/usr/lib/libstdc++.so.5
#6  0x400fb2ac in std::__default_alloc_template<true, 0>::allocate () from
/usr/lib/libstdc++.so.5
#7  0x401012a8 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.5
#8  0x400fdacd in std::string::_M_mutate () from /usr/lib/libstdc++.so.5
#9  0x40030d3b in std::string::_M_replace_safe<char const*> (this=0x805574c,
__k1=0x40044a1a "", __k2=0x40044a1a "") at basic_string.tcc:533
#10 0x4002f2da in HeapLeakChecker::TurnItselfOffLocked () at basic_string.h:242
#11 0x4002e8d5 in HeapLeakChecker::InternalInitStart () at 
src/heap-checker.cc:1835
#12 0x4002e9ea in google_init_module_init_start () at src/heap-checker.cc:1989
#13 0x4002fa1b in __static_initialization_and_destruction_0 (__initialize_p=0,
__priority=65535) at googleinit.h:40
#14 0x4002ffd6 in global constructors keyed to
_ZN62FLAG__namespace_do_not_use_directly_use_DECLARE_string_instead16FLAGS_heap_
checkE ()
    at addressmap-inl.h:205
#15 0x40041a01 in __do_global_ctors_aux () at malloc_hook-inl.h:179
#16 0x40024331 in _init () from
/u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0
#17 0x4000abe7 in _dl_init_internal () from /lib/ld-linux.so.2
#18 0x40000abd in _dl_start_user () from /lib/ld-linux.so.2
#0  0x4016c225 in nanosleep () from /lib/libpthread.so.0
(gdb)        
(gdb) 
(gdb) info threads
  1 Thread 16384 (LWP 4571)  0x4016c225 in nanosleep () from /lib/libpthread.so.0
(gdb) 

There is just one thread.

Original comment by app...@gmail.com on 13 May 2009 at 12:04

GoogleCodeExporter commented 9 years ago
(gdb) f 1
#1  0x400404d9 in SpinLock::SlowLock (this=0x40050eec) at 
src/base/spinlock.cc:99
99      nanosleep(&tm, NULL);
(gdb) l
94  
95      // Sleep for a few milliseconds
96      struct timespec tm;
97      tm.tv_sec = 0;
98      tm.tv_nsec = 2000001;
99      nanosleep(&tm, NULL);
100   }
101   errno = saved_errno;
102 }
(gdb) l SpinLock::SlowLock
67  // Hook into global constructor execution:

[snip...]

81    if (lockword_ == 1) {
82      sched_yield();          // Spinning failed. Let's try to be gentle.
83    }
84  
85    while (Acquire_CompareAndSwap(&lockword_, 0, 1) != 0) {
86      // This code was adapted from the ptmalloc2 implementation of
(gdb) 
87      // spinlocks which would sched_yield() upto 50 times before
[snip...]
98      tm.tv_nsec = 2000001;
99      nanosleep(&tm, NULL);
100   }
101   errno = saved_errno;
102 }

I don't know if we are looping in here, or entering this function again because of
some recursion.

(gdb) p lockword_
$1 = 1
(gdb) p AtomicOps_Internalx86CPUFeatures.has_amd_lock_mb_bug
$2 = false
(gdb) c
Continuing.

I find myself at the same location even if I try and break in multiple times.

115 inline Atomic32 Acquire_CompareAndSwap(volatile Atomic32* ptr,
116                                        Atomic32 old_value,
117                                        Atomic32 new_value) {
118   Atomic32 x = NoBarrier_CompareAndSwap(ptr, old_value, new_value);
119   if (AtomicOps_Internalx86CPUFeatures.has_amd_lock_mb_bug) {
120     __asm__ __volatile__("lfence" : : : "memory");
121   }
122   return x;
123 }

Original comment by app...@gmail.com on 13 May 2009 at 12:17
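
For readers following the listing above, here is a bounded variant of the same spin-then-sleep shape, written as a sketch with std::atomic rather than the project's own atomic ops. `SlowLockBounded` and its `max_sleeps` parameter are hypothetical; the real SlowLock() has no such bound, which is why a lock word that stays at 1 keeps the process in nanosleep() indefinitely.

```cpp
#include <atomic>
#include <cerrno>
#include <sched.h>
#include <time.h>

// Tries to move *lockword from 0 to 1, sleeping about 2 ms between attempts.
// Gives up after max_sleeps naps and returns false; returns true on success.
bool SlowLockBounded(std::atomic<int>* lockword, int max_sleeps) {
  const int saved_errno = errno;  // sched_yield()/nanosleep() may change errno
  bool acquired = false;
  if (lockword->load(std::memory_order_relaxed) == 1) {
    sched_yield();                // lock is busy: yield once before sleeping
  }
  for (int naps = 0;; ++naps) {
    int expected = 0;
    if (lockword->compare_exchange_strong(expected, 1,
                                          std::memory_order_acquire)) {
      acquired = true;
      break;
    }
    if (naps == max_sleeps) break;  // hypothetical bound, absent in the listing
    struct timespec tm;
    tm.tv_sec = 0;
    tm.tv_nsec = 2000001;           // same interval as in the listing
    nanosleep(&tm, NULL);
  }
  errno = saved_errno;              // leave the caller's errno untouched
  return acquired;
}
```

Seeing `lockword_ == 1` in gdb with only one thread alive means no other thread can reset the word, so the unbounded loop cannot exit on its own.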

GoogleCodeExporter commented 9 years ago
Ok, I've figured it out!  The problem is that, on your machine, this line in
heap-checker.cc allocates memory:
     FLAGS_heap_check = "";  // for users who test for it

Try replacing it with
   FLAGS_heap_check.clear();  // for users who test for it

See if this fixes the problem for you.  If not, try replacing it with this instead:
   if (!FLAGS_heap_check.empty())
     FLAGS_heap_check.clear();

Let me know what works, and I'll put it in the next release.

craig

Original comment by csilv...@gmail.com on 13 May 2009 at 3:32
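
One way to check whether a given standard library allocates on these string operations is to count calls to a replaced global operator new. The sketch below is an assumption-laden illustration, not part of perftools: the counter and both helper functions are hypothetical, the counter is not thread-safe, and the result for the assignment case depends on the library (the old libstdc++ in the backtrace allocates, many current ones do not).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

static std::size_t g_new_calls = 0;  // bumped by the replaced operator new

// Replacement global allocation functions that count and forward to malloc.
void* operator new(std::size_t size) {
  ++g_new_calls;
  if (void* p = std::malloc(size ? size : 1)) return p;
  throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Number of operator new calls made while assigning "" to an empty string.
// The answer is library-dependent, so it is reported rather than assumed.
std::size_t NewCallsForAssignEmpty() {
  std::string flag;
  const std::size_t before = g_new_calls;
  flag = "";
  return g_new_calls - before;
}

// Number of operator new calls made by the guarded form suggested above.
// An empty string skips clear() entirely, so the string is never touched.
std::size_t NewCallsForGuardedClear() {
  std::string flag;
  const std::size_t before = g_new_calls;
  if (!flag.empty()) flag.clear();
  return g_new_calls - before;
}
```

The guarded form is the safest of the three variants because, for an already-empty flag, it performs no mutating call at all and therefore cannot re-enter the allocator.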

GoogleCodeExporter commented 9 years ago
Clear()ing the heap checker flags if they are not empty works.  Thank you.  However,
further tests fail.  Log included.

PASS
PASS: heap-checker_unittest.sh
Testing ./heap-checker_unittest with HEAPCHECK= ... PASS
Testing ./heap-checker_unittest with HEAP_CHECKER_TEST_NO_THREADS=1 ... FAIL
Wrong exit code: expected: '0'; actual: 134
Output did not match '^PASS$'
Output from failed run:
---
WARNING: Perftools heap leak checker is active -- Performance may suffer

Adding pthread-specifics for thread 16384 pid 1746
Adding pthread-specifics for thread 16384 pid 1746
In main(): heap_check=strict
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 191 reachable heap objects of 21219 bytes
No leaks found for check "trivial" (but no 100% guarantee that there aren't 
any):
found 634 reachable heap objects of 39300 bytes
No leaks found for check "simple" (but no 100% guarantee that there aren't any):
found 634 reachable heap objects of 39299 bytes

Pre leaking : 0xf834377b ^ 0xf03a5f7b
[snip...]
Leaking : 0xf8355f7b ^ 0xf03a5f7b
No leaks found for check "death_noleaks" (but no 100% guarantee that there 
aren't
any): found 637 reachable heap objects of 46434 bytes

Pre leaking : 0xf835ef7b ^ 0xf03a5f7b
[snip...]
Leaking : 0xf835ef7b ^ 0xf03a5f7b
Leak check _main_ detected leaks of 3104 bytes in 1 objects
The 1 largest leaks:
Leak of 3104 bytes in 1 objects allocated from:
    @ 0x400fa831 std::__default_alloc_template::_S_chunk_alloc
    @ 0x400fa73d std::__default_alloc_template::_S_refill
    @ 0x400fa2ac std::__default_alloc_template::allocate
    @ 0x401002a8 std::basic_string::_Rep::_S_create
    @ 0x401010c5 std::basic_string::_M_replace_safe
    @ 0x400fd374 std::basic_string::basic_string[in-charge]
    @ 0x4002d09d SuggestPprofCommand
    @ 0x4002d5c7 HeapLeakChecker::DoNoLeaks
    @ 0x80516d8 HeapLeakChecker::BriefNoLeaks
    @ 0x804a9c5 RunSilent
    @ 0x804aa1c VerifyLeaks
    @ 0x804b380 TestLeakButTotalsMatch
    @ 0x804e12e main
    @ 0x4

[snip...]

Check failed: HeapLeakChecker::NoGlobalLeaks()
Must not call heap leak checker manually after  program-exit's automatic check.
---
FAIL: heap-checker-death_unittest.sh
PASS
PASS: getpc_test
Running OpsWhenStopped
Running StartStopEmpty
PROFILE: interrupts/evictions/bytes = 0/0/32
Running StartWhenStarted
PROFILE: interrupts/evictions/bytes = 0/0/32
Running StartStopEmpty2
PROFILE: interrupts/evictions/bytes = 0/0/32
Running CollectOne
PROFILE: interrupts/evictions/bytes = 1/0/60
Running CollectTwoMatching
PROFILE: interrupts/evictions/bytes = 2/0/60
Running CollectTwoFlush
PROFILE: interrupts/evictions/bytes = 2/0/88
Running StartResetRestart
PROFILE: interrupts/evictions/bytes = 0/0/32
PASS
PASS: profiledata_unittest
threads have separate timers
Running RegisterUnregisterCallback
Running MultipleCallbacks
Running Reset
Running RegisterCallbackBeforeThread
Done
PASS: profile_handler_unittest

Craig, apologies for not spending enough time digging into the cause of these issues
myself.

Additionally, the profiler unit-tests fail too.  All of them crash and the backtrace
looks like this:

$ gdb ./.libs/lt-profiler4_unittest core.6150 
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.

[snip...]

Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x4001bf86 in base::VDSOSupport::ElfMemImage::Init(void const*) 
(this=0xbfffdf34,
base=0xffffffff) at src/base/vdso_support.cc:212
212   if (memcmp(base, ELFMAG, SELFMAG)) {
(gdb) bt
#0  0x4001bf86 in base::VDSOSupport::ElfMemImage::Init(void const*) 
(this=0xbfffdf34,
base=0xffffffff) at src/base/vdso_support.cc:212
#1  0x4001bace in ElfMemImage (this=0xbfffdf34, base=0xffffffff) at
src/base/vdso_support.cc:117
#2  0x4001c297 in VDSOSupport (this=0xbfffdf34) at src/base/vdso_support.cc:303
#3  0x4001b6d0 in NextStackFrame<true, true> (old_sp=0xbfffdf8c, uc=0xbfffe210) 
at
stacktrace_x86-inl.h:150
#4  0x4001b441 in GetStackTraceWithContext(void**, int, int, void const*)
(result=0xbfffdfb8, max_depth=63, skip_count=1, uc=0xbfffe210)
    at stacktrace_x86-inl.h:283
#5  0x4001906c in _r_debug () from
/u/giridhar/junk/google-perftools-1.2/.libs/libprofiler.so.0
#6  0x4001a788 in ProfileHandler::SignalHandler(int, siginfo*, void*) (sig=27,
sinfo=0xbfffe190, ucontext=0xbfffe210) at stl_list.h:138
#7  0x4012b6b8 in __pthread_sighandler_rt () from /lib/libpthread.so.0
#8  <signal handler called>
#9  0x401b65aa in vfprintf () from /lib/libc.so.6
#10 0x401d5bbe in vsnprintf () from /lib/libc.so.6
#11 0x401bf133 in snprintf () from /lib/libc.so.6
#12 0x08048bae in test_main_thread () at src/tests/profiler_unittest.cc:79
#13 0x08048c91 in main (argc=4, argv=0xbfffede4) at 
src/tests/profiler_unittest.cc:110
#14 0x4018662d in __libc_start_main () from /lib/libc.so.6
(gdb) f 0
#0  0x4001bf86 in base::VDSOSupport::ElfMemImage::Init(void const*) 
(this=0xbfffdf34,
base=0xffffffff) at src/base/vdso_support.cc:212
212   if (memcmp(base, ELFMAG, SELFMAG)) {
(gdb) p base
$1 = (const void *) 0xffffffff
(gdb) q

Original comment by app...@gmail.com on 14 May 2009 at 8:56

GoogleCodeExporter commented 9 years ago
Just to make sure I understand: you had to change the code to:
    if (!FLAGS_foo.empty()) FLAGS_foo.clear()
because this wasn't enough to solve the problem:
    FLAGS_foo.clear();     // replacing FLAGS_foo = "";

Is that right?  I want to make sure I put in the right patch.

} Leak check _main_ detected leaks of 3104 bytes in 1 objects
}   @ 0x4002d09d SuggestPprofCommand

Hmm, this is strange.  SuggestPprofCommand is pretty clearly, by inspection, unable
to leak any data.  It would be interesting to see what is at 0x4002d09d.  Can you
do something like
   addr2line -e .libs/lt-heap-checker_unittest 0x4002d09d
and see if it gives you a line number?  (I'm not sure of the exact addr2line syntax,
or the exact filename, but it's something like this.)

} Additionally, the profiler unit-tests fail too.

Hmm, looks like VDSO is not reliable on redhat 9.  That should be straightforward --
I'll ask the VDSO expert here.  In the meantime, try the following change to see if
it fixes these tests for you: In the file src/base/vdso_support.h, change the line
   #define HAVE_VDSO_SUPPORT 1
to
   #define HAVE_VDSO_SUPPORT 0

Original comment by csilv...@gmail.com on 14 May 2009 at 1:19
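
The backtrace above shows memcmp() being handed base=0xffffffff, which looks like a "not set yet" marker rather than a real mapping. As a defensive sketch only (this is not the fix discussed in the thread, and `kNoBase` and `LooksLikeElfImage` are hypothetical names), a magic-number check can refuse such values before dereferencing them:

```cpp
#include <cstring>

// Hypothetical sentinel meaning "base address has not been discovered yet".
static const void* const kNoBase = reinterpret_cast<const void*>(~0L);

// Returns true only if `base` is usable and starts with the 4-byte ELF magic
// (0x7f 'E' 'L' 'F'). Null and the sentinel are rejected without being read.
bool LooksLikeElfImage(const void* base) {
  static const char kElfMagic[4] = {0x7f, 'E', 'L', 'F'};
  if (base == nullptr || base == kNoBase) return false;
  return std::memcmp(base, kElfMagic, sizeof(kElfMagic)) == 0;
}
```

A guard like this only hides the symptom; it does not explain why the base was never initialized in the first place.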

GoogleCodeExporter commented 9 years ago
The VDSO/profiler crash problem is due to "GCC too old" (RH9 shipped with "gcc (GCC)
3.2.2 20030222 (Red Hat Linux 3.2.2-5)"), but code in VDSOSupport depends on
__attribute__((constructor)), which is not handled correctly until gcc-3.3.2.

This can be observed by compiling:

/// --- cut ---
#include <stdio.h>

struct Foo {
   static void Init() __attribute__((constructor));
   static int init_called;
};

int Foo::init_called;
void Foo::Init() { init_called++; }

int main()
{
    if (!Foo::init_called) {
        printf("BUGGY GCC\n");
        return 1;
    }
    return 0;
}
/// --- cut ---

$ /usr/local/gcc-3.3/bin/g++ -g foo.cc && ./a.out && echo ok
BUGGY GCC
$ /usr/local/gcc-3.3.2/bin/g++ -g foo.cc && ./a.out && echo ok
ok

The following patch fixes this problem:

--- src/base/vdso_support.cc.orig       2009-03-05 11:30:47.000000000 -0800
+++ src/base/vdso_support.cc    2009-05-14 10:05:19.000000000 -0700
@@ -504,6 +504,12 @@ int GetCPU(void) {
   int ret_code = (*VDSOSupport::getcpu_fn_)(&cpu, NULL, NULL);
   return ret_code == 0 ? cpu : ret_code;
 }
-}

+#if (10000 * __GNUC__ + 100 * __GNUC_MINOR__ + __GNUC_PATCHLEVEL__) < 30302
+// GCC 3.3.1 or below do not have proper support for attribute((constructor))
+struct VDSOInitHelper { VDSOInitHelper() { VDSOSupport::Init(); } };
+static VDSOInitHelper vdso_init_helper;
+#endif
+
+}
 #endif  // HAVE_VDSO_SUPPORT

Original comment by ppluzhni...@gmail.com on 14 May 2009 at 5:14
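
The two ideas in the patch above, the version arithmetic and the static-object fallback, can be tried in isolation. This is a sketch under assumptions: `EncodeGccVersion`, `DemoSupport`, and `DemoInitHelper` are hypothetical names, and it uses C++11 constexpr, which the original code base of course could not.

```cpp
// Encodes a GCC version the same way the patch's preprocessor test does.
constexpr int EncodeGccVersion(int major, int minor, int patch) {
  return 10000 * major + 100 * minor + patch;
}

// RH9's gcc 3.2.2 falls below the threshold, so it would get the fallback.
static_assert(EncodeGccVersion(3, 2, 2) < 30302, "3.2.2 uses the fallback");
// 3.3.2 encodes to exactly 30302, so it is the first version exempted.
static_assert(EncodeGccVersion(3, 3, 2) == 30302, "3.3.2 is exempt");

// Hypothetical subsystem that must be initialized before first use.
struct DemoSupport {
  static int init_calls;
  static bool ready;
  static void Init() {
    ++init_calls;
    ready = true;
  }
};
int DemoSupport::init_calls = 0;
bool DemoSupport::ready = false;

// Fallback in the style of the patch: a file-scope object whose constructor
// calls Init() during static initialization, with no reliance on
// __attribute__((constructor)).
struct DemoInitHelper {
  DemoInitHelper() { DemoSupport::Init(); }
};
static DemoInitHelper demo_init_helper;
```

Because the two counters are constant-initialized, they are set before the helper's constructor runs, so the helper observes a clean state.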

GoogleCodeExporter commented 9 years ago
> Just to make sure I understand: you had to change the code to:
>    if (!FLAGS_foo.empty()) FLAGS_foo.clear()

Yes.

$ addr2line -e ./.libs/lt-heap-checker_unittest
0x400fa831
0x400fa73d
0x400fa2ac
0x401002a8
0x401010c5
0x400fd374
0x4002d09d
0x4002d5c7
0x80516d8
0x804a9c5
0x804aa1c
0x804b380
0x804e12e
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
src/google/heap-checker.h:130
src/tests/heap-checker_unittest.cc:397
src/tests/heap-checker_unittest.cc:408
src/tests/heap-checker_unittest.cc:610
src/tests/heap-checker_unittest.cc:1382

However, when I run the test individually

$ export HEAPCHECK=strict
$ ./heap-checker_unittest "" HEAP_CHECKER_TEST_NO_THREADS=1
WARNING: Perftools heap leak checker is active -- Performance may suffer

Adding pthread-specifics for thread 16384 pid 30516
Creating extra thread 1
A new HeapBusyThread 0
Adding pthread-specifics for thread 16386 pid 30533
Creating extra thread 2
A new HeapBusyThread 1

[snip...]

Adding pthread-specifics for thread 278546 pid 30549
Adding pthread-specifics for thread 16384 pid 30516
In main(): heap_check=strict
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 1180 reachable heap objects of 127410 bytes
No leaks found for check "trivial" (but no 100% guarantee that there aren't 
any):
found 1623 reachable heap objects of 145588 bytes
No leaks found for check "simple" (but no 100% guarantee that there aren't any):
found 1624 reachable heap objects of 147267 bytes

Pre leaking : 0xf834077b ^ 0xf03a5f7b

[snip...]

Leaking : 0xf82b0f7b ^ 0xf03a5f7b
No leaks found for check "death_noleaks" (but no 100% guarantee that there 
aren't
any): found 1630 reachable heap objects of 159698 bytes

Pre leaking : 0xf8284f7b ^ 0xf03a5f7b

[snip...]
Leaking : 0xf8284f7b ^ 0xf03a5f7b
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 1628 reachable heap objects of 161816 bytes
No leaks found for check "trivial_p" (but no 100% guarantee that there aren't 
any):
found 1630 reachable heap objects of 161830 bytes
No leaks found for check "simple_p" (but no 100% guarantee that there aren't 
any):
found 1631 reachable heap objects of 164525 bytes
No leaks found for check "disabling" (but no 100% guarantee that there aren't 
any):
found 1652 reachable heap objects of 165048 bytes
No leaks found for check "stl" (but no 100% guarantee that there aren't any): 
found
1659 reachable heap objects of 189258 bytes

Leaking : 0xf837e77f ^ 0xf03a5f7b
Expected leaks not found: Some liveness flood must be too optimistic
No leaks found for check "direct_stl-std::allocator<char>()" (but no 100% 
guarantee
that there aren't any): found 1660 reachable heap objects of 197184 bytes
No leaks found for check "direct_stl-std::allocator<int>()" (but no 100% 
guarantee
that there aren't any): found 1660 reachable heap objects of 197183 bytes

[snip...]

No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 1658 reachable heap objects of 197146 bytes
No leaks found for check "all" (but no 100% guarantee that there aren't any): 
found
1658 reachable heap objects of 197146 bytes
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 1658 reachable heap objects of 197146 bytes
PASS
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 1655 reachable heap objects of 197128 bytes.

Setting

   #define HAVE_VDSO_SUPPORT 0

Doesn't work.

Original comment by app...@gmail.com on 14 May 2009 at 5:33

GoogleCodeExporter commented 9 years ago
OK, I #undef-ed HAVE_VDSO_SUPPORT and the profiler tests go through.  ppluzhnikov's
patch works too.

Original comment by app...@gmail.com on 14 May 2009 at 6:00
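
One plausible reading of why defining the macro as 0 had no effect while #undef did, offered as an assumption since the thread does not show the guard itself: if the source tests the macro with #ifdef, any definition counts, including 0. The sketch below uses a hypothetical `DEMO_FEATURE` macro to show the difference between the two preprocessor tests.

```cpp
// DEMO_FEATURE is a hypothetical macro standing in for a feature switch.
#define DEMO_FEATURE 0

#ifdef DEMO_FEATURE  // taken: the macro exists, its value is ignored
constexpr bool kSeenByIfdef = true;
#else
constexpr bool kSeenByIfdef = false;
#endif

#if DEMO_FEATURE     // not taken: the value 0 is what gets tested
constexpr bool kSeenByIf = true;
#else
constexpr bool kSeenByIf = false;
#endif

static_assert(kSeenByIfdef && !kSeenByIf,
              "#ifdef and #if disagree when the macro is defined as 0");
```

Under that assumption, removing the definition entirely (or using #undef) is the only way to switch off code guarded by #ifdef.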

GoogleCodeExporter commented 9 years ago
Attached patch fixes the memory leak issue (not a correct fix, but I am using it as a
workaround).

--- orig/google-perftools-1.2/src/heap-checker.cc       2009-04-18 02:47:48.000000000 +0530
+++ google-perftools-1.2/src/heap-checker.cc    2009-05-14 23:41:46.000000000 +0530
@@ -1556,7 +1556,7 @@
 // for programs run on borg/mrtest/blaze.
 static void SuggestPprofCommand(const char* pprof_file_arg) {
   // Copy argument since we may mutate it later
-  string pprof_file = pprof_file_arg;
+  // string pprof_file = pprof_file_arg;

   // Extra help information to print for the user when the test is
   // being run in a way where the straightforward pprof command will
@@ -1588,7 +1588,7 @@
           fetch_cmd.c_str(),
           flags_heap_profile_pprof->c_str(),
           invocation_path().c_str(),
-          pprof_file.c_str(),
+          "",
           extra_help.c_str()
           );
 }

But one of the heap checker tests fails.

[snip...]
Testing ./heap-checker_unittest with HEAPCHECK= HEAP_CHECKER_TEST_TEST_LEAK=1
HEAP_CHECKER_TEST_NO_THREADS=1 PERFTOOLS_VERBOSE=-2 ... PASS
Testing ./heap-checker_unittest with HEAP_CHECKER_TEST_TEST_LEAK=1
HEAP_CHECKER_TEST_NO_THREADS=1 ... FAIL
Wrong exit code: expected: '1'; actual: 11
Output did not match 'Exiting .* because of .* leaks$'
Output from failed run:
---
WARNING: Perftools heap leak checker is active -- Performance may suffer

Adding pthread-specifics for thread 16384 pid 15514
Adding pthread-specifics for thread 16384 pid 15514
In main(): heap_check=strict
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 191 reachable heap objects of 21223 bytes

Leaking : 0xf834694b ^ 0xf03a5f7b
Leak check _main_ detected leaks of 40 bytes in 1 objects
The 1 largest leaks:
Leak of 40 bytes in 1 objects allocated from:
        @ 0x804a3a1 operator new[]
        @ 0x804a810 DoAllocHidden
        @ 0x8051b1b Callback2::Run
        @ 0x804a70d DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a665 DoRunHidden
        @ 0x804a7ec RunHidden
        @ 0x804a8c2 AllocHidden
        @ 0x804e6d5 main
        @ 

---
FAIL: heap-checker-death_unittest.sh
PASS
PASS: getpc_test
Running OpsWhenStopped
Running StartStopEmpty

[snip...]

Original comment by app...@gmail.com on 14 May 2009 at 6:24

GoogleCodeExporter commented 9 years ago
I think your workaround may actually be usable, since we never ended up modifying
pprof_file.  My one suggestion, instead of replacing pprof_file.c_str() by "" (in the
printf), replace it by pprof_file_arg.  Does the test still pass?

} Wrong exit code: expected: '1'; actual: 11

I think this means the test seg-faulted (the memory-leak is an expected error for
this particular test).  Can you try running the test manually?  I think it will be
something like
   env HEAP_CHECKER_TEST_TEST_LEAK=1 HEAP_CHECKER_TEST_NO_THREADS=1 HEAPCHECK=strict ./heap-checker_unittest

Original comment by csilv...@gmail.com on 14 May 2009 at 6:51

GoogleCodeExporter commented 9 years ago
> replace it by pprof_file_arg.  Does the test still pass?

This is in fact the first thing that I tried and it does pass, but that comment above
the assignment discouraged me from posting it :).

> I think this means the test seg-faulted

Yes, I did not notice the segfault.  I'll report back with more information.

Original comment by app...@gmail.com on 14 May 2009 at 7:04

GoogleCodeExporter commented 9 years ago
I have more segfaults now.  All at the same location:

[snip...]

Exiting with error code (instead of crashing) because of whole-program memory leaks
FAIL: tcmalloc_both_unittest

[snip...]

./sampling_test.sh: line 80: 11257 Segmentation fault      (core dumped) "$SAMPLING_TEST" "$OUTDIR/out"
Testing heap output...Adjusting heap profiles for 1-in-524288 sampling rate
Heap version 2
OK
Testing growth output...OK
PASS
PASS: sampling_test.sh
./heap-profiler_unittest.sh: line 126: 11430 Aborted                 $HEAP_PROFILER 1 >$TEST_TMPDIR/output 2>&1
WARNING: Perftools heap leak checker is active -- Performance may suffer
No leaks found for check "_main_" (but no 100% guarantee that there aren't any):
found 15 reachable heap objects of 3387 bytes
Profile not found: /tmp/heap_profile_info/test.1329.heap
FAIL: heap-profiler_unittest.sh
Testing ./heap-checker_unittest with HEAPCHECK= ... ./heap-checker_unittest.sh: line 82: 11477 Segmentation fault      (core dumped) $HEAP_CHECKER >$TMPDIR/output 2>&1
FAILED
Output from the failed run:

[snip...]

Testing ./heap-checker_unittest with HEAPCHECK= ... PASS
Testing ./heap-checker_unittest with HEAP_CHECKER_TEST_NO_THREADS=1 ... FAIL
Wrong exit code: expected: '0'; actual: 11
Output did not match '^PASS$'

[snip...]

$ gdb ./.libs/lt-tcmalloc_large_unittest core.11101
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...

warning: core file may not match specified executable file.
Core was generated by `/u/giridhar/junk/google-perftools-1.2/.libs/lt-tcmalloc_large_unittest'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0...done.
Loaded symbols for /u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0
Reading symbols from /usr/lib/libstdc++.so.5...done.
Loaded symbols for /usr/lib/libstdc++.so.5
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libpthread.so.0...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  SuggestPprofCommand (pprof_file_arg=0x403bc650 "/tmp/lt-tcmalloc_large_unittest.11101._main_-end.heap") at char_traits.h:119
119           { __c1 = __c2; }
(gdb) where
#0  SuggestPprofCommand (pprof_file_arg=0x403bc650 "/tmp/lt-tcmalloc_large_unittest.11101._main_-end.heap") at char_traits.h:119
#1  0x4002d54f in HeapLeakChecker::DoNoLeaks(HeapLeakChecker::ShouldSymbolize) (this=0x80cf020, should_symbolize=SYMBOLIZE) at src/heap-checker.cc:1733
#2  0x4002eab7 in HeapLeakChecker::NoGlobalLeaks() () at src/heap-checker.cc:2028
#3  0x4002e982 in HeapLeakChecker::DoMainHeapCheck() () at src/heap-checker.cc:1999
#4  0x4002f3fb in HeapLeakChecker_AfterDestructors() () at src/heap-checker.cc:2275
#5  0x40032325 in __tcf_0 () at src/heap-checker-bcad.cc:88
#6  0x401d8ec5 in __cxa_finalize () from /lib/libc.so.6
#7  0x40025e45 in __do_global_dtors_aux () from /u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0
#8  0x400423ba in _fini () from /u/giridhar/junk/google-perftools-1.2/.libs/libtcmalloc.so.0
#9  0x4000af87 in _dl_fini () from /lib/ld-linux.so.2
#10 0x401d8c5e in exit () from /lib/libc.so.6
#11 0x401c5635 in __libc_start_main () from /lib/libc.so.6
(gdb) f 0
#0  SuggestPprofCommand (pprof_file_arg=0x403bc650 "/tmp/lt-tcmalloc_large_unittest.11101._main_-end.heap") at char_traits.h:119
119           { __c1 = __c2; }
(gdb) l
114           typedef streamoff         off_type;
115           typedef mbstate_t         state_type;
116
117           static void 
118           assign(char_type& __c1, const char_type& __c2)
119           { __c1 = __c2; }
120
121           static bool 
122           eq(const char_type& __c1, const char_type& __c2)
123           { return __c1 == __c2; }
(gdb) q

I am running with the following patch now.

$ diff -Nur orig/google-perftools-1.2 google-perftools-1.2
diff -Nur orig/google-perftools-1.2/src/base/vdso_support.cc google-perftools-1.2/src/base/vdso_support.cc
--- orig/google-perftools-1.2/src/base/vdso_support.cc  2009-03-06 01:00:47.000000000 +0530
+++ google-perftools-1.2/src/base/vdso_support.cc       2009-05-15 01:04:55.000000000 +0530
@@ -504,6 +504,12 @@
   int ret_code = (*VDSOSupport::getcpu_fn_)(&cpu, NULL, NULL);
   return ret_code == 0 ? cpu : ret_code;
 }
+
+#if (10000 * __GNUC__ + 100 * __GNUC_MINOR__ + __GNUC_PATCHLEVEL__) < 30302
+// GCC 3.3.1 or below do not have proper support for attribute((constructor))
+struct VDSOInitHelper { VDSOInitHelper() { VDSOSupport::Init(); } };
+static VDSOInitHelper vdso_init_helper;
+#endif
 }

 #endif  // HAVE_VDSO_SUPPORT
diff -Nur orig/google-perftools-1.2/src/base/vdso_support.h google-perftools-1.2/src/base/vdso_support.h
--- orig/google-perftools-1.2/src/base/vdso_support.h   2009-03-05 21:41:18.000000000 +0530
+++ google-perftools-1.2/src/base/vdso_support.h        2009-05-15 01:04:12.000000000 +0530
@@ -61,7 +61,7 @@
 // symbol extensions in glibc, but for right now we need them.
 #if defined(__ELF__) && defined(HAVE_ELF32_VERSYM)

-#define HAVE_VDSO_SUPPORT 1
+#define HAVE_VDSO_SUPPORT

 #include <stdlib.h>  // for NULL
 #include <link.h>  // for ElfW
diff -Nur orig/google-perftools-1.2/src/heap-checker.cc google-perftools-1.2/src/heap-checker.cc
--- orig/google-perftools-1.2/src/heap-checker.cc       2009-04-18 02:47:48.000000000 +0530
+++ google-perftools-1.2/src/heap-checker.cc    2009-05-15 01:09:06.000000000 +0530
@@ -1555,9 +1555,6 @@
 // about the reported leaks.  We have to suggest extra commands
 // for programs run on borg/mrtest/blaze.
 static void SuggestPprofCommand(const char* pprof_file_arg) {
-  // Copy argument since we may mutate it later
-  string pprof_file = pprof_file_arg;
-
   // Extra help information to print for the user when the test is
   // being run in a way where the straightforward pprof command will
   // not suffice.
@@ -1588,7 +1585,7 @@
           fetch_cmd.c_str(),
           flags_heap_profile_pprof->c_str(),
           invocation_path().c_str(),
-          pprof_file.c_str(),
+          pprof_file_arg,
           extra_help.c_str()
           );
 }
@@ -2150,7 +2147,8 @@
 // static
 void HeapLeakChecker::TurnItselfOffLocked() {
   RAW_DCHECK(heap_checker_lock.IsHeld(), "");
-  FLAGS_heap_check = "";  // for users who test for it
+  if (!FLAGS_heap_check.empty())
+    FLAGS_heap_check.clear();  // for users who test for it
   if (constructor_heap_profiling) {
     RAW_CHECK(heap_checker_on, "");
     RAW_VLOG(heap_checker_info_level, "Turning perftools heap leak checking off");

Original comment by app...@gmail.com on 14 May 2009 at 7:58
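
For readers following the vdso_support.cc hunk above, here is a minimal
standalone sketch of the same idea: encode the compiler version as a single
integer and, on compilers too old for a reliable attribute((constructor)),
fall back to a static object whose constructor does the initialization.
InitOnce, g_init_calls, InitHelper, RunInitOnce and EncodeVersion are
hypothetical names used only for illustration; this is not the gperftools
code, and InitOnce merely stands in for VDSOSupport::Init().

```cpp
// Hypothetical stand-in for VDSOSupport::Init(); it only counts calls.
static int g_init_calls = 0;
static void InitOnce() { ++g_init_calls; }

// Encode major.minor.patch the same way the patch's #if expression does.
constexpr int EncodeVersion(int major, int minor, int patch) {
  return 10000 * major + 100 * minor + patch;
}

// GCC 3.3.1 encodes to 30301, so "< 30302" selects 3.3.1 and older.
static_assert(EncodeVersion(3, 3, 1) < 30302, "3.3.1 takes the fallback");
static_assert(EncodeVersion(3, 3, 2) >= 30302, "3.3.2 does not");

#if (10000 * __GNUC__ + 100 * __GNUC_MINOR__ + __GNUC_PATCHLEVEL__) < 30302
// Fallback: a file-scope object whose constructor runs during static
// initialization. Undefined macros evaluate to 0 inside #if, so compilers
// that do not define __GNUC__ also take this branch.
struct InitHelper { InitHelper() { InitOnce(); } };
static InitHelper init_helper;
#else
// Newer GCC-compatible compilers: run the function before main().
static void RunInitOnce() __attribute__((constructor));
static void RunInitOnce() { InitOnce(); }
#endif
```

Either branch should leave g_init_calls at 1 by the time main() starts. The
static-object fallback is ordinary C++ and so works on any compiler, at the
cost of less control over ordering relative to other translation units.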

GoogleCodeExporter commented 9 years ago
Hmm, these crashes in SuggestPprofCommand sure smell like they're related to the
other problem you described earlier, which reported a leak from
SuggestPprofCommand.  I wonder if the underlying problem is that
SuggestPprofCommand is just running in a context where no memory operators are
allowed (at least on RH9).  We are running this in a global destructor.

Try removing all the rest of the string assignments in SuggestPprofCommand, and
just doing
  RAW_LOG(WARNING, ..., "", "", "", "", "");
See if things work properly then.

Original comment by csilv...@gmail.com on 15 May 2009 at 3:43
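
If the hypothesis above holds (it is unconfirmed in this thread), one way to
keep a hint message free of heap allocation is to format it into a
caller-provided buffer instead of building std::string values. The sketch
below is only an illustration under that assumption: FormatPprofHint and its
message text are hypothetical and are not taken from heap-checker.cc.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes a "pprof <binary> "<profile>"" hint into buf.
// No std::string is constructed, so the helper itself does not call
// operator new. Returns the length the full message needs (snprintf
// semantics), which lets the caller detect truncation.
int FormatPprofHint(char* buf, std::size_t size,
                    const char* binary, const char* profile) {
  return std::snprintf(buf, size, "pprof %s \"%s\"", binary, profile);
}
```

A caller would typically pass a stack array, for example
`char msg[512]; FormatPprofHint(msg, sizeof msg, path, pprof_file_arg);`,
and treat a return value that is greater than or equal to the buffer size as
a truncated (but still NUL-terminated) message.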