hzshuai / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License

Unit tests failing for Ubuntu 11.10 #437

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download current development snapshot on Ubuntu 11.10, build, and run unit 
tests (./configure && make && make check)
2. Note that 3 unit tests fail.

What is the expected output? What do you see instead?
We expect all unit tests to pass. Instead we see a failure with 
heap-checker_unittest.sh producing the following output:

david@hatch:~/gperftools$ ./heap-checker_unittest.sh 
Testing ./heap-checker_unittest with HEAPCHECK= ... OK
Testing ./heap-checker_unittest with HEAPCHECK=local ... OK
Testing ./heap-checker_unittest with HEAPCHECK=normal ... FAILED
Output from the failed run:
----
WARNING: Perftools heap leak checker is active -- Performance may suffer

Adding pthread-specifics for thread 3078514400 pid 3592
Creating extra thread 1
Creating extra thread 2
Creating extra thread 3
Creating extra thread 4
Creating extra thread 5
Creating extra thread 6
Creating extra thread 7
Creating extra thread 8
Creating extra thread 9
Creating extra thread 10
Creating extra thread 11
Creating extra thread 12
Creating extra thread 13
Creating extra thread 14
Creating extra thread 15
Creating extra thread 16
Creating extra thread 17
A new HeapBusyThread 14
Adding pthread-specifics for thread 2960153456 pid 3592
A new HeapBusyThread 16
Adding pthread-specifics for thread 2943368048 pid 3592
A new HeapBusyThread 6
Adding pthread-specifics for thread 3027295088 pid 3592
A new HeapBusyThread 15
Adding pthread-specifics for thread 2951760752 pid 3592
A new HeapBusyThread 8
Adding pthread-specifics for thread 3010509680 pid 3592
A new HeapBusyThread 10
Adding pthread-specifics for thread 2993724272 pid 3592
A new HeapBusyThread 11
Adding pthread-specifics for thread 2985331568 pid 3592
A new HeapBusyThread 12
Adding pthread-specifics for thread 2976938864 pid 3592
A new HeapBusyThread 7
Adding pthread-specifics for thread 3018902384 pid 3592
A new HeapBusyThread 9
Adding pthread-specifics for thread 3002116976 pid 3592
A new HeapBusyThread 13
Adding pthread-specifics for thread 2968546160 pid 3592
A new HeapBusyThread 5
Adding pthread-specifics for thread 3035687792 pid 3592
A new HeapBusyThread 2
Adding pthread-specifics for thread 3060865904 pid 3592
A new HeapBusyThread 1
Adding pthread-specifics for thread 3069258608 pid 3592
A new HeapBusyThread 0
Adding pthread-specifics for thread 3077651312 pid 3592
A new HeapBusyThread 3
Adding pthread-specifics for thread 3052473200 pid 3592
A new HeapBusyThread 4
Adding pthread-specifics for thread 3044080496 pid 3592
Adding pthread-specifics for thread 3078514400 pid 3592
In main(): heap_check=normal
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "_main_" (but no 100% guarantee that there aren't 
any): found 1332 reachable heap objects of 127383 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "trivial" (but no 100% guarantee that there aren't 
any): found 2039 reachable heap objects of 145055 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "simple" (but no 100% guarantee that there aren't 
any): found 2079 reachable heap objects of 146210 bytes

Pre leaking : 0xf8434f7b ^ 0xf03a5f7b

Pre leaking : 0xf843757b ^ 0xf03a5f7b

Leaking : 0xf8465f7b ^ 0xf03a5f7b

Leaking : 0xf8461f7b ^ 0xf03a5f7b

Leaking : 0xf841ff7b ^ 0xf03a5f7b

Leaking : 0xf846df7b ^ 0xf03a5f7b

Leaking : 0xf84d1aab ^ 0xf03a5f7b

Leaking : 0xf84d199b ^ 0xf03a5f7b

Pre leaking : 0xf8434f7b ^ 0xf03a5f7b

Leaking : 0xf841ff7b ^ 0xf03a5f7b
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "death_noleaks" (but no 100% guarantee that there 
aren't any): found 2291 reachable heap objects of 150481 bytes

Pre leaking : 0xf84d9b6b ^ 0xf03a5f7b

Pre leaking : 0xf84d9c3b ^ 0xf03a5f7b

Leaking : 0xf841ff7b ^ 0xf03a5f7b

Pre leaking : 0xf841ff7b ^ 0xf03a5f7b

Leaking : 0xf84d9c3b ^ 0xf03a5f7b

Leaking : 0xf84d9b6b ^ 0xf03a5f7b
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "_main_" (but no 100% guarantee that there aren't 
any): found 2417 reachable heap objects of 152643 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "trivial_p" (but no 100% guarantee that there aren't 
any): found 2459 reachable heap objects of 153457 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "simple_p" (but no 100% guarantee that there aren't 
any): found 2423 reachable heap objects of 152736 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "disabling" (but no 100% guarantee that there aren't 
any): found 2409 reachable heap objects of 152367 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "stl" (but no 100% guarantee that there aren't any): 
found 2457 reachable heap objects of 153677 bytes

Leaking : 0xf841ac5f ^ 0xf03a5f7b
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-std::allocator<char>()" (but no 100% 
guarantee that there aren't any): found 2374 reachable heap objects of 151691 
bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-std::allocator<int>()" (but no 100% 
guarantee that there aren't any): found 2409 reachable heap objects of 152746 
bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-std::string().get_allocator()" (but no 
100% guarantee that there aren't any): found 2408 reachable heap objects of 
152378 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-string().get_allocator()" (but no 100% 
guarantee that there aren't any): found 2441 reachable heap objects of 153033 
bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-vector<int>().get_allocator()" (but no 
100% guarantee that there aren't any): found 2413 reachable heap objects of 
152478 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-vector<double>().get_allocator()" (but no 
100% guarantee that there aren't any): found 2382 reachable heap objects of 
151861 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-vector<vector<int> >().get_allocator()" 
(but no 100% guarantee that there aren't any): found 2422 reachable heap 
objects of 152667 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-vector<string>().get_allocator()" (but no 
100% guarantee that there aren't any): found 2437 reachable heap objects of 
152961 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-(map<string, string>().get_allocator())" 
(but no 100% guarantee that there aren't any): found 2470 reachable heap 
objects of 153628 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-(map<string, int>().get_allocator())" (but 
no 100% guarantee that there aren't any): found 2501 reachable heap objects of 
154245 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "direct_stl-set<char>().get_allocator()" (but no 100% 
guarantee that there aren't any): found 2540 reachable heap objects of 155016 
bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "_main_" (but no 100% guarantee that there aren't 
any): found 2478 reachable heap objects of 154097 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "all" (but no 100% guarantee that there aren't any): 
found 2514 reachable heap objects of 154493 bytes
Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.
No leaks found for check "_main_" (but no 100% guarantee that there aren't 
any): found 2513 reachable heap objects of 154473 bytes
PASS
Check failed: !do_main_heap_check: should have done it
Aborted
----

Please use labels and text to provide additional information.

At first glance, heap-checker_unittest.sh is the main failure. The root cause 
looks like a change in ptrace behaviour in Ubuntu 11.10, which now requires 
additional permission settings, probably introduced in response to a ptrace 
security vulnerability. This needs more digging. As a first stab, I made the 
following modification to src/base/linuxthreads.cc just to see what is going 
on:

david@hatch:~/gperftools$ diff -u ./src/base/linuxthreads.cc 
./src/base/linuxthreads.cc.bak 
--- ./src/base/linuxthreads.cc  2012-06-08 22:16:36.944608135 -0400
+++ ./src/base/linuxthreads.cc.bak  2012-06-08 22:15:58.964419792 -0400
@@ -39,12 +39,14 @@
 #endif

 #include <sched.h>
+#include <stdio.h>
 #include <signal.h>
 #include <stdlib.h>
 #include <string.h>
 #include <fcntl.h>
 #include <sys/socket.h>
 #include <sys/wait.h>
+#include <sys/prctl.h>

 #include "base/linux_syscall_support.h"
 #include "base/thread_lister.h"
@@ -254,6 +256,8 @@
   struct kernel_stat marker_sb, proc_sb;
   stack_t            altstack;

+  printf( "lister 1\n" );
+
   /* Create "marker" that we can use to detect threads sharing the same
    * address space and the same file handles. By setting the FD_CLOEXEC flag
    * we minimize the risk of misidentifying child processes as threads;
@@ -274,6 +278,8 @@
     sys__exit(1);
   }

+  printf( "lister 2\n" );
+
   /* Compute search paths for finding thread directories in /proc            */
   local_itoa(strrchr(strcpy(proc_self_task, "/proc/"), '\000'), ppid);
   strcpy(marker_name, proc_self_task);
@@ -314,9 +320,12 @@
     sa.sa_flags      = SA_ONSTACK|SA_SIGINFO|SA_RESETHAND;
     sys_sigaction(sync_signals[sig], &sa, (struct kernel_sigaction *)NULL);
   }
+
+  printf( "lister 3 - reading process directories in /proc/...\n" );

   /* Read process directories in /proc/...                                   */
   for (;;) {
+    printf( "outer loop\n" );
     /* Some kernels know about threads, and hide them in "/proc"
      * (although they are still there, if you know the process
      * id). Threads are moved into a separate "task" directory. We
@@ -324,8 +333,10 @@
      * convention if necessary.
      */
     if ((sig_proc = proc = c_open(*proc_path, O_RDONLY|O_DIRECTORY, 0)) < 0) {
-      if (*++proc_path != NULL)
+      if (*++proc_path != NULL) {
+        printf( "advancing proc_path\n" );
         continue;
+      }
       goto failure;
     }
     if (sys_fstat(proc, &proc_sb) < 0)
@@ -355,6 +366,9 @@
         char buf[4096];
         ssize_t nbytes = sys_getdents(proc, (struct kernel_dirent *)buf,
                                       sizeof(buf));
+
+        printf( "inner loop\n" );
+
         if (nbytes < 0)
           goto failure;
         else if (nbytes == 0) {
@@ -384,10 +398,14 @@
             /* If the directory is not numeric, it cannot be a
              * process/thread
              */
-            if (*ptr < '0' || *ptr > '9')
+            if (*ptr < '0' || *ptr > '9') {
+              printf( "found non numeric directory ... continuing\n" );
               continue;
+            }
             pid = local_atoi(ptr);

+           printf( "pid=%d clone_pid=%d\n", pid, clone_pid );
+
             /* Attach (and suspend) all threads                              */
             if (pid && pid != clone_pid) {
               struct kernel_stat tmp_sb;
@@ -423,6 +441,7 @@
                 sig_num_threads     = num_threads;
                 if (sys_ptrace(PTRACE_ATTACH, pid, (void *)0,
                                (void *)0) < 0) {
+                  printf( "failed attaching to thread %d errno=%d\n", pid, errno );
                   /* If operation failed, ignore thread. Maybe it
                    * just died?  There might also be a race
                    * condition with a concurrent core dumper or
@@ -436,6 +455,7 @@
                 }
                 while (sys_waitpid(pid, (int *)0, __WALL) < 0) {
                   if (errno != EINTR) {
+                    printf( "failed waiting for thread %d errno=%d\n", pid, errno );
                     sys_ptrace_detach(pid);
                     num_threads--;
                     sig_num_threads = num_threads;
@@ -443,6 +463,7 @@
                   }
                 }

+                printf( "calling sys_ptrace PTRACE_PEEKDATA\n" );
                 if (sys_ptrace(PTRACE_PEEKDATA, pid, &i, &j) || i++ != j ||
                     sys_ptrace(PTRACE_PEEKDATA, pid, &i, &j) || i   != j) {
                   /* Address spaces are distinct, even though both
@@ -454,6 +475,7 @@
                   sig_num_threads = num_threads;
                 } else {
                   found_parent |= pid == ppid;
+                  printf( "set found_parent=%d\n", found_parent );
                   added_entries++;
                 }
               }
@@ -601,6 +623,7 @@
     clone_pid = local_clone((int (*)(void *))ListerThread, &args);
     clone_errno = errno;

+   prctl(PR_SET_PTRACER, clone_pid, 0, 0, 0);
     sys_sigprocmask(SIG_SETMASK, &sig_old, &sig_old);

     if (clone_pid >= 0) {

We need to continue this line of investigation and find a way to set up the 
correct permissions so that ptrace can do its job.

Original issue reported on code.google.com by chapp...@gmail.com on 9 Jun 2012 at 2:28

GoogleCodeExporter commented 9 years ago
I think I see the same issue on Ubuntu 12.10 (with a patched glibc to resolve a 
deadlock issue).

I was able to capture a backtrace using gdb:

#0  0x00007ffff70a5445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x00007ffff70a8bab in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2  0x00007ffff7b86d98 in HeapLeakChecker_AfterDestructors () at 
src/heap-checker.cc:2315
        l = {lock_ = 0x7ffff7db2d60}
#3  0x00007ffff70aad3d in __cxa_finalize () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#4  0x00007ffff7b78883 in __do_global_dtors_aux () from 
.libs/libtcmalloc_debug.so.4
No symbol table info available.
#5  0x00007fffffffdba0 in ?? ()
No symbol table info available.
#6  0x00007ffff7de990e in ?? () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Original comment by ringo.p...@gmail.com on 24 Jul 2012 at 7:04

GoogleCodeExporter commented 9 years ago
ptrace default behaviour did change in 10.10 onwards:

https://wiki.ubuntu.com/Security/Features#ptrace

Setting the ptrace scope to be more permissive removed the:

Thread finding failed with -1 errno=1
Could not find thread stacks. Will likely report false leak positives.

error messages but the unittest still aborts with:

Check failed: !do_main_heap_check: should have done it
Aborted (core dumped)

Original comment by ringo.p...@gmail.com on 24 Jul 2012 at 11:42

GoogleCodeExporter commented 9 years ago
Do you have a patch you can share for fixing the ptrace issue? As for the other 
problem, I ran into this a while back while doing some FreeBSD porting work:

    http://code.google.com/p/gperftools/issues/detail?id=375

Applying the same workaround as in issue 375 gets things working again. It 
seems to be an issue with how things are ordered/executed by __at_exit.

Original comment by chapp...@gmail.com on 24 Jul 2012 at 3:58
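
For readers unfamiliar with the ordering problem mentioned above: the C standard runs handlers registered with atexit() in reverse order of registration, so a final check that must run after other cleanup has to be registered before that cleanup. The sketch below only illustrates that rule; it is not the gperftools workaround from issue 375, and the names final_check, destroy_object and atexit_order are hypothetical. It forks a child so the handlers can run without ending the calling process, and reports the order through a pipe.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe the exit handlers report to (set in the child). */
static int report_fd = -1;

/* Hypothetical stand-in for a check that must run after all other cleanup. */
static void final_check(void) {
  if (write(report_fd, "C", 1) != 1) { /* nothing useful to do at exit */ }
}

/* Hypothetical stand-in for an ordinary destructor-style cleanup. */
static void destroy_object(void) {
  if (write(report_fd, "D", 1) != 1) { /* nothing useful to do at exit */ }
}

/* Registers both handlers in a forked child, lets the child exit, and
 * copies the order in which the handlers ran into buf as a string.
 * Returns the number of handlers observed, or -1 on setup failure. */
int atexit_order(char *buf, size_t len) {
  int fds[2];
  pid_t child;
  size_t used = 0;

  if (len == 0 || pipe(fds) < 0)
    return -1;
  child = fork();
  if (child < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  if (child == 0) {
    close(fds[0]);
    report_fd = fds[1];
    atexit(final_check);     /* registered first, so it runs last  */
    atexit(destroy_object);  /* registered last, so it runs first  */
    exit(0);                 /* runs the handlers in reverse order */
  }

  close(fds[1]);
  while (used + 1 < len) {
    ssize_t n = read(fds[0], buf + used, len - 1 - used);
    if (n < 0 && errno == EINTR)
      continue;
    if (n <= 0)
      break;                 /* EOF: the child has exited */
    used += (size_t)n;
  }
  buf[used] = '\0';
  close(fds[0]);
  while (waitpid(child, NULL, 0) < 0 && errno == EINTR) {
  }
  return (int)used;
}
```

Calling atexit_order(buf, sizeof buf) with an 8-byte buffer leaves "DC" in buf: the cleanup registered last runs first, and the check registered first runs last.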

GoogleCodeExporter commented 9 years ago
Not yet - I just used yama to set ptrace to be more permissive at the system 
level which is obviously not a long term fix.

I'll try the workaround in issue 375 as well - thanks for the pointer.

Original comment by ringo.p...@gmail.com on 24 Jul 2012 at 4:47

GoogleCodeExporter commented 9 years ago
Interestingly, the ptrace permissions don't affect the results of the unittest, 
but I suspect that they will create problems.

Original comment by ringo.p...@gmail.com on 24 Jul 2012 at 5:11

GoogleCodeExporter commented 9 years ago
I did a quick hacky test a while back to see if the programmatic approach works 
for setting permissions, and it does:

prctl(PR_SET_PTRACER, debugger_pid, 0, 0, 0)

I haven't had time to work out the details, though, since it requires some 
form of interprocess synchronization. We only get the pid after spawning the 
child process that is going to ptrace us, so we need a way to tell the child to 
wait until we have given it sufficient permissions. Feel free to press forward 
with a patch if you have cycles :)

Original comment by chapp...@gmail.com on 24 Jul 2012 at 5:20
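
The ordering problem described above can be sketched in a few lines. This is only an illustration under assumptions, not the patch attached later in this thread: it uses fork() and a pipe where linuxthreads.cc uses a clone()d lister thread, the helper name attach_after_handshake is hypothetical, and the fallback value for PR_SET_PTRACER is only needed with older headers. The parent learns the child's pid from fork(), grants it tracer rights with prctl(), and only then releases the child, which has been blocked on the pipe.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef PR_SET_PTRACER
#define PR_SET_PTRACER 0x59616d61 /* "Yama"; missing from older headers */
#endif

/* Returns 0 if the child attached to the parent, 1 if the handshake worked
 * but the kernel refused the attach, 2 if the handshake itself failed, and
 * -1 on setup errors. */
int attach_after_handshake(void) {
  int go[2];
  int status;
  char byte = 'g';
  ssize_t n;
  pid_t child;

  if (pipe(go) < 0)
    return -1;
  child = fork();
  if (child < 0) {
    close(go[0]);
    close(go[1]);
    return -1;
  }

  if (child == 0) {
    pid_t parent = getppid();
    close(go[1]);
    /* Block until the parent says permissions are in place. */
    do {
      n = read(go[0], &byte, 1);
    } while (n < 0 && errno == EINTR);
    if (n != 1)
      _exit(2);
    if (ptrace(PTRACE_ATTACH, parent, (void *)0, (void *)0) < 0)
      _exit(1);
    /* Wait for the parent to stop, then let it go again. */
    while (waitpid(parent, (int *)0, __WALL) < 0) {
      if (errno != EINTR)
        break;
    }
    ptrace(PTRACE_DETACH, parent, (void *)0, (void *)0);
    _exit(0);
  }

  close(go[0]);
  /* We only know the child's pid now, so this is the earliest point at
   * which it can be named as our tracer. The call fails harmlessly on
   * kernels without Yama, so the result is ignored in this sketch. */
  prctl(PR_SET_PTRACER, (unsigned long)child, 0, 0, 0);
  do {
    n = write(go[1], &byte, 1); /* release the child */
  } while (n < 0 && errno == EINTR);
  close(go[1]);

  while (waitpid(child, &status, 0) < 0) {
    if (errno != EINTR)
      return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A return value of 1 means the ordering was correct but something else blocked the attach, for example a stricter ptrace_scope setting or a sandbox that forbids ptrace.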

GoogleCodeExporter commented 9 years ago
This is the system level fix for the ptrace permissions:

echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope

This does make the test suite run reliably; I will take a look at a fix in the 
code base but this won't be for a few days.

Original comment by ringo.p...@gmail.com on 24 Jul 2012 at 5:23
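
To check which setting a machine is currently using before changing it, the same file can be read back. The following is a minimal sketch, assuming the Yama sysctl path shown above; the helper names read_ptrace_scope and describe_ptrace_scope are hypothetical, and the descriptions paraphrase the kernel's Yama documentation.

```c
#include <stdio.h>

/* Returns the integer stored in the file at path, or -1 if the file cannot
 * be read (for example on a kernel without Yama). */
int read_ptrace_scope(const char *path) {
  FILE *f = fopen(path, "r");
  int scope = -1;

  if (f == NULL)
    return -1;
  if (fscanf(f, "%d", &scope) != 1)
    scope = -1;
  fclose(f);
  return scope;
}

/* Short description of each Yama ptrace_scope value. */
const char *describe_ptrace_scope(int scope) {
  switch (scope) {
    case 0:
      return "classic: any process with the same uid may attach";
    case 1:
      return "restricted: only an ancestor or a declared PR_SET_PTRACER "
             "process may attach";
    case 2:
      return "admin-only: attaching requires CAP_SYS_PTRACE";
    case 3:
      return "no attach: ptrace attach is disabled";
    default:
      return "unknown, or Yama is not available";
  }
}
```

For example, passing "/proc/sys/kernel/yama/ptrace_scope" to read_ptrace_scope() and printing describe_ptrace_scope() of the result shows whether the thread lister in linuxthreads.cc can be expected to hit the errno=1 (EPERM) failure from the log above.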

GoogleCodeExporter commented 9 years ago
OK; this is a first stab at a patch which I think resolves this issue.

It uses base/simple_mutex.h to facilitate sync between the parent and child 
threads to ensure that appropriate ptrace permissions are set before the child 
thread interrogates the parent.

Original comment by ringo.p...@gmail.com on 25 Jul 2012 at 1:31

Attachments:

GoogleCodeExporter commented 9 years ago
Note that I am also running the tests with:

a) a patched version of glibc 2.15 on Ubuntu 12.10 (this fix should land in 
both 12.10 and 12.04 - see http://pad.lv/1028038)

b) the attached patch to ensure that heap check always occurs after destructors 
otherwise I see the same issue as seen on FreeBSD.

Original comment by ringo.p...@gmail.com on 25 Jul 2012 at 1:37

Attachments:

GoogleCodeExporter commented 9 years ago
Revised patch which uses a semaphore directly so that we have more control over 
its creation and destruction.

Original comment by ringo.p...@gmail.com on 28 Jul 2012 at 8:58

Attachments:

GoogleCodeExporter commented 9 years ago
Thank you very much for the patches. I have applied and tested on Ubuntu 11.04. 
Patches are now committed to the main trunk.

------------------------------------------------------------------------
r152 | chappedm@gmail.com | 2012-09-17 20:00:20 -0400 (Mon, 17 Sep 2012) | 12 lines

issue-437 Fixed issues related to new glibc shipped with Ubuntu 10.10

1. ptrace permissions were modified to be a bit more strict which required
   us to programmatically set the permissions while syncing up to the profiling
   thread.

2. Order of destructors registered with atexit changed which was causing us to
   miss generating the backtrace when heap checker was finished. Seems that we
   initially fixed this for FreeBSD and now linux has changed their behaviour
   to be the same. We are now a bit stricter on the rules here across all
   platforms.

------------------------------------------------------------------------

Original comment by chapp...@gmail.com on 18 Sep 2012 at 12:14

GoogleCodeExporter commented 9 years ago
Issue 432 has been merged into this issue.

Original comment by chapp...@gmail.com on 23 Dec 2012 at 3:10

GoogleCodeExporter commented 9 years ago
Issue 500 has been merged into this issue.

Original comment by chapp...@gmail.com on 10 Mar 2013 at 8:38