ddio / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

gcc 4.5 breaks tcmalloc #236

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Compile google-perftools with gcc 4.5
2. Run the test suite

What is the expected output? What do you see instead?

A lot of tests fail that shouldn't. E.g.:

pipping@bogus ~/Downloads/google-perftools-1.5 $ make check 
TESTS=tcmalloc_unittest
rm -f debugallocation_test.sh
cp -p ./src/tests/debugallocation_test.sh debugallocation_test.sh
rm -f sampling_test.sh
cp -p ./src/tests/sampling_test.sh sampling_test.sh
rm -f heap-profiler_unittest.sh
cp -p ./src/tests/heap-profiler_unittest.sh heap-profiler_unittest.sh
rm -f heap-checker_unittest.sh
cp -p ./src/tests/heap-checker_unittest.sh heap-checker_unittest.sh
rm -f heap-checker-death_unittest.sh
cp -p ./src/tests/heap-checker-death_unittest.sh heap-checker-
death_unittest.sh
rm -f sampling_debug_test.sh
cp -p ./src/tests/sampling_test.sh sampling_debug_test.sh
rm -f heap-profiler_debug_unittest.sh
cp -p ./src/tests/heap-profiler_unittest.sh heap-profiler_debug_unittest.sh
rm -f heap-checker_debug_unittest.sh
cp -p ./src/tests/heap-checker_unittest.sh heap-checker_debug_unittest.sh
rm -f profiler_unittest.sh
cp -p ./src/tests/profiler_unittest.sh profiler_unittest.sh
for la in libtcmalloc_minimal.la libtcmalloc_minimal_debug.la 
libtcmalloc.la libtcmalloc_debug.la libtcmalloc_and_profiler.la; do 
lib=".libs/`basename $la .la`.a"; [ ! -f "$lib" ] || objcopy -W malloc -W 
free -W realloc -W calloc -W cfree -W memalign -W posix_memalign -W valloc 
-W pvalloc -W malloc_stats -W mallopt -W mallinfo -W _Znwm -W 
_ZnwmRKSt9nothrow_t -W _Znam -W _ZnamRKSt9nothrow_t -W _ZdlPv -W _ZdaPv -W 
__Znwm -W __ZnwmRKSt9nothrow_t -W __Znam -W __ZnamRKSt9nothrow_t -W __ZdlPv 
-W __ZdaPv "$lib"; done
make  pprof_unittest
make[1]: Entering directory `/home/pipping/Downloads/google-perftools-1.5'
./src/pprof -test
AddressAdd 32-bit tests: 5 passes, 0 failures
AddressAdd 64-bit tests: 10 passes, 0 failures
AddressSub 32-bit tests: 5 passes, 0 failures
AddressSub 64-bit tests: 10 passes, 0 failures
AddressInc 32-bit tests: 5 passes, 0 failures
AddressInc 64-bit tests: 10 passes, 0 failures
PASS
make[1]: Leaving directory `/home/pipping/Downloads/google-perftools-1.5'
make  check-TESTS
make[1]: Entering directory `/home/pipping/Downloads/google-perftools-1.5'
Testing empty allocation
Testing STL use
Sanity-testing all the memory allocation functions
Testing large allocation
Testing alignment of malloc(0)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(3)
Testing alignment of malloc(3)
Testing alignment of malloc(4)
Testing alignment of malloc(5)
Testing alignment of malloc(7)
Testing alignment of malloc(8)
Testing alignment of malloc(9)
Testing alignment of malloc(15)
Testing alignment of malloc(16)
Testing alignment of malloc(17)
Testing alignment of malloc(31)
Testing alignment of malloc(32)
/bin/sh: line 4:  6509 Segmentation fault      PPROF_PATH=./src/pprof 
BINDIR=. TMPDIR=/tmp/perftools ${dir}$tst
FAIL: tcmalloc_unittest
======================================
1 of 1 tests failed
Please report to opensource@google.com
======================================
make[1]: *** [check-TESTS] Error 1
make[1]: Leaving directory `/home/pipping/Downloads/google-perftools-1.5'
make: *** [check-am] Error 2
pipping@bogus ~/Downloads/google-perftools-1.5 $ 

What version of the product are you using? On what operating system?

perftools 1.5 on 64bit linux 2.6.33.2

Please provide any additional information below.

pipping@bogus ~/Downloads/google-perftools-1.5 $ libtool execute gdb  
./tcmalloc_unittest
./tcmalloc_unittest: line 60: X./tcmalloc_unittest: No such file or 
directory
GNU gdb (GDB) 7.1
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/pipping/Downloads/google-perftools-1.5/.libs/lt-
tcmalloc_unittest...done.
(gdb) run
Starting program: /home/pipping/Downloads/google-perftools-1.5/.libs/lt-
tcmalloc_unittest 
warning: no loadable sections found in added symbol-file 
/usr/lib64/debug/lib64/ld-2.11.1.so.debug
warning: no loadable sections found in added symbol-file 
/usr/lib64/debug/usr/lib64/libstdc++.so.6.0.14.debug
warning: no loadable sections found in added symbol-file 
/usr/lib64/debug/usr/lib64/libgcc_s-4.5.so.1.debug
[Thread debugging using libthread_db enabled]
Testing empty allocation
Testing STL use
Sanity-testing all the memory allocation functions
Testing large allocation
Testing alignment of malloc(0)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(3)
Testing alignment of malloc(3)
Testing alignment of malloc(4)
Testing alignment of malloc(5)
Testing alignment of malloc(7)
Testing alignment of malloc(8)
Testing alignment of malloc(9)
Testing alignment of malloc(15)
Testing alignment of malloc(16)
Testing alignment of malloc(17)
Testing alignment of malloc(31)
Testing alignment of malloc(32)

Program received signal SIGSEGV, Segmentation fault.
tcmalloc::CentralFreeList::FetchFromSpans (this=0x7ffff7dbe780) at 
src/central_freelist.cc:249
249       span->objects = *(reinterpret_cast<void**>(result));
(gdb) bt
#0  tcmalloc::CentralFreeList::FetchFromSpans (this=0x7ffff7dbe780) at 
src/central_freelist.cc:249
#1  0x00007ffff7b96161 in tcmalloc::CentralFreeList::FetchFromSpansSafe 
(this=0x7ffff7dbe780)
    at src/central_freelist.cc:234
#2  0x00007ffff7b961f2 in tcmalloc::CentralFreeList::RemoveRange 
(this=0x7ffff7dbe780, start=0x7fffffffddb8, 
    end=0x7fffffffddb0, N=30) at src/central_freelist.cc:214
#3  0x00007ffff7b98def in tcmalloc::ThreadCache::FetchFromCentralCache 
(this=0x86a000, cl=<value optimized out>, 
    byte_size=32) at src/thread_cache.cc:149
#4  0x00007ffff7ba5105 in tcmalloc::ThreadCache::Allocate (size=32) at 
src/thread_cache.h:340
#5  do_malloc (size=32) at src/tcmalloc.cc:917
#6  do_malloc_or_cpp_alloc (size=32) at src/tcmalloc.cc:870
#7  tc_malloc (size=32) at src/tcmalloc.cc:1308
#8  0x0000000000402218 in TestAlignmentForSize (size=32) at 
src/tests/tcmalloc_unittest.cc:714
#9  0x000000000040357f in TestMallocAlignment (argc=<value optimized out>, 
argv=<value optimized out>)
    at src/tests/tcmalloc_unittest.cc:734
#10 RunAllTests (argc=<value optimized out>, argv=<value optimized out>) at 
src/tests/tcmalloc_unittest.cc:1133
#11 main (argc=<value optimized out>, argv=<value optimized out>) at 
src/tests/tcmalloc_unittest.cc:1216
(gdb) 

(I think the line "./tcmalloc_unittest: line 60: X./tcmalloc_unittest: No 
such file or directory" can be ignored -- it's caused by using libtool to 
invoke gdb)

Original issue reported on code.google.com by pipping....@gmail.com on 25 Apr 2010 at 11:27

GoogleCodeExporter commented 9 years ago
Hmm, I can't reproduce this.  I got my hands on a gcc 4.5 SVN 20100326 build, 
and 
tried building perftools on my x86_64 machine (ubuntu 8.04, linux 2.6.24).  All 
tests 
passed for me.  This is against perftools svn-head rather than 1.5, but I don't 
expect that to make a difference, certainly not to the extent you saw.

What happens if you run ./tcmalloc_unittest by hand (rather than as part of 
make 
check)?  Does tcmalloc_debug_unittest also fail? -- it may give a better error 
message if it does.

I'm afraid I can't really help much figuring out what's going on, given that 
this 
works for me.  If you're able to poke around with gdb a bit and see what might 
be 
going wrong, that would definitely be helpful.

Original comment by csilv...@gmail.com on 26 Apr 2010 at 7:20

GoogleCodeExporter commented 9 years ago
Here are my config.log and test.log for version 1.5

Original comment by pipping....@gmail.com on 26 Apr 2010 at 11:29

Attachments:

GoogleCodeExporter commented 9 years ago
Here are my config.log and test.log for HEAD.

The additional failing test can probably be ignored.

Original comment by pipping....@gmail.com on 26 Apr 2010 at 11:30

Attachments:

GoogleCodeExporter commented 9 years ago
pipping@bogus ~/Downloads/google-perftools-1.5 $ 
./tcmalloc_minimal_debug_unittest 
Testing empty allocation
Testing STL use
Sanity-testing all the memory allocation functions
Testing large allocation
Testing alignment of malloc(0)
Testing alignment of malloc(1)
Segmentation fault
pipping@bogus ~/Downloads/google-perftools-1.5 $ 

Original comment by pipping....@gmail.com on 27 Apr 2010 at 12:33

GoogleCodeExporter commented 9 years ago
This is beginning to look an awful lot like

  http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6031

but then not a hundred percent. One way to trigger this bug is

  make distclean;
  ./configure CXX=g++-4.5 CXXFLAGS="-O1 -g3" --enable-frame-pointers
  make -j3
  ./tcmalloc_minimal_unittest 

one ends up with

[..]
Sanity-testing all the memory allocation functions
Testing large allocation
Testing alignment of malloc(0)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(3)
Testing alignment of malloc(3)
Testing alignment of malloc(4)
Testing alignment of malloc(5)
Testing alignment of malloc(7)
Testing alignment of malloc(8)
Testing alignment of malloc(9)
Testing alignment of malloc(15)
Testing alignment of malloc(16)
Testing alignment of malloc(17)
Testing alignment of malloc(31)
Testing alignment of malloc(32)
Segmentation fault

I tried to find out what optimization exactly is causing this. So i looked at 
the 
differences between -O1 and -O0:

  flags=( $(diff <(g++-4.5 -c -Q -O0 --help=optimizers) <(g++-4.5 -c -Q -O1 --
help=optimizers) | sed -e '/^[^>]/d' -e 's:^>   ::' -e 
'/\[enabled\]/s:[[:space:]]*\[enabled\]$::' -e 
'/\[disabled\]/{s:[[:space:]]*\[disabled\]$::;s:^-f:-fno-:}') )

Unfortunately,

  make distclean
  ./configure CXX=g++-4.5 CXXFLAGS="${flags[*]} -g3" --enable-frame-pointers
  make -j3
  ./tcmalloc_minimal_unittest

does *not* trigger the bug. So I have to assume it's one of the optimizations 
-O1 
enables that can't be disabled explicitly ( there's "Make -O2 (and other -On 
options) 
be a sum of individual -f* options, by adding a -fother-O2-optimizations (and 
same 
for other -On levels" here: http://gcc.gnu.org/wiki/SummerOfCode )

Original comment by pipping....@gmail.com on 27 Apr 2010 at 1:56

GoogleCodeExporter commented 9 years ago
Here's what happens in tcmalloc_minimal_unittest

Program received signal SIGSEGV, Segmentation fault.
tcmalloc::CentralFreeList::FetchFromSpans (this=0x7ffff7dc0720) at 
src/central_freelist.cc:249
249       span->objects = *(reinterpret_cast<void**>(result));
(gdb) bt
#0  tcmalloc::CentralFreeList::FetchFromSpans (this=0x7ffff7dc0720) at 
src/central_freelist.cc:249
#1  0x00007ffff7ba82fd in tcmalloc::CentralFreeList::FetchFromSpansSafe 
(this=0x7ffff7dc0720)
    at src/central_freelist.cc:234
#2  0x00007ffff7ba844b in tcmalloc::CentralFreeList::RemoveRange 
(this=0x7ffff7dc0720, start=0x7fffffffde08, 
    end=0x7fffffffde00, N=28) at src/central_freelist.cc:214
#3  0x00007ffff7baae0f in tcmalloc::ThreadCache::FetchFromCentralCache 
(this=<value 
optimized out>, cl=3, 
    byte_size=32) at src/thread_cache.cc:149
#4  0x00007ffff7bae1b2 in tcmalloc::ThreadCache::Allocate (size=32) at 
src/thread_cache.h:340
#5  do_malloc (size=32) at src/tcmalloc.cc:917
#6  do_malloc_or_cpp_alloc (size=32) at src/tcmalloc.cc:870
#7  tc_malloc (size=32) at src/tcmalloc.cc:1308
#8  0x00000000004021e1 in TestAlignmentForSize (size=32) at 
src/tests/tcmalloc_unittest.cc:714
#9  0x0000000000403631 in TestMallocAlignment (argc=<value optimized out>, 
argv=<value optimized out>)
    at src/tests/tcmalloc_unittest.cc:734
#10 RunAllTests (argc=<value optimized out>, argv=<value optimized out>) at 
src/tests/tcmalloc_unittest.cc:1133
#11 main (argc=<value optimized out>, argv=<value optimized out>) at 
src/tests/tcmalloc_unittest.cc:1216
(gdb) p span
$1 = (tcmalloc::Span *) 0x60a060
(gdb) p *span
$2 = {start = 2186, length = 1, next = 0x60a1e0, prev = 0x7ffff7dc0760, objects 
= 
0x0, refcount = 3, sizeclass = 3, 
  location = 0, sample = 0}
(gdb) p result
$3 = (void *) 0x0

Original comment by pipping....@gmail.com on 27 Apr 2010 at 2:05

GoogleCodeExporter commented 9 years ago
Aha!  The optimization is the key.  I tried building against with gcc 4.5 with 
-O2, 
rather than -O0, and now I get the same failure you see.  Of course, with 
optimization on, it's very difficult to debug... :-/

A failure in the line you pointed out just means that memory was corrupted 
somewhere.  
It's pretty much impossible to guess where that might have been.  I'm 
sympathetic to 
the idea that this is a bug in gcc, since a) it doesn't occur in earlier gcc's, 
and 
b) it doesn't occur even in gcc 4.5, in debug mode.  This isn't conclusive, but 
it 
does point to the most likely problem being a bug introduced into the gcc 
optimizer.

I'll try to look into this, but it's going to be challenging, especially if 
it's a 
compiler bug.

Original comment by csilv...@gmail.com on 27 Apr 2010 at 4:05

GoogleCodeExporter commented 9 years ago
The good news is that I mentioned earlier you don't need -O2. Instead, -O1 will 
do.

Original comment by pipping....@gmail.com on 27 Apr 2010 at 5:48

GoogleCodeExporter commented 9 years ago
Alas, -O1 already has lots of variable values being optimized out, so it's hard 
to 
figure out what might be going on.

I've verified tcmalloc_minimal_unittest passes with -O2 in gcc 4.4.3, so it 
seem to be 
an issue that only comes up with 4.5.  Whether it's a bug in 4.5, or a bug in 
tcmalloc 
that the new optimizer brings out, is still an open question.

Original comment by csilv...@gmail.com on 27 Apr 2010 at 8:33

GoogleCodeExporter commented 9 years ago
The attached patch makes it work for me. I'm by no means an expert C++ 
programmer,
but it looks like a genuine compiler bug (since what appears to me a semantic 
placebo
fixes it). The code that gets miscompiled by gcc 4.5 with -O1 and up is the 
last two
code lines in CentralFreeList::ReleaseToSpans, prepending the object to the
span->objects singly linked list. Found by resurrecting a modified version of
DLL_Print from span.[h|cc].

Original comment by zeev.tar...@gmail.com on 22 May 2010 at 1:34

Attachments:

GoogleCodeExporter commented 9 years ago
For what it's worth, gcc built from the code that's currently in the 4.5 branch 
does
not exhibit this bug and correctly compiles unmodified tcmalloc code in a way 
that
passes this test and the assembly looks sane. So I guess it's going to be fixed 
in
gcc 4.5.1, whenever it's out.

Original comment by zeev.tar...@gmail.com on 24 May 2010 at 8:19

GoogleCodeExporter commented 9 years ago
Nice job figuring out a fix for this.  Do you think it's worth fixing in 
tcmalloc?  Is 
gcc 4.5 part of any major distribution, do you know?  If gcc 4.5.1 will fix it, 
maybe 
it's just best to leave this be, without complexifying the code.  On the other 
hand, 
we have a workaround, so maybe we should just use it.  Thoughts?

Original comment by csilv...@gmail.com on 24 May 2010 at 3:55

GoogleCodeExporter commented 9 years ago
gcc 4.5 is in current Arch and Gentoo unstable. It will be in the next Ubuntu 
and
Suse, and probably Fedora too (though they might ship binaries built with gcc 
4.4.4).
The fix is simple (small, localized) and low risk (especially if that "noinline"
function attribute is inside an ifdef that's checking that the compiler is in 
fact
gcc 4.5.0), but maybe unnecessary - that's up to you. You should keep the bug 
open
until a stable version of gcc is released that does not exhibit this behavior, 
though.

Original comment by zeev.tar...@gmail.com on 24 May 2010 at 7:34

GoogleCodeExporter commented 9 years ago
My testing and my report were incomplete. tcmalloc is broken with gcc 4.5 
trunk, it
just fails somewhere else. The unit tests fail (at "Testing out of memory") and
chromium built with tcmalloc crashes, so there are other problems. I'll try to 
find
what is miscompiled with gcc from 4.5 branch, but I can't debug gcc itself.

Original comment by zeev.tar...@gmail.com on 4 Jun 2010 at 9:36

GoogleCodeExporter commented 9 years ago
My analysis: The problem was in TestSetNewMode and it was that gcc optimizes 
with the
knowledge that malloc cannot modify the value of g_no_memory, because malloc is 
a
special library function. This knowledge is wrong because we overload malloc. 
But we
don't tell gcc that its assumptions about malloc are invalid. So the test 
fails: gcc
doesn't even bother to check the value in g_no_memory.

I asked on gcc's IRC channel, and was told that doing what this does is Wrong. I
agree with them: changing the semantics of malloc is a Bad Idea. It's a 
reasonable
assumption that malloc does not modify any user-visible state, so gcc has a 
right to
do what it did.

We can tell gcc that malloc is not really malloc
(http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins.html#Other-Builtins)
 but
we can't do it for the other functions, and it's a bad idea anyway.

Marking the flag volatile is a work-around for the testsuite, but in real 
usage, if
calling malloc can behind the scenes call a user-defined handler that calls 
free() on
a bunch of stuff, it will invalidate lots of assumptions the code had (like 
making
some pointers garbage, etc.). In essence, using tc_set_new_mode(1) turns every 
call
to malloc into a garbage-collection checkpoint, in a language where GC does not 
exist
and the compiler optimizes accordingly. A bad idea.

BTW, I don't know how this is implemented in C++, where this is specified. Won't
delete'ing stuff inside calls to new wreak havoc on most code?

Anyway, now I get "All 41 tests passed" with "gcc version 4.5.1-pre9999 20100604
(prerelease) rev. 160245 (Gentoo SVN)".

Now to find out why Chromium crashes if compiled with tcmalloc...

Original comment by zeev.tar...@gmail.com on 4 Jun 2010 at 12:56

Attachments:

GoogleCodeExporter commented 9 years ago
Interesting -- good find.  Thanks for tracking this down.

This seems like a potentially serious problem.  It's true that it's dangerous 
to 
write malloc hooks that try to do their own memory management, but I don't see 
an 
intrinsic problem with incrementing a counter or some such.  I agree it can be 
difficult to write safe handlers.  But I'm worried that their optimizations 
could 
mean the heap-checker, for instance, doesn't run at all.

It's fine to fix this one existing test-case, but I think I'd rather keep the 
code as 
it is, and come up with a solution that allows malloc-hooks to work 
successfully even 
after gcc 4.5.  Perhaps using the new flag to tell gcc that malloc isn't an 
intrinsic?  But then I guess all users would have to use this whenever they 
used 
tcmalloc, which isn't great.

I'll ask around here, but if you have any great ideas....

Original comment by csilv...@gmail.com on 4 Jun 2010 at 7:24

GoogleCodeExporter commented 9 years ago
It looks like we'll have to use -fno-builtin-malloc -- that's exactly what this 
flag is for.  It seems like every gcc 4.5 user will have to use it when 
compiling with perftools. :-(  Ah well.

Original comment by csilv...@gmail.com on 7 Jun 2010 at 11:25

GoogleCodeExporter commented 9 years ago
Can you test this workaround? -- that is, revert the 'volatile' change, and 
instead compile the test with -fno-builtin-malloc (you can do this by something 
like
   ./configure CXXFLAGS="-O2 -g -fno-builtin-malloc".
If that works, I'll make it part of the next release, and update the docs.

Original comment by csilv...@gmail.com on 9 Jun 2010 at 8:13

GoogleCodeExporter commented 9 years ago
Tests pass with -fno-builtin-malloc -fno-builtin-calloc on gcc version 
4.5.1-pre9999 20100604 (prerelease) rev. 160245 (Gentoo SVN)

Original comment by zeev.tar...@gmail.com on 10 Jun 2010 at 7:10

GoogleCodeExporter commented 9 years ago
OK, good to know.  I'll make that the fix for the next release.

Original comment by csilv...@gmail.com on 10 Jun 2010 at 4:53

GoogleCodeExporter commented 9 years ago
I'm new to this library.  Where do I put these flags?  Should they be in the 
CFLAGS environmental variable when I compile tcmalloc?  I can add them to my 
own makefiles with no worries, I just happen to know zilch about 
configure/automake/autoconf and am not sure how to modify tcmalloc's build.

Original comment by tibb...@gmail.com on 14 Jun 2010 at 11:53

GoogleCodeExporter commented 9 years ago
nevermind.  I just answered my own question - should be CPPFLAGS.  Sorry for 
the notes.

Original comment by tibb...@gmail.com on 14 Jun 2010 at 11:59

GoogleCodeExporter commented 9 years ago
This should be fixed, or at least documented better, in perftools 1.6, just 
released.

Original comment by csilv...@gmail.com on 5 Aug 2010 at 8:51