Closed GoogleCodeExporter closed 9 years ago
Hmm, I can't reproduce this. I got my hands on a gcc 4.5 SVN 20100326 build,
and
tried building perftools on my x86_64 machine (ubuntu 8.04, linux 2.6.24). All
tests
passed for me. This is against perftools svn-head rather than 1.5, but I don't
expect that to make a difference, certainly not to the extent you saw.
What happens if you run ./tcmalloc_unittest by hand (rather than as part of
make
check)? Does tcmalloc_debug_unittest also fail? -- it may give a better error
message if it does.
I'm afraid I can't really help much figuring out what's going on, given that
this
works for me. If you're able to poke around with gdb a bit and see what might
be
going wrong, that would definitely be helpful.
Original comment by csilv...@gmail.com
on 26 Apr 2010 at 7:20
Here are my config.log and test.log for version 1.5
Original comment by pipping....@gmail.com
on 26 Apr 2010 at 11:29
Attachments:
Here are my config.log and test.log for HEAD.
The additional failing test can probably be ignored.
Original comment by pipping....@gmail.com
on 26 Apr 2010 at 11:30
Attachments:
pipping@bogus ~/Downloads/google-perftools-1.5 $
./tcmalloc_minimal_debug_unittest
Testing empty allocation
Testing STL use
Sanity-testing all the memory allocation functions
Testing large allocation
Testing alignment of malloc(0)
Testing alignment of malloc(1)
Segmentation fault
pipping@bogus ~/Downloads/google-perftools-1.5 $
Original comment by pipping....@gmail.com
on 27 Apr 2010 at 12:33
This is beginning to look an awful lot like
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6031
but then not a hundred percent. One way to trigger this bug is
make distclean;
./configure CXX=g++-4.5 CXXFLAGS="-O1 -g3" --enable-frame-pointers
make -j3
./tcmalloc_minimal_unittest
one ends up with
[..]
Sanity-testing all the memory allocation functions
Testing large allocation
Testing alignment of malloc(0)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(1)
Testing alignment of malloc(2)
Testing alignment of malloc(3)
Testing alignment of malloc(3)
Testing alignment of malloc(4)
Testing alignment of malloc(5)
Testing alignment of malloc(7)
Testing alignment of malloc(8)
Testing alignment of malloc(9)
Testing alignment of malloc(15)
Testing alignment of malloc(16)
Testing alignment of malloc(17)
Testing alignment of malloc(31)
Testing alignment of malloc(32)
Segmentation fault
I tried to find out what optimization exactly is causing this. So i looked at
the
differences between -O1 and -O0:
flags=( $(diff <(g++-4.5 -c -Q -O0 --help=optimizers) <(g++-4.5 -c -Q -O1 --
help=optimizers) | sed -e '/^[^>]/d' -e 's:^> ::' -e
'/\[enabled\]/s:[[:space:]]*\[enabled\]$::' -e
'/\[disabled\]/{s:[[:space:]]*\[disabled\]$::;s:^-f:-fno-:}') )
Unfortunately,
make distclean
./configure CXX=g++-4.5 CXXFLAGS="${flags[*]} -g3" --enable-frame-pointers
make -j3
./tcmalloc_minimal_unittest
does *not* trigger the bug. So I have to assume it's one of the optimizations
-O1
enables that can't be disabled explicitly ( there's "Make -O2 (and other -On
options)
be a sum of individual -f* options, by adding a -fother-O2-optimizations (and
same
for other -On levels" here: http://gcc.gnu.org/wiki/SummerOfCode )
Original comment by pipping....@gmail.com
on 27 Apr 2010 at 1:56
Here's what happens in tcmalloc_minimal_unittest
Program received signal SIGSEGV, Segmentation fault.
tcmalloc::CentralFreeList::FetchFromSpans (this=0x7ffff7dc0720) at
src/central_freelist.cc:249
249 span->objects = *(reinterpret_cast<void**>(result));
(gdb) bt
#0 tcmalloc::CentralFreeList::FetchFromSpans (this=0x7ffff7dc0720) at
src/central_freelist.cc:249
#1 0x00007ffff7ba82fd in tcmalloc::CentralFreeList::FetchFromSpansSafe
(this=0x7ffff7dc0720)
at src/central_freelist.cc:234
#2 0x00007ffff7ba844b in tcmalloc::CentralFreeList::RemoveRange
(this=0x7ffff7dc0720, start=0x7fffffffde08,
end=0x7fffffffde00, N=28) at src/central_freelist.cc:214
#3 0x00007ffff7baae0f in tcmalloc::ThreadCache::FetchFromCentralCache
(this=<value
optimized out>, cl=3,
byte_size=32) at src/thread_cache.cc:149
#4 0x00007ffff7bae1b2 in tcmalloc::ThreadCache::Allocate (size=32) at
src/thread_cache.h:340
#5 do_malloc (size=32) at src/tcmalloc.cc:917
#6 do_malloc_or_cpp_alloc (size=32) at src/tcmalloc.cc:870
#7 tc_malloc (size=32) at src/tcmalloc.cc:1308
#8 0x00000000004021e1 in TestAlignmentForSize (size=32) at
src/tests/tcmalloc_unittest.cc:714
#9 0x0000000000403631 in TestMallocAlignment (argc=<value optimized out>,
argv=<value optimized out>)
at src/tests/tcmalloc_unittest.cc:734
#10 RunAllTests (argc=<value optimized out>, argv=<value optimized out>) at
src/tests/tcmalloc_unittest.cc:1133
#11 main (argc=<value optimized out>, argv=<value optimized out>) at
src/tests/tcmalloc_unittest.cc:1216
(gdb) p span
$1 = (tcmalloc::Span *) 0x60a060
(gdb) p *span
$2 = {start = 2186, length = 1, next = 0x60a1e0, prev = 0x7ffff7dc0760, objects
=
0x0, refcount = 3, sizeclass = 3,
location = 0, sample = 0}
(gdb) p result
$3 = (void *) 0x0
Original comment by pipping....@gmail.com
on 27 Apr 2010 at 2:05
Aha! The optimization is the key. I tried building against with gcc 4.5 with
-O2,
rather than -O0, and now I get the same failure you see. Of course, with
optimization on, it's very difficult to debug... :-/
A failure in the line you pointed out just means that memory was corrupted
somewhere.
It's pretty much impossible to guess where that might have been. I'm
sympathetic to
the idea that this is a bug in gcc, since a) it doesn't occur in earlier gcc's,
and
b) it doesn't occur even in gcc 4.5, in debug mode. This isn't conclusive, but
it
does point to the most likely problem being a bug introduced into the gcc
optimizer.
I'll try to look into this, but it's going to be challenging, especially if
it's a
compiler bug.
Original comment by csilv...@gmail.com
on 27 Apr 2010 at 4:05
The good news is that I mentioned earlier you don't need -O2. Instead, -O1 will
do.
Original comment by pipping....@gmail.com
on 27 Apr 2010 at 5:48
Alas, -O1 already has lots of variable values being optimized out, so it's hard
to
figure out what might be going on.
I've verified tcmalloc_minimal_unittest passes with -O2 in gcc 4.4.3, so it
seem to be
an issue that only comes up with 4.5. Whether it's a bug in 4.5, or a bug in
tcmalloc
that the new optimizer brings out, is still an open question.
Original comment by csilv...@gmail.com
on 27 Apr 2010 at 8:33
The attached patch makes it work for me. I'm by no means an expert C++
programmer,
but it looks like a genuine compiler bug (since what appears to me a semantic
placebo
fixes it). The code that gets miscompiled by gcc 4.5 with -O1 and up is the
last two
code lines in CentralFreeList::ReleaseToSpans, prepending the object to the
span->objects singly linked list. Found by resurrecting a modified version of
DLL_Print from span.[h|cc].
Original comment by zeev.tar...@gmail.com
on 22 May 2010 at 1:34
Attachments:
For what it's worth, gcc built from the code that's currently in the 4.5 branch
does
not exhibit this bug and correctly compiles unmodified tcmalloc code in a way
that
passes this test and the assembly looks sane. So I guess it's going to be fixed
in
gcc 4.5.1, whenever it's out.
Original comment by zeev.tar...@gmail.com
on 24 May 2010 at 8:19
Nice job figuring out a fix for this. Do you think it's worth fixing in
tcmalloc? Is
gcc 4.5 part of any major distribution, do you know? If gcc 4.5.1 will fix it,
maybe
it's just best to leave this be, without complexifying the code. On the other
hand,
we have a workaround, so maybe we should just use it. Thoughts?
Original comment by csilv...@gmail.com
on 24 May 2010 at 3:55
gcc 4.5 is in current Arch and Gentoo unstable. It will be in the next Ubuntu
and
Suse, and probably Fedora too (though they might ship binaries built with gcc
4.4.4).
The fix is simple (small, localized) and low risk (especially if that "noinline"
function attribute is inside an ifdef that's checking that the compiler is in
fact
gcc 4.5.0), but maybe unnecessary - that's up to you. You should keep the bug
open
until a stable version of gcc is released that does not exhibit this behavior,
though.
Original comment by zeev.tar...@gmail.com
on 24 May 2010 at 7:34
My testing and my report were incomplete. tcmalloc is broken with gcc 4.5
trunk, it
just fails somewhere else. The unit tests fail (at "Testing out of memory") and
chromium built with tcmalloc crashes, so there are other problems. I'll try to
find
what is miscompiled with gcc from 4.5 branch, but I can't debug gcc itself.
Original comment by zeev.tar...@gmail.com
on 4 Jun 2010 at 9:36
My analysis: The problem was in TestSetNewMode and it was that gcc optimizes
with the
knowledge that malloc cannot modify the value of g_no_memory, because malloc is
a
special library function. This knowledge is wrong because we overload malloc.
But we
don't tell gcc that its assumptions about malloc are invalid. So the test
fails: gcc
doesn't even bother to check the value in g_no_memory.
I asked on gcc's IRC channel, and was told that doing what this does is Wrong. I
agree with them: changing the semantics of malloc is a Bad Idea. It's a
reasonable
assumption that malloc does not modify any user-visible state, so gcc has a
right to
do what it did.
We can tell gcc that malloc is not really malloc
(http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins.html#Other-Builtins)
but
we can't do it for the other functions, and it's a bad idea anyway.
Marking the flag volatile is a work-around for the testsuite, but in real
usage, if
calling malloc can behind the scenes call a user-defined handler that calls
free() on
a bunch of stuff, it will invalidate lots of assumptions the code had (like
making
some pointers garbage, etc.). In essence, using tc_set_new_mode(1) turns every
call
to malloc into a garbage-collection checkpoint, in a language where GC does not
exist
and the compiler optimizes accordingly. A bad idea.
BTW, I don't know how this is implemented in C++, where this is specified. Won't
delete'ing stuff inside calls to new wreak havoc on most code?
Anyway, now I get "All 41 tests passed" with "gcc version 4.5.1-pre9999 20100604
(prerelease) rev. 160245 (Gentoo SVN)".
Now to find out why Chromium crashes if compiled with tcmalloc...
Original comment by zeev.tar...@gmail.com
on 4 Jun 2010 at 12:56
Attachments:
Interesting -- good find. Thanks for tracking this down.
This seems like a potentially serious problem. It's true that it's dangerous
to
write malloc hooks that try to do their own memory management, but I don't see
an
intrinsic problem with incrementing a counter or some such. I agree it can be
difficult to write safe handlers. But I'm worried that their optimizations
could
mean the heap-checker, for instance, doesn't run at all.
It's fine to fix this one existing test-case, but I think I'd rather keep the
code as
it is, and come up with a solution that allows malloc-hooks to work
successfully even
after gcc 4.5. Perhaps using the new flag to tell gcc that malloc isn't an
intrinsic? But then I guess all users would have to use this whenever they
used
tcmalloc, which isn't great.
I'll ask around here, but if you have any great ideas....
Original comment by csilv...@gmail.com
on 4 Jun 2010 at 7:24
It looks like we'll have to use -fno-builtin-malloc -- that's exactly what this
flag is for. It seems like every gcc 4.5 user will have to use it when
compiling with perftools. :-( Ah well.
Original comment by csilv...@gmail.com
on 7 Jun 2010 at 11:25
Can you test this workaround? -- that is, revert the 'volatile' change, and
instead compile the test with -fno-builtin-malloc (you can do this by something
like
./configure CXXFLAGS="-O2 -g -fno-builtin-malloc".
If that works, I'll make it part of the next release, and update the docs.
Original comment by csilv...@gmail.com
on 9 Jun 2010 at 8:13
Tests pass with -fno-builtin-malloc -fno-builtin-calloc on gcc version
4.5.1-pre9999 20100604 (prerelease) rev. 160245 (Gentoo SVN)
Original comment by zeev.tar...@gmail.com
on 10 Jun 2010 at 7:10
OK, good to know. I'll make that the fix for the next release.
Original comment by csilv...@gmail.com
on 10 Jun 2010 at 4:53
I'm new to this library. Where do I put these flags? Should they be in the
CFLAGS environmental variable when I compile tcmalloc? I can add them to my
own makefiles with no worries, I just happen to know zilch about
configure/automake/autoconf and am not sure how to modify tcmalloc's build.
Original comment by tibb...@gmail.com
on 14 Jun 2010 at 11:53
nevermind. I just answered my own question - should be CPPFLAGS. Sorry for
the notes.
Original comment by tibb...@gmail.com
on 14 Jun 2010 at 11:59
This should be fixed, or at least documented better, in perftools 1.6, just
released.
Original comment by csilv...@gmail.com
on 5 Aug 2010 at 8:51
Original issue reported on code.google.com by
pipping....@gmail.com
on 25 Apr 2010 at 11:27