Closed GoogleCodeExporter closed 8 years ago
Comparing
http://www.opensource.apple.com/source/Libc/Libc-391.2.10/include/malloc/malloc.
h and
http://www.opensource.apple.com/source/Libc/Libc-763.12/include/malloc/malloc.h
, we should be able to use the _leopard_malloc_zone for 10.4 as well.
Original comment by classi...@floodgap.com
on 13 Apr 2013 at 12:15
Original comment by classi...@floodgap.com
on 13 Apr 2013 at 12:16
Tobias, how are you enabling mozalloc? The changesets show your change to
jemalloc.c, but that doesn't even get compiled here, and I can't see where
you're twiddling MOZ_MEMORY (unless you're changing it through some other
means).
Original comment by classi...@floodgap.com
on 15 Apr 2013 at 2:44
(and what about https://bugzilla.mozilla.org/show_bug.cgi?id=702250 ?)
Original comment by classi...@floodgap.com
on 15 Apr 2013 at 2:46
Fixed MOZ_MEMORY and short-circuited memalign which doesn't exist on 10.4
(instead it uses the built-in ipalloc). Confirmed osx_use_jemalloc is true by
writing in if (osx_use_jemalloc) __asm__("trap\n") and verifying JS traps there
(it does). V8 passes with jemalloc. Doesn't make a lot of difference there, but
we'll build the full browser and then see if I can trigger M702250.
Original comment by classi...@floodgap.com
on 15 Apr 2013 at 3:58
Also had to make a tweak in storage/ because 10.4's zones don't have
malloc_good_zone in them (changed it to simply use 16-byte-aligned stores).
Incredibly the browser not only works, but is (especially for DEBUG) very
quick. Many of the sites in issue 193 are better, implying that resource
starvation may be part of the reason for all those threads sitting around in
wait.
However, drag and drop (thus can't test M702250), cut and paste, and printing
are all messed up. Printing may be revenge of issue 82. I'm trying to figure
out what's wrong with the other two. Running a non-jemalloc build tonight to
try to exonerate it as the cause. I suspected M647216 but trying to work around
it just made things worse.
Original comment by classi...@floodgap.com
on 16 Apr 2013 at 5:19
Printing appears to be issue 82, but cut and paste and drag and drop all work
in a non-jemalloc build. Even if osx_use_jemalloc is set to false (using the
environment variable NO_MAC_JEMALLOC), it does not work. I'm suspicious
alignment is to blame, but IonMonkey has higher priority right now. Dropping
milestone, though we do want this.
Original comment by classi...@floodgap.com
on 16 Apr 2013 at 1:39
https://bugzilla.mozilla.org/show_bug.cgi?id=691003 makes me think our minimum
size should be 16 bytes, always, even for tiny allocations.
Original comment by classi...@floodgap.com
on 16 Apr 2013 at 10:59
... i.e. change #define TINY_MIN_2POW (sizeof(void*) == 8 ? 3 : 2) to
4 (forcing 2^4 minimum allocations).
Original comment by classi...@floodgap.com
on 16 Apr 2013 at 11:05
That didn't fix it.
Original comment by classi...@floodgap.com
on 18 Apr 2013 at 1:23
I was able to proove that AuroraFox uses jemalloc by setting breakpoints in
some of the functions of jemalloc. The patches in the Changesets are exactly
what is needed to get it working (actually it wasn't correctly enabled for some
versions but that was fixed) - BUT at least at one moment I could reproducibly
crash the browser using drag and drop, however it occurs very rarely.
Original comment by Tobias.N...@gmail.com
on 19 Apr 2013 at 1:51
But how did you set MOZ_MEMORY to 1? It's not in your Aurora changesets.
Original comment by classi...@floodgap.com
on 19 Apr 2013 at 4:28
Adding "--enable-jemalloc" to the configure command line should override the
default.
Original comment by Tobias.N...@gmail.com
on 21 Apr 2013 at 12:39
See http://hg.mozilla.org/mozilla-central/file/4420f27742c7/configure.in#l7112
Original comment by Tobias.N...@gmail.com
on 21 Apr 2013 at 12:39
OK.
Looking through jemalloc.c, it appears that the original szone is still able to
be called for certain things. Maybe that's what we need to add.
Original comment by classi...@floodgap.com
on 22 Apr 2013 at 4:50
I'm thinking we have to fake up a memalign for the original zone, which 10.4
does not have. Something like this might work:
void *aligned_malloc( size_t size, int align )
{
void *mem = (szone->malloc)( size + (align-1) + sizeof(void*) );
char *amem = ((char*)mem) + sizeof(void*);
amem += align - ((uintptr)amem & (align - 1));
/* ((void**)amem)[-1] = mem; // for an aligned free */
return amem;
}
The trick is making sure we don't leak for alignments != 16. I'm throwing a
trap into jemalloc and we'll see if it bombs at memalign. If so, I bet that's
our problem.
Original comment by classi...@floodgap.com
on 15 Jul 2013 at 5:26
Status update:
- Typing this in a jemalloc debug build of "22.1." Browser speed did increase
in the way expected, as it did before, but note I am doing this with issue 231
also applied. I need to see if I can get rid of that.
- I tinkered with the jemalloc patch for 10.4 and simply made memalign a no-op
with a trap if osx_use_jemalloc id false, but the trap isn't getting triggered,
so it may never have been called in the first place.
- I'm not sure what changed in widget between my last cut and 22, but cut and
paste and drag *to* the browser work now. However, dragging things *out* of the
browser causes M702250. That's bad. OTOH, it appears that it may be possible to
get around it with something along the lines of M703087
(https://bug703087.bugzilla.mozilla.org/attachment.cgi?id=575304) which may not
be good, but may make a hopefully infrequent operation possible. Or, we just
disable drag and drop of images *from* the browser (copying or
right-click-and-save do work).
Original comment by classi...@floodgap.com
on 18 Jul 2013 at 12:20
Large allocations are also very slow. On www.butterflylabs.com,
(gdb) bt 15
#0 0xffff9080 in ___memset_pattern () at
/System/Library/Frameworks/System.framework/PrivateHeaders/ppc/cpu_capabilities.
h:193
#1 0x901296d0 in memset ()
#2 0x0002da34 in ozone_free (zone=0x3e100, ptr=0x609e00) at
/Volumes/BruceDeuce/src/mozilla-22.1/memory/mozjemalloc/jemalloc.c:6612
#3 0x08da88bc in _cairo_image_surface_finish (abstract_surface=0x31a1fbc0) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-image-surface.c:7
34
#4 0x08dca7c0 in _moz_cairo_surface_finish (surface=0x31a1fbc0) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-surface.c:728
#5 0x08dca910 in _moz_cairo_surface_destroy (surface=0x31a1fbc0) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-surface.c:649
#6 0x08ddb73c in _cairo_quartz_surface_finish (abstract_surface=0x338e83a0) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-quartz-surface.c:
2053
#7 0x08dca7c0 in _moz_cairo_surface_finish (surface=0x338e83a0) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-surface.c:728
#8 0x08dca910 in _moz_cairo_surface_destroy (surface=0x338e83a0) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-surface.c:649
#9 0x08dbc2bc in _moz_cairo_pattern_destroy (pattern=0x321b5f40) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-pattern.c:346
#10 0x08da3a68 in _cairo_gstate_fini (gstate=0x1ccf996c) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-gstate.c:229
#11 0x08da3cec in _cairo_gstate_restore (gstate=<value temporarily unavailable,
due to optimizations>, freelist=0x1ccf9ab8) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo-gstate.c:290
#12 0x08d8fe94 in _moz_cairo_restore (cr=0x1ccf9800) at
/Volumes/BruceDeuce/src/mozilla-22.1/gfx/cairo/cairo/src/cairo.c:599
#13 0x06994e60 in mozilla::FrameLayerBuilder::DrawThebesLayer
(aLayer=0x31160400, aContext=0x321b9700, aRegionToDraw=<value temporarily
unavailable, due to optimizations>, aRegionToInvalidate=@0xefffae6c,
aCallbackData=0xefffc928) at
/Volumes/BruceDeuce/src/mozilla-22.1/layout/base/FrameLayerBuilder.cpp:3334
#14 0x08c6ff30 in mozilla::layers::BasicShadowableThebesLayer::PaintBuffer
(this=0x31160400, aContext=<value temporarily unavailable, due to
optimizations>, aRegionToDraw=@0xefffae3c, aExtendedRegionToDraw=@0xefffadd8,
aRegionToInvalidate=@0xefffae6c, aDidSelfCopy=false, aCallback=<value
temporarily unavailable, due to optimizations>, aCallbackData=0xefffc928) at
basic/BasicThebesLayer.h:95
(More stack frames follow...)
I suspect we should turn MALLOC_FILL off, or only enable it for debugging.
Original comment by classi...@floodgap.com
on 18 Jul 2013 at 12:41
Disabling MALLOC_FILL fixes that problem.
Original comment by classi...@floodgap.com
on 18 Jul 2013 at 2:28
Dragging out of the browser does work for text, just not images.
Original comment by classi...@floodgap.com
on 18 Jul 2013 at 2:39
After playing around in jemalloc for awhile it appears that there is no good
way to work around the image drag bug, so it will have to be wallpapered.
In content/base/src/nsContentAreaDragDrop.cpp, disabling creating the native
image draggable still crashed.
Disabling creating the native image draggable *and* disabling creating the file
promise also still crashed.
So, I stopped those from even being made by just returning NS_OK when a drag
gesture is detected over an image. Now it no longer crashes, but other drags
like URLs and so on still work.
Nicely still, on all the images in M702250 copying them to the clipboard still
worked, as well as saving them to disk, just not dragging them. I consider this
a shippable compromise.
Original comment by classi...@floodgap.com
on 18 Jul 2013 at 10:54
New crash in jemalloc: the Applications tab in Preferences crashes the browser
with a null pointer. I need to build a DEBUG for a better stack trace, but
here's the first part of it in an opt build. malloc* in the backtrace is scary.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x000254e4 in receive_samples ()
(gdb) bt
#0 0x000254e4 in receive_samples ()
#1 0x90050bbc in malloc_zone_from_ptr ()
#2 0x90050bbc in malloc_zone_from_ptr ()
#3 0x937f4310 in -[NSBitmapImageRep _freeImage] ()
#4 0x937f3f50 in -[NSBitmapImageRep dealloc] ()
#5 0x937f3c40 in -[NSImage _freeRepresentation:] ()
#6 0x937f3a68 in -[NSImage dealloc] ()
#7 0x92bd2c68 in NSPopAutoreleasePool ()
Original comment by classi...@floodgap.com
on 20 Jul 2013 at 1:38
Seeing that it involves discarding images, maybe it has to do with the icons.
We can dispense with those if we have to, but we can't disable the Applications
tab functionality entirely.
Original comment by classi...@floodgap.com
on 20 Jul 2013 at 1:39
It's actually part of a general problem with icons -- if you pull up Downloads
and start scrolling, it will crash as icons get freed. It's the same underlying
bug as M702250 again, but it's just really bad and consistent on 10.4.
The current wallpaper is to make nsIconChannelCocoa.mm return NS_ERROR_FAILURE
when queried for an icon; icons appear too frequently to force them to leak, or
memory usage gets bad. This is okay for a beta, it's not okay for release.
However, the way around that is to make a new icon service that hands prefab
icons over from chrome (by just redirecting through ioService). That's pretty
easy to write, just a little tedious, and we'd have to add a decent library of
icons at 16x16 and 32x32 sizes.
Original comment by classi...@floodgap.com
on 20 Jul 2013 at 3:53
(by icons I mean icons from OS X -- icons internal to TenFourFox, such as
favicons and chrome, work fine)
Original comment by classi...@floodgap.com
on 20 Jul 2013 at 3:55
New showstopper: uploads work inconsistently. I'm able to upload to ADN, but
not to Google Code. Google Code works fine in 22.0.
I'm getting concerned there are all kinds of bugs in here like that.
Original comment by classi...@floodgap.com
on 21 Jul 2013 at 3:12
I can't reproduce this reliably anymore. I give up. Let's see what the punters
think.
Original comment by classi...@floodgap.com
on 21 Jul 2013 at 6:12
Viewing isolated images is slow to close. I bet the background texture is the
issue. Never liked it anyway, so we could jettison that.
Original comment by classi...@floodgap.com
on 21 Jul 2013 at 8:58
We've got a serious leak somewhere. It manifests in the base browser too after
a couple days of uptime, but is nowhere near as severe, and the browser grinds
to a halt in GC. The base non-jemalloc build does this slightly, but it still
runs fine. This unfortunately *is* a show stopper.
Original comment by classi...@floodgap.com
on 26 Jul 2013 at 4:05
Testing 24 shows this no longer pays off, and the browser's memory use still
skyrockets. Dropping priority.
Original comment by classi...@floodgap.com
on 29 Oct 2013 at 4:57
Just found out something quite important about the virtual memory management function madvise that is used in memory allocators like jemalloc to indicate the system whether memory can safely be paged out or not. Since 10.3 there has been the option MADV_FREE defined in the mman.h system header which supposedly indicates the kernel that a virtual memory region is not in use any longer and hence can be paged out immediately - but it wasn't implemented in the kernel until 10.6 (see http://fxr.watson.org/fxr/trackident?i=MADV_FREE in order to verify the use of MADV_FREE in different xnu versions). Until 10.5 this option does simply nothing but return an error plus the waste of time of performing a syscall - as a consequence the kernel will have to trust it's own statistics about what memory to page out first, and building up meaningful statistics takes time - and can definitley cause a long running browser instance to crawl (see https://bugzilla.mozilla.org/show_bug.cgi?id=736074). In order to help the kernel to quickly know what to page out first there's really only MADV_DONTNEED implemented up to including 10.5 . I found this because in WebKit 601 TCMalloc was replaced by bmalloc (needing a C++11 compiler and runtime) and the browser started to hang in madvise calls quite soon after launch - I don't know whether using MADV_DONTNEED instead of MADV_FREE is the solution but at least the syscall does succeed now (feels much better that way). Since the JavaScriptCore garbage collector now bases on bmalloc as well that thing has to work really well or whole WebKit becomes pretty much unusable.
Interesting. Something to consider for the future if I resurrect this again.
While the unimplemented MADV_FREE is interesting I found that the cause for bmalloc spending much time in madvise() was in fact a non working custom mutex built around std::atomic_flag from libstdc++. I'm pretty sure that rather simple mutex class is properly implemented, so maybe std::atomic_flag is not working correctly (the mutex is needed because freeing memory respectively marking memory as MADV_DONTNEED, for example for JavaScript garbage collection, happens in a separate thread). By disabling bmalloc and using system malloc instead, the hangs are now gone. And as WebKit uses memalign() I implemented a wrapper around malloc() that returns aligned adresses (wasting space of course). Since WebKit uses a seperate function for freeing aligned memory there's no problem in storing the original adress returned by malloc() in the extra space allocated for being able to do the alignment, so it shouldn't be leak memory. But that non-working mutex seems alarming to me.
Actually neither MADV_DONTNEED nor MADV_WILLNEED are implemented; they do the same thing as MADV_NORMAL which is the default. So calling madvise is in fact just just a waste of time on OS X version earlier than 10.6 .
For 45 performance is sufficiently satisfactory that any gains from this would be outweighed by the stuff I know would break, so wontfix.
Original issue reported on code.google.com by
classi...@floodgap.com
on 13 Apr 2013 at 12:15