Closed GoogleCodeExporter closed 9 years ago
Wow, that's an interesting dissassembly output. I'm surprised
attachMethodLists is calling malloc_size().
The source for attachMethodLists seems to be available:
http://www.opensource.apple.com/source/objc4/objc4-493.9/runtime/objc-runtime-new.mm?txt
I don't see any call to realloc in there, much less malloc_size(). Can you
point out the source code that's being executed here? I'd like to understand
better what that code is doing.
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 1:11
Phew--thought I was going crazy there for a second! You linked to the latest
version that ships with 10.7.2. Here's the version for 10.6.8:
http://www.opensource.apple.com/source/objc4/objc4-437.3/runtime/objc-runtime-ne
w.m
And the happy block of code that's doing all the evil stuff:
static void
attachMethodLists(class_t *cls, method_list_t **lists, int count,
BOOL methodsFromBundle, BOOL *outVtablesAffected)
{
rwlock_assert_writing(&runtimeLock);
BOOL vtablesAffected = NO;
size_t listsSize = count * sizeof(*lists);
// Create or extend method list array
// Leave `count` empty slots at the start of the array to be filled below.
if (!cls->data->methods) {
// no bonus method lists yet
cls->data->methods = _calloc_internal(1 + count, sizeof(*lists));
} else {
size_t oldSize = malloc_size(cls->data->methods);
cls->data->methods =
_realloc_internal(cls->data->methods, oldSize + listsSize);
memmove(cls->data->methods + count, cls->data->methods, oldSize);
}
Original comment by mill...@gmail.com
on 28 Oct 2011 at 1:33
Does this mean we can solve the problem by just telling people to upgrade? :-)
Yeah, this is just buggy code. I'm glad the updated objc fixes it.
The question is what to do for this old code. I don't want to revert the
realloc behavior, because it really helps programs that realloc a lot, like
python. (They're arguably still poorly written, but not as much as this.) I
don't want to have malloc_size() lie either -- that's just punishing apps that
do the right thing.
I guess we'll be stuck with 10.6.x (and below) for a while, so it's worth
coming up with a fix for this. I don't want to punish python users using
tcmalloc on windows, though, so I hate to change the realloc behavior, even
with a typedef. I guess we could lie about malloc_size() only for 10.6.x and
below, though that makes me very sad. Any other ideas?
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 1:54
It's a tough question, I think. I'm not sure the solution to Issue 52 was
necessarily ideal because it seems like an optimization for some poorly written
code in the Python implementation. One thing might be to take a cue from
Apple's szone. If szone_realloc has to actually allocate a new block, it tries
a deferred copy-on-read using vm_copy. You'd have to take out the aggressive
growth on realloc for that to be a fix for both issues, which would probably
cause Python to slow down. It might be slightly better than just doing the
memcpy though.
You could put a note in the docs and add a configure option. But that just
pushes the burden onto the user in a bad way.
The compromise solution would be to allow it to grow aggressively at first but
put a cap on it so it doesn't get unreasonable hugely. But there might be
legitimate use cases for very large sizes and I don't know what Python's
realloc strategy is and how big those objects are getting.
I think I'm leaning towards lying in malloc_size if running on any system
before 10.7. I have trouble thinking of places where it would cause actual
problems. You could imagine someone implementing a custom realloc-type
function as I mentioned that checks malloc_size to see if there's enough space.
For that use-case, there will be a performance hit because they'll be lied to
about how much memory they actually had and have to malloc and free
"erroneously". But that seems like an artificial example from my perspective.
Normally, those unassigned region at the end is pretty small, right? In the
case of tcmalloc's realloc, then it can get significant. But if they're using
realloc, why are they asking for malloc_size? Besides profiling and debugging,
what are they actually going to do with that information if not realloc? I
have trouble thinking up an answer to that... maybe you have some other
hypothetical situations in which this would cause problems. I agree it's
inelegant and incorrect, but it seems fairly harmless.
For good measure, you could make sure that the tcmalloc versions of the call
return the correct result, and just lie for the malloc_zone size function.
Original comment by mill...@gmail.com
on 28 Oct 2011 at 2:42
OK, let's go with the malloc_size hack. I agree that we should only adjust the
malloc-zone malloc_size, and not tc_malloc_size.
Would you like to draw up a patch?
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 6:20
Yeah, I'm working on it right now. As a heads up, I naively assumed there was
already mechanism in place to get the requested size from a malloc'ed piece of
memory. I can't seem to find any such functionality. I'm going to maintain
that information by sticking a size_t at the back of the allocated block. This
change might cause some blocks to allocate slightly larger than before, since
we need to have space for that size_t at the end. Is that okay? Can you think
of a better way to do that? This also means the code leaks out of
libc_override_osx.h a bit into tcmalloc.cc:GetSizeWithCallback, which has to
report (allocatedSize - sizeof(size_t)) so people don't inadvertently think
they can write over that space.
Additionally, it seems this bug with the Objective-C runtime is even more
confined--it's only on 64-bit applications under 10.5 and 10.6. 10.5 and 10.6
only use the modern runtime (objc-runtime-new.m) for 64-bit applications, and
use the legacy runtime (objc-runtime-old.m) in 32-bit. See
http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/ObjCRunti
meGuide/Articles/ocrtVersionsPlatforms.html for more information.
Original comment by mill...@gmail.com
on 28 Oct 2011 at 5:38
No, I afraid it won't work to store the requested size in the allocated block.
To work around a bug in a client library like this, we'd want to make a really
localized change that doesn't not affect performance or behavior of anyone else.
} Additionally, it seems this bug with the Objective-C runtime is even more
confined
I'm wondering if we should just document it and leave it be, then.
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 6:59
That might be best, though I must say that I would vote for the default
behavior on the affected platforms to be that realloc doesn't grow
aggressively. szone_realloc does nothing particularly fancy besides that
vm_copy so I doubt you'd see a speed penalty in Python. Has anyone measured
the performance relative to szone on OS X? It seemed like Issue 52 was a Linux
thing.
The downside of prioritizing the Issue 52 fix first in this case is that anyone
hoping to use tcmalloc in a Cocoa application would have to modify it before
it's usable on the problem platforms.
Original comment by mill...@gmail.com
on 28 Oct 2011 at 7:30
Do you want to test it? You could try running the python script in issue 52,
with a DYLD_PRELOAD (or however one does it on os x) of tcmalloc, and compare
the speed with the realloc the way it is now, and back to a 'non-agressive'
version. If the non-agressive version works fine on os x, we could maybe put
an #ifdef in there.
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 10:02
I guess the older, broken code is getting older all the time. I'll leave this
bug open but lower its priority. I still don't know what the best fix is, but
I'm handing over maintainership of perftools, so I'm happy to say it is not
Someone Else's Problem. :-)
Original comment by csilv...@gmail.com
on 25 Jan 2012 at 11:45
Just checking if there has been any further progress here. I am actively doing
patch work for the next few days and would like to clear up as many issues as
possible.
Original comment by chapp...@gmail.com
on 21 Apr 2012 at 6:16
Any update here?
Original comment by chapp...@gmail.com
on 11 Mar 2013 at 2:26
Sorry for not responding earlier--I must have missed the email from last year.
We ended up not using tcmalloc in our project so I completely forgot about it.
While we were using it, I just had a workaround to ease up on the reallocation
strategy and that solved the problem. I never got around to testing it against
Python, so I'm not sure it's fit for inclusion.
Original comment by mill...@gmail.com
on 11 Mar 2013 at 12:28
Closing as too old and irrelevant.
Original comment by alkondratenko
on 30 Aug 2013 at 12:04
Original issue reported on code.google.com by
mill...@gmail.com
on 28 Oct 2011 at 1:00