Closed GoogleCodeExporter closed 9 years ago
I tried to investigate some.
The call to NXHashGet is trying to access the variable "class_hash", as seen at
http://opensource.apple.com/source/objc4/objc4-493.9/runtime/objc-runtime-old.m
. This hash table is expected to contain values of type struct old_class*,
each of which corresponds to an Objective-C class. We crash because all the
structures corresponding to classes that come out of SenTestingKit are
completely zeroed out by the time we hit main. This includes a char* class
name pointer within old_class that is being passed to strcmp. It is expected
to never be NULL.
I was able to put breakpoints on all the places where class_hash is modified in
objc-runtime-old.m ObjC and verify the integrity of class_hash. I observed
that the class structures for SenTestingKit are correctly added to class_hash,
and it was always valid. However, by the time we hit main, the structure is
corrupt. So, I think it's a module initializer or something going awry? I
tried putting a watchpoint on the class name to see when it got zeroed, but
that didn't pick it up.
The source to SenTestingKit (or some version of it) is at
https://github.com/jy/SenTestingKit .
Possibly important is that SenTestProbe has a +load method (which runs before
main, and I can see that it's running before __asan_init). This method kicks
off several +initialize methods running (the objective-C runtime runs the
+initialize method for a class the first time someone tries to message the
class). Of possibly further interest, it looks like this ends up spawning a
thread before __asan_init runs. This happens because SenTestingKit indirectly
calls code in the Apple frameworks that uses dispatch_once for one-time
initialization. This (I think) initializes libdispatch
(http://libdispatch.macosforge.org/) which spawns a worker thread pool or
something like that.
It's of particular interest to run unit tests under asan, since they exercise a
pile of code. I was able to rewrite my particular pile of tests not to use
SenTestingKit, though, and address sanitizer worked great once I removed the
linkage!
Original comment by kenfe...@gmail.com
on 13 Feb 2012 at 12:10
I can reproduce this on 10.6
If something is running before __asan_init, and that something maps memory or
spawns threads, we are doomed.
When we compile +load methods we insert __asan_init in the beginning of the
method (I hate this hack), but it doesn't help here because we don't compile
this code.
The +load methods are called from this stack (at least on my Mac box):
#2 0x90e08bb4 in call_load_methods ()
#3 0x90e0892e in load_images ()
#4 0x8fe036c8 in
__dyld__ZN4dyldL12notifySingleE17dyld_image_statesPK11ImageLoader ()
#5 0x8fe0d30a in
__dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEj ()
#6 0x8fe0d3d1 in __dyld__ZN11ImageLoader15runInitializersERKNS_11LinkContextE
()
#7 0x8fe024a9 in __dyld__ZN4dyld24initializeMainExecutableEv ()
#8 0x8fe07950 in __dyld__ZN4dyld5_mainEPK12macho_headermiPPKcS5_S5_ ()
#9 0x8fe018b1 in __dyld__ZN13dyldbootstrap5startEPK12macho_headeriPPKcl ()
#10 0x8fe01057 in __dyld__dyld_start ()
Maybe we can intercept load_images somehow at link-time?
Alex?
Ken, your suggestions on how to call __asan_init before +load methods are
welcome here.
Original comment by konstant...@gmail.com
on 13 Feb 2012 at 1:58
I don't know how to get __asan_init in cleanly, but I know a few hacks one can
try.
On trying them, though, I'm now wondering if the issue is something else in
this case.
If you build a small dylib that contains just an empty c file, it looks to me
like that is enough for asan to insert a module initializer that calls
__asan_init.
If you set the environment variable
DYLD_INSERT_LIBRARIES=/path/to/libInitializeAsan.dylib, then that library will
be loaded before those specified in the program.
However, it looks to me like that actually makes it load TOO early - I crash
like so:
#0 0x9462a818 in _CFRuntimeSetInstanceTypeID ()
#1 0x94693a46 in CFAllocatorCreate ()
#2 0x00009097 in __asan::ReplaceSystemMalloc () at
/Users/glider/src/asan/asan-llvm-trunk/llvm/projects/compiler-rt/lib/asan/asan_m
alloc_mac.cc:384
Which looks to me like CoreFoundation needs to be initialized before
__asan_init runs.
Ok, so here's another shot. If one makes the same libInitializeAsan.dylib,
links it to just CoreFoundation.framework, then has the real app link
libInitializeAsan.dylib before it links anything else, then it looks to me like
__asan_init runs right after CoreFoundation's module initializers.
Linking in SenTestingKit still crashes though.. I don't know what's special
about it. I tried to build my own version of SenTestingKit from the source at
https://github.com/jy/SenTestingKit, and it doesn't cause the same problem.
Re: Getting __asan_init in there in a controlled way, I'll ask around a little
bit, and you could also ask on the llvm list. I wonder if it would be possible
to do something with a custom __start? Or is that too early to be helpful?
man 1 dyld has some good stuff, by the way. DYLD_PRINT_APIS and
DYLD_PRINT_INITIALIZERS are useful.
Original comment by kenfe...@gmail.com
on 13 Feb 2012 at 3:49
Kostya, WDYT about linker-based hacks to initialize early?
As someone had previously suggested in macdev, we can use a postbuild script
that does something to the initialization code (e.g. swaps our ctors or +load
methods with those coming from the non-instrumented code).
Another option is to use DYLD_INSERT_LIBRARIES (that's almost the same as
LD_PRELOAD on Linux)
The third one is to ship our own dyld (that's ld.so on Mac) that initializes
ASan itself. This isn't very good since we're targeting several OS versions.
Original comment by ramosian.glider@gmail.com
on 13 Feb 2012 at 9:05
Ok, I think I see the problem. It's maybe worse than than a +load.
Using otool -l as described at
https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPFramew
orks/Concepts/FrameworkBinding.html#//apple_ref/doc/uid/20002256-107122 , it
looks like the framework was linked with -seg1addr 0x20100000. So that's its
preferred load address.
Asan uses 0x20000000 - 0x23ffffff for low shadow memory.
It _looks_ like dyld maps _all_ the linked libraries before it runs _any_
module initializers or load methods. So SenTestKit is already mapped in when
__asan_init runs, and asan zeros that portion of memory for use as the shadow.
Original comment by kenfe...@gmail.com
on 13 Feb 2012 at 9:12
FTR, if the instrumented code is somehow executed before __asan_init (e.g. from
a non-instrumented +load method), this will show up immediately. There should
be no delayed effects.
Original comment by ramosian.glider@gmail.com
on 13 Feb 2012 at 9:19
Hm, I wonder if it's possible to ask dyld not to load anything in the shadow
memory (we'll also need this for Chrome+NaCl).
Original comment by ramosian.glider@gmail.com
on 13 Feb 2012 at 9:20
Yes, I think that works! dyld maps in the executable itself first. Any other
libraries will slide if their preferred addresses are already mapped.
I passed linker flags
-segaddr AsanLowShadow 0x20000000 -sectcreate AsanLowShadow placeholder /tmp/foo
Where /tmp/foo was a large file, and that did the trick! There are probably
cleaner ways to do it.
Original comment by kenfe...@gmail.com
on 13 Feb 2012 at 10:52
This also works:
asm( ".zerofill AsanShadow , shadowsec , shadowmem , 0x1fffffff");
Perhaps this can be used instead of performing the mapping in __asan_init.
Original comment by kenfe...@gmail.com
on 13 Feb 2012 at 11:44
>> asm( ".zerofill AsanShadow , shadowsec , shadowmem , 0x1fffffff");
FYI: .zerofill will be bad for performance on 64-bits.
% cat asm.c
#include <stdio.h>
asm( ".zerofill AsanShadow, shadowsec , _asan_shadow ,0x0000100000000000");
extern char asan_shadow[0x0000100000000000ULL];
int main() {
printf("Ok %p\n", asan_shadow);
}
% clang -segaddr AsanShadow 0x100000000000 asm.c && time ./a.out
Ok 0x100000000000
real 0m0.952s
user 0m0.001s
sys 0m0.948s
This is twice slower than the current-already-slow speed (issue 24)
Original comment by konstant...@gmail.com
on 24 Feb 2012 at 11:43
Is that?
$ clang -segaddr AsanShadow 0x100000000000 asm.c -g && time ./a.out
Ok 0x100000000000
real 0m0.685s
user 0m0.001s
sys 0m0.627s
(this is the same machine I used to profile issue 24)
Also some data from Shark:
================================================================================
===============
PID #37556 [37556]
% Total Symbol Library Address Length
--------------------------------------------------------------------------------
---------------
46.2% pmap_remove /mach_kernel 0x2953eb 0x201
26.4% vm_map_lookup_entry /mach_kernel 0x25ca35 0xde
9.6% pmap64_pde /mach_kernel 0x28ec0f 0x1df
9.3% pmap64_pdpt /mach_kernel 0x28ea39 0x1d6
2.9% lck_mtx_lock_spin /mach_kernel 0x29a3b0 0x100
1.7% vm_map_lookup_locked /mach_kernel 0x260066 0x1a4e
1.3% pmap_pde /mach_kernel 0x28edee 0x2b
1.0% lck_mtx_unlock_darwin10 /mach_kernel 0x29a960 0x1b0
0.4% zfree /mach_kernel 0x234386 0x276
0.4% hw_lock_unlock /mach_kernel 0x299db0 0x40
0.3% hw_lock_to /mach_kernel 0x299d30 0x80
0.1% vm_page_lookup /mach_kernel 0x279df0 0x139
0.1% vm_map_simplify_entry /mach_kernel 0x25f54b 0xa54
0.1% lck_rw_done_gen /mach_kernel 0x29fe60 0xee
0.1% lck_rw_done /mach_kernel 0x29a240 0x80
I see no performance difference here.
Original comment by ramosian.glider@gmail.com
on 27 Feb 2012 at 8:05
I would not expect to see a perf hit from zero-filling, because all pages
brought in from vm_allocate'd memory are zero filled for security reasons. If
it's _eagerly_ zero-filled, that would be a problem.
Original comment by kenfe...@gmail.com
on 28 Feb 2012 at 4:56
Indeed, I remeasured and I see that asm .zerofill is as fast (or, I should say,
as slow) as manual mmap.
Perhaps asm.zerofill + -segaddr is more bullet-proof because it happens at
link-time.
It will also allow us to get rid of instrumentation hacks for "+ methods".
WDT?
Is it portable across all variants of MacOS?
I wonder if there is anything like this for linux...
Original comment by konstant...@gmail.com
on 6 Mar 2012 at 9:45
The following seems to work.
shadow.s:
.section .asan.shadow,"aw",@nobits
.globl shadow
.type shadow, @object
.size shadow, 0x20000000
shadow:
.zero 0x20000000
1.c:
#include <stdio.h>
extern char shadow[];
int main(void) { printf("shadow at %p\n", shadow); return 0; }
gcc 1.c shadow.s -o 1 -Wl,--section-start=.asan.shadow=0x2000000
I could not make it work with a 64-bit-sized shadow (with -mcmodel=large):
Starting program: /usr/local/google/home/eugenis/test/zerofill/1
During startup program terminated with signal SIGKILL, Killed.
Also, linker (ld) appears to allocate the whole shadow in memory, even though
the final ELF and all intermediate objects are tiny. Gold crashes on an
assertion when a section is larger than a certain threshold.
What's more important, this approach does not help us intercept early memory
allocations. It also does not work for PIE (but this is not really an issue).
Original comment by euge...@google.com
on 7 Mar 2012 at 8:55
mmap(0x2000000000, 137438953472, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
So, ld.so does not add MAP_NORESERVE, and, apparently, there is no way to make
it add it, other than patching the loader itself.
Original comment by euge...@google.com
on 7 Mar 2012 at 12:38
Original comment by konstant...@gmail.com
on 22 May 2012 at 8:48
are we going to take any action on this?
If not, let's close this issue.
Original comment by konstant...@gmail.com
on 18 Feb 2013 at 8:17
Ken, can you please check whether the problem is reproduced with the trunk
Clang?
We've made major changes in ASan recently, so the behavior could've changed
significantly.
Original comment by ramosian.glider@gmail.com
on 26 Feb 2013 at 4:52
Original comment by ramosian.glider@gmail.com
on 7 May 2014 at 9:02
Adding Project:AddressSanitizer as part of GitHub migration.
Original comment by ramosian.glider@gmail.com
on 30 Jul 2015 at 9:12
Original issue reported on code.google.com by
kenfe...@gmail.com
on 12 Feb 2012 at 11:50Attachments: