Open GoogleCodeExporter opened 9 years ago
The reason we have MaybeReexec() on Mac is that even when the
instrumented main executable depends on the shared ASan runtime library, we
have to preload that runtime library into the executable in order for the
interceptors to work correctly (this is Mac-specific). We try our best to do
the re-execution very early on Mac, before the program starts doing anything.
If only a single shared library in the testing environment is instrumented (and
depends on the ASan runtime library), and the main executable is not,
__asan_init() is going to be called only at the moment that library is loaded
(including the case when we use dlopen() to load it). At that moment the main
executable might have done a fair amount of work which we can't simply replay
upon reexec().
I believe that, unless the main executable depends on the ASan shared runtime,
users must explicitly preload the runtime in order to test any pieces of
code that might initialize late.
Original comment by ramosian.glider@gmail.com
on 27 Oct 2014 at 7:10
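For readers following the thread, here is a minimal sketch in C of the kind of check that decides whether a re-exec is needed at all. It assumes a Linux-style LD_PRELOAD value (elements separated by ':' or ' '). The function names and the library name used with it are hypothetical and are not taken from the ASan runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `entry` is a complete element of `list`, where elements are
 * separated by ':' or ' ' (the separators LD_PRELOAD accepts on Linux).
 * A NULL or empty list contains nothing. */
int preload_list_contains(const char *list, const char *entry) {
    assert(entry != NULL && entry[0] != '\0');
    if (list == NULL) return 0;
    size_t entry_len = strlen(entry);
    const char *p = list;
    while (*p != '\0') {
        size_t len = strcspn(p, ": ");      /* length of the current element */
        if (len == entry_len && strncmp(p, entry, len) == 0) return 1;
        p += len;
        if (*p != '\0') p++;                /* skip one separator */
    }
    return 0;
}

/* Hypothetical decision helper: a re-exec is needed only when the runtime is
 * not already in the preload list. After a re-exec with the runtime added,
 * this returns 0, so the process does not re-exec in a loop. */
int needs_reexec(const char *preload_value, const char *runtime_name) {
    return !preload_list_contains(preload_value, runtime_name);
}
```

A caller would pass the current value of the preload variable (for example the result of getenv("LD_PRELOAD")) together with the name under which the runtime is preloaded. Matching whole elements rather than substrings keeps a differently versioned library name from being mistaken for the runtime.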
> If only a single shared library in the testing environment is instrumented
> (and depends on the ASan runtime library), and the main executable is not,
> __asan_init() is going to be called only at the moment that library is loaded
> (including the case when we use dlopen() to load it).
Right, explicitly banning the dlopen case would be nice but I'm not sure how to
achieve this.
> At that moment the main
> executable might have done a fair amount of work which we can't simply replay
> upon reexec().
If the main executable depends on the library (which is really the case we are
interested in), then in the worst case some library initializers might have
been executed.
> I believe that unless the main executable depends on the ASan shared runtime
> the users must explicitly preload the runtime in order to test any pieces of
> code that might initialize late.
This may be hard to do on some systems. Finding the exact places where the
executables that depend on a given library are launched in a large autobuilt
distribution is challenging.
Original comment by tetra2...@gmail.com
on 27 Oct 2014 at 9:44
> Right, explicitly banning the dlopen case would be nice but I'm not sure how
> to achieve this.
Why ban this case? Doesn't it work with LD_PRELOAD?
> If main executable depends on the library (which is really the case we are
> interested in) then worst-case some library initializers might have been
> executed.
Can you please remind me why GCC doesn't use the static runtime library?
Original comment by ramosian.glider@gmail.com
on 27 Oct 2014 at 9:50
> Why ban this case? Doesn't it work with LD_PRELOAD?
No, dlopen may be called in the middle of a running program, after some files
have already been written, so re-execution would change the semantics
unpredictably.
> Can you please remind why GCC doesn't use the static runtime library?
Well, both GCC and Clang support both static and dynamic runtimes; it's just
that the default choice in GCC is different (for historical reasons). One good
thing about AsanDSO is that it allows running sanitized .so files with
unsanitized executables.
Original comment by tetra2...@gmail.com
on 27 Oct 2014 at 9:55
> No, dlopen may be executed in the middle of a working program when some files
> already got written so reexecution would change the semantics unpredictably.
I mean, in the current setup preloading the library lets you test both
instrumented executables, and instrumented libraries with uninstrumented
executables. Re-exec works only for the former case, but that does not mean we
should ban the latter one just to make re-exec work (if I'm understanding
correctly what you want).
Original comment by ramosian.glider@gmail.com
on 27 Oct 2014 at 10:16
> I mean, in the current setup preloading the library
> lets you test both instrumented executables,
> and instrumented libraries with uninstrumented executables.
> Re-exec works only for the former case,
> but that does not mean we should ban the latter one
> just to make re-exec work (if I'm understanding correctly what you want).
Ah, sure, manual LD_PRELOAD would work in this case. I just meant that
ASAN_OPTIONS=maybe_reexec=1 wouldn't.
Original comment by tetra2...@gmail.com
on 27 Oct 2014 at 10:18
Here's a link to the original discussion of reexec porting:
https://groups.google.com/forum/#!searchin/address-sanitizer/reexec/address-sanitizer/Xav2pArPJ3E/tXZRsX6S7LoJ
Original comment by tetra20...@gmail.com
on 28 Oct 2014 at 8:29
I hate the idea of reexec (even though we have it for other use cases).
This is too fragile and too complex.
Maybe you can get away with manual LD_PRELOAD and un-setting LD_PRELOAD for
children?
Original comment by konstant...@gmail.com
on 31 Oct 2014 at 10:36
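The "un-set LD_PRELOAD for children" idea could be sketched roughly as follows, assuming the parent wants to drop only the sanitizer runtime from the list and keep any other preloaded libraries. The helper name and its interface are hypothetical. It is an illustration of the string handling involved, not code from any ASan runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `list` into `out`, dropping every element equal to `entry`.
 * Input elements may be separated by ':' or ' '; output uses ':'.
 * Returns 0 on success, -1 if `out` is too small for the result. */
int strip_preload_entry(const char *list, const char *entry,
                        char *out, size_t out_size) {
    assert(list != NULL && entry != NULL && out != NULL);
    if (out_size == 0) return -1;
    size_t entry_len = strlen(entry);
    size_t used = 0;
    out[0] = '\0';
    const char *p = list;
    while (*p != '\0') {
        size_t len = strcspn(p, ": ");      /* length of the current element */
        int keep = len > 0 &&
                   !(len == entry_len && strncmp(p, entry, len) == 0);
        if (keep) {
            size_t need = len + (used > 0 ? 1 : 0);   /* element + separator */
            if (used + need + 1 > out_size) return -1; /* +1 for the NUL */
            if (used > 0) out[used++] = ':';
            memcpy(out + used, p, len);
            used += len;
            out[used] = '\0';
        }
        p += len;
        if (*p != '\0') p++;                /* skip one separator */
    }
    return 0;
}
```

Before spawning a child, the parent would write the result back with setenv(), or call unsetenv() when the result is empty. As noted in the reply that follows, this still requires changing whatever code launches the children.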
> I hate the idea of reexec (even though we have it for other use cases).
> This is too fragile and too complex.
It surely is. On the other hand, it improves usability in a very common
situation (instrumenting parts of a large distribution) and we already have it
on other platforms.
> Maybe you can get away with manual LD_PRELOAD and un-setting LD_PRELOAD for
> children?
In many cases that would mean modifying source code to set/unset LD_PRELOAD
which would be a big blocker.
Original comment by tetra2...@gmail.com
on 1 Nov 2014 at 5:21
Maybe we should try the same hack as we do on Android?
Everything is running with asan, but in deactivated mode.
As soon as some module calls __asan_init we activate asan.
Original comment by konstant...@gmail.com
on 5 Nov 2014 at 1:20
> Maybe we should try the same hack as we do on Android?
> Everything is running with asan, but in deactivated mode.
> As soon as some module calls __asan_init we activate asan.
Konstantin,
If I understood correctly, in your proposal ASan will be activated for any
module as soon as an allocation is made or an intercepted function is called,
which in practice means very soon for all active processes. That's not what we
want. The intention was to minimize overhead by preloading the ASan runtime
only for the executables that need it.
Can you specify what exactly you don't like about the reexec approach? It's
fair that in the dlopen case we can't rely on it, so we probably need to handle
that case separately. But in the case of run-time init the executable has not
started yet, so reexec shouldn't be an issue.
Original comment by mguse...@gmail.com
on 5 Nov 2014 at 12:59
> It's fair that in dl_open case we can't rely on it.
> So probably we need to handle such case separately.
We could unwind the stack and check for dlopen. Or just intercept dlopen.
Original comment by tetra20...@gmail.com
on 5 Nov 2014 at 1:06
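The "intercept dlopen" option might look roughly like the sketch below. It assumes the interceptor can wrap the real loader call and bump a depth counter around it, so that runtime init can tell whether it is running inside a dlopen. All names are hypothetical. The loader is passed in as a function pointer so the sketch does not depend on the real dlopen, and the counter is a plain global where a real runtime would need per-thread state.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by the real loader and any stand-in used for testing. */
typedef void *(*loader_fn)(const char *path, int flags);

/* Nesting depth of loads currently in progress (single-threaded sketch). */
int g_load_depth = 0;

/* Called from runtime init: re-exec is considered only outside a load. */
int reexec_allowed(void) {
    return g_load_depth == 0;
}

/* Hypothetical wrapper an interceptor would run in place of dlopen(). */
void *guarded_load(loader_fn real_loader, const char *path, int flags) {
    assert(real_loader != NULL);
    g_load_depth++;
    void *handle = real_loader(path, flags);  /* module initializers run here */
    g_load_depth--;
    return handle;
}

/* Stand-in loader for demonstration: records what an initializer running
 * during the load would observe. */
int g_seen_during_load = -1;
void *fake_loader(const char *path, int flags) {
    (void)path;
    (void)flags;
    g_seen_during_load = reexec_allowed();
    return NULL;
}
```

With this shape, an initializer triggered from inside the wrapped load sees reexec_allowed() return 0, while one triggered at program startup sees 1. The wrapper itself only works if it is already loaded before the program calls dlopen.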
mguseva2: please see how ASAN_OPTIONS=start_deactivated=1 works
(e.g. in test/asan/TestCases/Posix/start-deactivated.cc)
asan will get activated only once an instrumented module is loaded,
i.e. a binary that does not have asan instrumentation will not activate asan.
This *may* be the solution you need for your use case.
Original comment by konstant...@gmail.com
on 5 Nov 2014 at 6:58
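As a rough illustration of the start-deactivated idea described above, here is a toy model in C. It is not ASan's implementation: the struct, the function names, and the counter are all hypothetical, and it only models the control flow (preloaded but idle until an instrumented module announces itself).

```c
#include <assert.h>
#include <stddef.h>

/* Toy runtime state; not ASan's real data structures. */
struct toy_runtime {
    int loaded;                   /* runtime has been preloaded */
    int activated;                /* checking has been switched on */
    unsigned long checked_calls;  /* intercepted calls that did real work */
};

/* Preload-time init: set up internal state but stay deactivated. */
void toy_preload_init(struct toy_runtime *rt) {
    assert(rt != NULL);
    rt->loaded = 1;
    rt->activated = 0;
    rt->checked_calls = 0;
}

/* Stand-in for the init call an instrumented module makes when loaded. */
void toy_module_init(struct toy_runtime *rt) {
    assert(rt != NULL && rt->loaded);
    rt->activated = 1;            /* idempotent: later modules change nothing */
}

/* Stand-in for an interceptor: does real work only after activation. */
void toy_intercepted_call(struct toy_runtime *rt) {
    assert(rt != NULL && rt->loaded);
    if (!rt->activated) return;   /* cheap pass-through while deactivated */
    rt->checked_calls++;
}
```

In this model a process that never loads an instrumented module never calls toy_module_init(), so every intercepted call takes the early-return path. That corresponds to the low-overhead case the thread is after.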
Thank you, Konstantin, I see. Currently on Linux libasan.so calls
__asan_init itself, but that seems to be redundant, and it should be fine to
change it to an internal init without activation. So we can try the
deactivated approach. We still need to check the overhead it will introduce
because of interceptors and heap redzones.
Original comment by mguse...@gmail.com
on 7 Nov 2014 at 12:18
Redzone size is zero prior to activation.
Interceptors are supposed to have very low overhead (and they should not do any
poisoning/unpoisoning while deactivated).
Original comment by euge...@google.com
on 7 Nov 2014 at 12:21
After some modification the start_deactivated flag works in our case. I've
submitted the changes I applied for review: http://reviews.llvm.org/D6265
Regarding redzones: as far as I can see in asan_rtl.cc and asan_activation.cc,
the redzone and max_redzone values are set to 16 in deactivated mode. Does that
mean there are still small redzones allocated for heap memory?
I still wonder about reexec. In the current design the MaybeReexec part of the
ASan runtime is not Mac-specific, but it is implemented only for Mac. I think
reexec may be a useful feature on Linux as well if we fix the dlopen case. What
do you think?
Original comment by mguse...@gmail.com
on 14 Nov 2014 at 10:56
As for the redzones, I think 16 bytes is the minimum, as they are used to store
some meta information about the allocation.
Original comment by euge...@google.com
on 14 Nov 2014 at 1:35
Ping.
> I still wonder about Reexec. In current design MaybeReexec part of Asan
> runtime is not Mac-specific. But it is implemented only for Mac. I think ReExec
> maybe useful feature on Linux as well if we fix dlopen case. What do you think?
At the very least, it's strange that MaybeReexec is designed as common code
while it's really used only on Mac.
Original comment by mguse...@gmail.com
on 4 Dec 2014 at 9:05
Original issue reported on code.google.com by
mguse...@gmail.com
on 6 Aug 2014 at 11:34