<historical note>
In the early versions of asan we called __asan_register_global on every
instrumented global separately, thus there was no need to put all the globals
into an array, and thus this problem did not exist. We were forced to replace N
calls to __asan_register_global(g) with a single call to
__asan_register_globals(g, N) for compile-time and run-time performance reasons.
</historical note>
Original comment by konstant...@gmail.com
on 4 Feb 2014 at 11:26
For the proper test we also need to use -fdata-sections, otherwise
the following test will not link even w/o asan:
% cat sec.cc
int undefined();
int defined() { return 1; }
void *AAA = (void*)&defined;
void *BBB = (void*)&undefined;
int main() {
  return AAA != 0;
}
% clang++ sec.cc -Wl,--gc-sections -ffunction-sections
/tmp/sec-e95f7f.o:(.data+0x8): undefined reference to `undefined()'
clang-3.5: error: linker command failed with exit code 1 (use -v to see
invocation)
% clang++ sec.cc -Wl,--gc-sections -ffunction-sections -fdata-sections
%
Original comment by konstant...@gmail.com
on 4 Feb 2014 at 11:38
Looks like the linkers on Linux and OSX are clever enough to emit symbols
marking the start and end of a given section (see the attached example). We can
make the compiler put the per-variable global descriptors into a special data
section and iterate over it using these two symbols. This should allow the
linker to discard the unused globals, since they won't be transitively
referenced by the global constructors array.
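A minimal sketch of the section-iteration idea, assuming an ELF target and a GNU-compatible linker (GNU ld, gold and lld define __start_<name> and __stop_<name> for a section whose name is a valid C identifier). GlobalDesc, the demo_globals section name, the DEMO_DESCRIBE macro and the demo_* globals are hypothetical names chosen for illustration, not the ones asan uses. The descriptors are given an explicit alignment so the compiler does not pad between them.

```cpp
#include <cstddef>

// Hypothetical per-global descriptor (the real asan descriptor differs).
struct GlobalDesc {
  const void *addr;  // address of the described global
  std::size_t size;  // its size in bytes
  const char *name;  // its source-level name
};

// Defined by the linker for the section named "demo_globals" (ELF only).
extern "C" GlobalDesc __start_demo_globals[];
extern "C" GlobalDesc __stop_demo_globals[];

// Emit one descriptor for `var` into the demo_globals section.
// `used` keeps the compiler from dropping the otherwise unreferenced
// descriptor; `aligned` keeps the descriptors densely packed.
#define DEMO_DESCRIBE(var)                                        \
  static GlobalDesc demo_desc_##var                               \
      __attribute__((section("demo_globals"), used,               \
                     aligned(alignof(GlobalDesc)))) = {           \
          &var, sizeof(var), #var}

int demo_counter = 0;
char demo_buffer[64];
DEMO_DESCRIBE(demo_counter);
DEMO_DESCRIBE(demo_buffer);

// Walk the section between the two linker-provided symbols.
std::size_t count_described_globals() {
  std::size_t n = 0;
  for (const GlobalDesc *d = __start_demo_globals;
       d != __stop_demo_globals; ++d)
    ++n;
  return n;
}

// Sum the sizes recorded in all descriptors.
std::size_t total_described_bytes() {
  std::size_t bytes = 0;
  for (const GlobalDesc *d = __start_demo_globals;
       d != __stop_demo_globals; ++d)
    bytes += d->size;
  return bytes;
}
```

No array of descriptors is built by hand here: the runtime only needs the two boundary symbols to find every descriptor that survived the link.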
Original comment by ramosian.glider@gmail.com
on 4 Feb 2014 at 1:52
Attachments:
Original comment by ramosian.glider@gmail.com
on 19 Jun 2014 at 11:12
Progress report.
I have an almost working implementation of globals instrumentation on Linux.
The main problem with the approach suggested above (keeping the descriptors in
a single data section) is that it still doesn't work with --gc-sections: that
flag removes only whole dead sections and can't carve a single descriptor
pointing to a dead global out of a live section. E.g. for the example given in
#2, the data section .data.BBB won't be removed, because it's referenced by the
live section containing all the global descriptors.
To deal with this we need to make the following changes:
1. Emit the descriptor for each global foo into its own _asan_globals.foo
section.
2. Put a pointer to that descriptor at the end of the global's redzone.
3. Link with a linker script that merges all the _asan_globals.* sections into
a single _asan_globals one.
The second step is required because otherwise nothing would reference the
per-global descriptor sections, and the linker would garbage-collect the
descriptors of all globals. The drawback of this approach is that it'll move
all zero-initialized globals from .bss to .data (each one now carries a
non-zero pointer), where they'll occupy actual disk space.
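A rough sketch of what steps 1 and 2 could look like for a global `int foo;`, written by hand rather than emitted by the compiler. The struct layout, kRedzoneSize, GlobalDesc and foo_desc are assumptions made for illustration; only the _asan_globals.foo section name comes from the list above. Step 3 (the linker script) is not shown.

```cpp
#include <cstddef>

// Hypothetical descriptor (the real asan descriptor differs).
struct GlobalDesc {
  const void *addr;  // address of the described global
  std::size_t size;  // size of the original (uninstrumented) global
};

// Hypothetical redzone budget, including the trailing descriptor pointer.
constexpr std::size_t kRedzoneSize = 32;

// Instrumented form of `int foo;`: the value followed by its redzone.
struct InstrumentedFoo {
  int value;                                          // the original global
  char padding[kRedzoneSize - sizeof(GlobalDesc *)];  // poisoned at run time
  GlobalDesc *desc;                                   // step 2: redzone tail
};

// Step 2 requires the pointer to be the very last bytes of the object.
static_assert(offsetof(InstrumentedFoo, desc) + sizeof(GlobalDesc *) ==
                  sizeof(InstrumentedFoo),
              "descriptor pointer must end the redzone");

extern InstrumentedFoo foo;

// Step 1: the descriptor lives in its own per-global section.
GlobalDesc foo_desc __attribute__((section("_asan_globals.foo"))) = {
    &foo.value, sizeof(int)};

// The global references its descriptor, so the descriptor's section is
// live exactly when the global's section is live.
InstrumentedFoo foo = {0, {}, &foo_desc};

// Check that the global and its descriptor point at each other.
bool descriptor_is_linked() {
  return foo.desc == &foo_desc && foo_desc.addr == &foo.value;
}
```

The initializer of foo also illustrates the drawback: the value itself is zero, but the descriptor pointer is not, so the object can no longer be placed in .bss.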
Original comment by ramosian.glider@gmail.com
on 23 Jun 2014 at 1:52
Yesterday we discussed the possibility of making a weak reference between the
global descriptor and the global, some analog of a weak symbol (if the global
is deleted, the pointer becomes 0). However, this is impossible if the global
and the descriptor are in the same object module (we can't change the global's
linkage to extern_weak).
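For reference, a small sketch of the weak-symbol behaviour this would be an analog of, assuming GCC or Clang on an ELF target. demo_maybe_stripped_global is a hypothetical symbol that is deliberately never defined, so its address resolves to null at link time.

```cpp
// Hypothetical symbol: declared weak and never defined in any object file.
// An unresolved weak reference resolves to address 0 instead of causing an
// "undefined reference" link error.
extern "C" int demo_maybe_stripped_global __attribute__((weak));

// True only if some object file in the link actually defines the symbol.
bool demo_global_is_present() {
  return &demo_maybe_stripped_global != nullptr;
}
```

This works because the reference is to a symbol defined (or not) in another object file. A descriptor emitted next to the global's definition in the same module has no such option, which is the limitation described above.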
Original comment by ramosian.glider@gmail.com
on 24 Jun 2014 at 12:46
Another idea suggested by Evgeniy to avoid bloating the zero-initialized
globals:
1. For each global we create its descriptor referencing that global.
2. For each global in the .data section:
a) put a pointer to that global's descriptor into its redzone;
b) for each zero-initialized global from the .bss section of the same module
referenced by this global, add a pointer to that global's descriptor to the
parent global's redzone.
3. For each function referencing a global, add a reference to that global's
descriptor to that function.
The only problem here is that we can't easily reference anything from a
function.
Original comment by ramosian.glider@gmail.com
on 24 Jun 2014 at 1:08
Yet another idea from Dima Polukhin: make weak references from the descriptor
array to the instrumented globals, so that when a global is dead-stripped the
descriptor is retained.
Not sure if this is supported on Linux (maybe) and OSX (probably not).
Original comment by ramosian.glider@gmail.com
on 25 Aug 2014 at 3:44
Original issue reported on code.google.com by
ramosian.glider@gmail.com
on 31 Jan 2014 at 2:54