steleman / address-sanitizer

Automatically exported from code.google.com/p/address-sanitizer

Use Linux madvise(MADV_DONTDUMP) to exclude ASan shadow regions from core dumps #345

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
AddressSanitizer maps huge regions to support its state tracking, so core dumps 
from ASan-managed processes are very large on 32-bit and unmanageably large on 
64-bit.  ASan disables core dumps by default, which prevents these
dumps from being generated, but in some cases it would be very useful to get a
manageable dump from an ASan-enabled process.  On Linux 3.4 and later, the
system call madvise accepts the advice value MADV_DONTDUMP to exclude a region
from being written to a core file.  The attached proof of concept patch uses
this advice value to exclude the ASan shadow ranges.  A test program using the
patched libsanitizer generates core files that, although larger than those of
an ASan-free build, are quite manageable (~151M core for a trivial crash
program).  This test was done with the libsanitizer that ships with gcc-4.9,
but the patch should apply equally to clang's libsanitizer.

Original issue reported on code.google.com by google.8...@spamgourmet.com on 23 Sep 2014 at 3:00

GoogleCodeExporter commented 9 years ago
Hm, this looks much better than disabling core dumping completely.

Original comment by euge...@google.com on 23 Sep 2014 at 1:43

GoogleCodeExporter commented 9 years ago
Did you try ASAN_OPTIONS=unmap_shadow_on_exit=1?

Original comment by konstant...@gmail.com on 23 Sep 2014 at 4:30

GoogleCodeExporter commented 9 years ago
Regarding ASAN_OPTIONS=unmap_shadow_on_exit=1:

No, I did not previously try that.  It's not listed on the Wiki page of known 
flags <https://code.google.com/p/address-sanitizer/wiki/Flags> and I did not 
notice it while reading the libsanitizer source to implement the madvise patch. 
 However, now that you pointed it out, I tried it and it does not seem to do 
what I want.  I see that it is supported in both gcc-4.8 and gcc-4.9.  To test 
it, I used a program where main() calls abort(), to simulate a program which 
dies due to failing an internal consistency check, as opposed to dying because 
AddressSanitizer found a memory misuse.  I compiled it with both gcc-4.8, which 
uses the stock libsanitizer, and with gcc-4.9, which has my proof of concept 
madvise patch applied.

When using the stock libsanitizer of gcc-4.8 with 
ASAN_OPTIONS=unmap_shadow_on_exit=1,disable_core=0 ./abort48, the program goes 
to 100% CPU in system mode and ultimately generates an apparently 14T core 
(actual size per du: 3.4M).  I also tried using a colon to separate the 
options, with the same result.  The core took about 2 minutes to write, despite 
its very small actual size.

When using the locally patched libsanitizer of gcc-4.9 with 
ASAN_OPTIONS=disable_core=0 ./abort49, the program dumps an apparently 48M core 
(actual size per du: 2.4M) and exits almost instantly.
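
The gap between the apparent size and the size reported by du is what one would expect from a sparse file, where unwritten ranges take no disk blocks. The sketch below shows how the two numbers can be read for any file. The helper names make_sparse_file and file_sizes are made up for illustration, and the file it creates is an ordinary temporary file, not a core dump.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a temporary file whose apparent size is `size` bytes but which
 * holds only one written byte, at the very end. `path_template` must end
 * in "XXXXXX", as mkstemp requires. Returns 0 on success, -1 on failure. */
int make_sparse_file(char *path_template, off_t size) {
    int fd = mkstemp(path_template);
    if (fd < 0) return -1;
    int ok = size > 0 &&
             lseek(fd, size - 1, SEEK_SET) == size - 1 &&
             write(fd, "", 1) == 1; /* writes a single NUL byte */
    close(fd);
    return ok ? 0 : -1;
}

/* Report both sizes of a file: st_size is the apparent size (what ls -l
 * shows), and st_blocks * 512 is the allocated size (roughly what du
 * shows). Returns 0 on success, -1 on failure. */
int file_sizes(const char *path, long long *apparent, long long *on_disk) {
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    *apparent = (long long)st.st_size;
    *on_disk = (long long)st.st_blocks * 512;
    return 0;
}
```

On a filesystem that supports holes, calling file_sizes on a file made by make_sparse_file should report an allocated size far below the apparent size, which is the same pattern as the cores described here.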

I also tried a test program which writes to *(int*)nullptr, to trigger an 
AddressSanitizer trap.  In that case, I needed to add abort_on_error=1, 
otherwise no core file was generated after AddressSanitizer trapped the 
SIGSEGV.  This is a bit counterintuitive, since an unsanitized program would 
have dumped core on a null pointer write, but an AddressSanitizer-instrumented 
program requires both disable_core=0 *and* abort_on_error=1 to dump core on a 
null pointer write.  I expected abort_on_error=1 was only needed if I wanted to 
abort on errors found specific to AddressSanitizer (redzone, malloc/delete 
mismatch, etc.) and that regular errors were affected only by disable_core.  
Using abort_on_error=1 here also generates a core file that is recorded as an 
ABRT (gdb says "Program terminated with signal SIGABRT, Aborted."), which, while 
technically true, is misleading since the abort happened in response to a 
SIGSEGV.  The frame where the SIGSEGV happened seems to be well recorded, so 
the core is still usable for debugging.
- Using ASAN_OPTIONS=disable_core=0 ./segv48, I get an AddressSanitizer report, 
no core file, and immediate return to shell.
- Using ASAN_OPTIONS=disable_core=0:abort_on_error=1 ./segv48, I get the long 
stall and 14T core file, returning to the shell after about 2 minutes.
- Using ASAN_OPTIONS=disable_core=0:unmap_shadow_on_exit=1 ./segv48, I get no 
core file and an immediate return to shell.
- Using ASAN_OPTIONS=disable_core=0:abort_on_error=1:unmap_shadow_on_exit=1 
./segv48, I get an abort and immediate small core file.
Thus, I understand what you were hoping to see when you suggested 
unmap_shadow_on_exit=1, but it does not solve the problem fully, since it only 
works in scenarios where AddressSanitizer triggered the abort, but not in 
scenarios where the program called abort() on its own (nor in scenarios with less 
common core-generating signals, such as SIGQUIT).  Marking the tables as 
not-dumpable ensures that they are not written regardless of why the kernel 
writes a core dump.
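
For anyone who wants to reproduce the two crash scenarios, here is a rough sketch of the kind of test code described above. It is not the exact programs I ran. The function names are made up for illustration, and it assumes a plain, uninstrumented build. Each crash runs in a forked child so the parent can report which signal ended it.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Simulates a failed internal consistency check. */
void die_by_abort(void) { abort(); }

/* Simulates a plain memory bug: a write through a null pointer. The
 * pointer object is volatile so the compiler has to load it at run time
 * and cannot reason about the null value at compile time. */
void die_by_null_write(void) {
    int *volatile p = NULL;
    *p = 1;
}

/* Run `crash` in a forked child and return the signal that killed it,
 * 0 if the child exited normally, or -1 if fork or waitpid failed.
 * The child sets RLIMIT_CORE to 0 so this demo leaves no core behind. */
int terminating_signal(void (*crash)(void)) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);
        crash();
        _exit(0); /* only reached if crash() returned */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

In an uninstrumented build, terminating_signal(die_by_abort) should report SIGABRT and terminating_signal(die_by_null_write) should report SIGSEGV. With AddressSanitizer enabled, the second case is where the behavior depends on abort_on_error, as described above.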

Original comment by google.8...@spamgourmet.com on 24 Sep 2014 at 10:41

GoogleCodeExporter commented 9 years ago
Makes sense. Please send a patch to llvm-commits@ (see 
https://code.google.com/p/address-sanitizer/wiki/HowToContribute)

Note that we do not #include system headers in asan_rtl.cc.
You will need to create a separate function similar to 
FlushUnneededShadowMemory in 
lib/sanitizer_common/sanitizer_posix_libcdep.cc

The change will also need a test in test/asan/TestCases/Linux

Thanks for the detailed explanation!

Original comment by konstant...@gmail.com on 24 Sep 2014 at 10:54

GoogleCodeExporter commented 9 years ago
Can somebody please summarize whether there is something actionable for me to 
do or not (I am assigned as the owner)?

Original comment by dvyu...@google.com on 26 Sep 2014 at 2:16

GoogleCodeExporter commented 9 years ago
I am not a regular committer on llvm or gcc.  I posted the original attachment 
as a demonstration for how to implement the change, with the hope that it would 
be useful for other one-off users and as a reference for whoever picks up the 
change to merge it into libsanitizer.  I do not expect to have enough free time 
soon to complete all the steps required to merge the change.

Original comment by google.8...@spamgourmet.com on 27 Sep 2014 at 12:42

GoogleCodeExporter commented 9 years ago
I was wondering if we could do this for both asan and tsan, so I made this 
patch. It would be great if someone could comment on it.

Original comment by nischay...@gmail.com on 3 Dec 2014 at 10:12

GoogleCodeExporter commented 9 years ago
Sure, the approach makes sense. 
You are more than welcome to contribute the full patch:
  - add a run-time flag use_madv_dontdump, on by default
  - add a test which tests use_madv_dontdump=1, use_madv_dontdump=0, and default

Original comment by konstant...@gmail.com on 3 Dec 2014 at 5:12

GoogleCodeExporter commented 9 years ago
I've filed a patch for this at http://reviews.llvm.org/D7294

Original comment by tetra20...@gmail.com on 30 Jan 2015 at 2:59