ramosian-glider / sanitizers


consider using _Unwind unwinder instead of fast fp-based unwinder as an option #138

Open ramosian-glider opened 9 years ago

ramosian-glider commented 9 years ago

Originally reported on Google Code with ID 137

asan currently uses a fast frame-pointer-based unwinder on x86/x86_64 {Linux,Mac}.
There are two issues:
  1. It requires -fno-omit-frame-pointer
  2. Unwinding may fail if there is a libc function somewhere in the stack (e.g. qsort)
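
To illustrate the second issue, here is a minimal sketch of a frame-pointer walk over a simulated chain of frame records. It is not the ASan implementation: `FrameRecord`, `FastUnwind` and the bounds parameters are hypothetical names, and the layout assumes the usual x86_64 convention of a saved frame pointer followed by a return address. The walk stops as soon as a saved frame pointer looks invalid, which is one way a function built without frame pointers (such as a libc qsort) can cut the trace short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical frame record: [saved caller frame pointer][return address].
struct FrameRecord {
  const FrameRecord* caller_fp;
  uintptr_t return_pc;
};

// Follows the frame-pointer chain starting at fp and stores return addresses
// into pcs. Stops when the next frame pointer is outside [stack_lo, stack_hi),
// is misaligned, or does not move toward higher addresses.
size_t FastUnwind(const FrameRecord* fp, uintptr_t stack_lo, uintptr_t stack_hi,
                  uintptr_t* pcs, size_t max_depth) {
  assert(pcs != nullptr);
  size_t n = 0;
  uintptr_t prev = 0;
  while (n < max_depth) {
    uintptr_t addr = reinterpret_cast<uintptr_t>(fp);
    if (addr < stack_lo || addr + sizeof(FrameRecord) > stack_hi) break;
    if (addr % alignof(FrameRecord) != 0) break;
    if (addr <= prev) break;  // the stack grows down, so callers live higher
    pcs[n++] = fp->return_pc;
    prev = addr;
    fp = fp->caller_fp;
  }
  return n;
}
```

If a frame in the middle of the chain holds garbage in its saved frame pointer slot, every caller above it is lost from the trace.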

We may want to have a slow CFI-based unwinder (_Unwind*) as an option.
We need to control the unwinding separately on fatal error (in __asan_report_error)
and on malloc/free, since the latter are very performance-critical.

The default unwinder for malloc/free should remain the fast one.
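
For reference, a CFI-based unwind through the `_Unwind*` interface could look roughly like the sketch below. It uses `_Unwind_Backtrace` and `_Unwind_GetIP` from `<unwind.h>`; the names `TraceBuffer`, `CollectFrame` and `SlowUnwind` are made up for illustration and are not taken from the ASan runtime.

```cpp
#include <unwind.h>

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical output buffer handed to the unwinder callback.
struct TraceBuffer {
  uintptr_t* pcs;
  size_t size;
  size_t max_depth;
};

// Called by _Unwind_Backtrace once per frame, innermost frame first.
static _Unwind_Reason_Code CollectFrame(struct _Unwind_Context* ctx, void* arg) {
  TraceBuffer* buf = static_cast<TraceBuffer*>(arg);
  if (buf->size >= buf->max_depth) return _URC_END_OF_STACK;  // stop walking
  buf->pcs[buf->size++] = static_cast<uintptr_t>(_Unwind_GetIP(ctx));
  return _URC_NO_REASON;  // continue with the next frame
}

// Fills pcs with up to max_depth program counters and returns the count.
// The first entries belong to the unwinder's own callers, SlowUnwind included.
size_t SlowUnwind(uintptr_t* pcs, size_t max_depth) {
  assert(pcs != nullptr);
  TraceBuffer buf = {pcs, 0, max_depth};
  _Unwind_Backtrace(CollectFrame, &buf);
  return buf.size;
}
```

This does not depend on frame pointers, but every frame costs a lookup and evaluation of unwind info, which is why it is expected to be much slower than the frame-pointer walk.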

Reported by konstantin.s.serebryany on 2012-12-13 07:42:57

ramosian-glider commented 9 years ago
Issue 133 has been merged into this issue.

Reported by samsonov@google.com on 2012-12-13 07:45:01

ramosian-glider commented 9 years ago
Added asan/lit_tests/overflow-in-qsort.cc

Reported by konstantin.s.serebryany on 2012-12-13 08:06:28

ramosian-glider commented 9 years ago
Reid,
How does DrMemory handle FPO code?

Reported by timurrrr@google.com on 2012-12-13 08:46:57

ramosian-glider commented 9 years ago
http://llvm.org/viewvc/llvm-project?rev=170117&view=rev adds two flags

  // Use fast (frame-pointer-based) unwinder on fatal errors (if available).
  bool fast_unwind_on_fatal;
  // Use fast (frame-pointer-based) unwinder on malloc/free (if available).
  bool fast_unwind_on_malloc;

This feature still needs some love (more tests, testing, etc)
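
A minimal sketch of how the two flags could select an unwinder per call site. This is an assumption about the intended behavior, not the code from the revision above; `UnwindFlags`, `UnwindSite`, `UnwinderKind` and `ChooseUnwinder` are hypothetical names.

```cpp
// Hypothetical mirror of the two flags above.
struct UnwindFlags {
  bool fast_unwind_on_fatal;
  bool fast_unwind_on_malloc;
};

enum class UnwindSite { kFatalError, kMallocFree };
enum class UnwinderKind { kFast, kSlow };

// Picks the fast unwinder only when the flag for this call site asks for it
// and the platform provides one ("if available"); otherwise picks the slow one.
UnwinderKind ChooseUnwinder(const UnwindFlags& flags, UnwindSite site,
                            bool fast_available) {
  bool want_fast = (site == UnwindSite::kFatalError)
                       ? flags.fast_unwind_on_fatal
                       : flags.fast_unwind_on_malloc;
  return (want_fast && fast_available) ? UnwinderKind::kFast
                                       : UnwinderKind::kSlow;
}
```

Keeping the two call sites separate lets the error report pay for an accurate trace once, while malloc/free stay on the cheap path.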

Reported by konstantin.s.serebryany on 2012-12-13 10:07:08

ramosian-glider commented 9 years ago
There are two tests now: 
asan/lit_tests/Linux/malloc-in-qsort.cc
asan/lit_tests/Linux/overflow-in-qsort.cc
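
A rough sketch of the kind of program such a test exercises: a bug that happens inside a callback invoked by libc's qsort, so the stack at the point of the report has a libc frame in the middle. This is not the contents of the actual test files; `g_array`, `g_probe_index`, `QsortCallback` and `SortWithProbe` are illustrative names.

```cpp
#include <cstddef>
#include <cstdlib>

static int g_array[10];
// Index written by the callback; 10 would be one past the end of g_array.
static volatile int g_probe_index = 0;

// Comparator called from inside libc's qsort.
extern "C" int QsortCallback(const void* a, const void* b) {
  const char* x = static_cast<const char*>(a);
  const char* y = static_cast<const char*>(b);
  g_array[g_probe_index] = 0;  // overflows when g_probe_index == 10
  return static_cast<int>(*x) - static_cast<int>(*y);
}

// Sorts data while the callback writes to g_array[probe_index].
// Passing probe_index == 10 triggers the out-of-bounds write under ASan.
void SortWithProbe(char* data, size_t n, int probe_index) {
  g_probe_index = probe_index;
  qsort(data, n, sizeof(char), QsortCallback);
}
```

With an out-of-bounds index, the interesting question is whether the reported stack trace still shows the caller of qsort.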

StackTrace::SlowUnwindStack does the nasty business of popping a few internal frames.
Seems to work for me on x86_64 linux, but I am not sure if it works on other platforms...

Reported by konstantin.s.serebryany on 2012-12-13 12:35:26

ramosian-glider commented 9 years ago
Some performance numbers on 483.xalancbmk built w/ clang r171185 
(Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz)

clang -O2: 193 seconds
asan, no unwind: 500 seconds
asan, fast unwind, 10-30 frames: 500-530 seconds
asan, slow unwind, 10 frames: 1500 seconds
asan, slow unwind, 20 frames: 2500 seconds
asan, slow unwind, 30 frames: 3300 seconds

So, I am quite certain that fast_unwind_on_malloc should remain 1 by default. 

Note: 483.xalancbmk is the most stressful test for asan (see http://code.google.com/p/address-sanitizer/wiki/PerformanceNumbers)
and is one of two malloc-intensive tests in SPEC 2006. 

Reported by konstantin.s.serebryany on 2013-01-02 17:26:16

ramosian-glider commented 9 years ago
How much does it affect real world applications, like Chrome?

If the difference is that big, would it make sense to implement our own CFI-based unwinder,
like the one valgrind has? It does not have to be _that_ slow.

Reported by eugenis@google.com on 2013-01-02 17:55:19

ramosian-glider commented 9 years ago
The situation on Chrome is very similar to xalancbmk.
E.g. on a simple DumpRenderTree benchmark: 
no unwind (or fast unwind): 3 seconds
slow unwind 10 frames: 11 seconds
slow unwind 20 frames: 19 seconds
slow unwind 30 frames: 24 seconds

Profile: 
slow unwind:     
    22.58%         DRT1  libgcc_s.so.1               [.] uw_update_context_1
    15.98%         DRT1  libgcc_s.so.1               [.] execute_cfa_program
    12.36%         DRT1  libgcc_s.so.1               [.] _Unwind_IteratePhdrCallback
     9.18%         DRT1  libgcc_s.so.1               [.] uw_frame_state_for
     4.39%         DRT1  libpthread-2.15.so          [.] pthread_mutex_unlock
     4.24%         DRT1  libpthread-2.15.so          [.] pthread_mutex_lock

with fast unwind: 
     3.04%         DRT1  DRT1                        [.] WebCore::RenderBlock::LineBreaker::nextSegmentBreak(WebCore::BidiResolver<WebCore::Inl
     2.98%         DRT1  DRT1                        [.] WebCore::HTMLTokenizer::nextToken(WebCore::SegmentedString&, WebCore::HTMLToken&)
     2.84%         DRT1  [kernel.kallsyms]           [k] 0xffffffff8103b51a
     2.75%         DRT1  DRT1                        [.] __sanitizer::StackTrace::FastUnwindStack(unsigned long, unsigned long, unsigned long,
     2.70%         DRT1  DRT1                        [.] WebCore::BidiResolver<WebCore::InlineIterator, WebCore::BidiRun>::createBidiRunsForLin
     2.46%         DRT1  DRT1                        [.] unsigned int WebCore::WidthIterator::advanceInternal<WebCore::SurrogatePairAwareTextIt
     2.33%         DRT1  DRT1                        [.] __asan::Allocate(unsigned long, unsigned long, __sanitizer::StackTrace*, __asan::Alloc
     1.80%         DRT1  DRT1                        [.] __asan::Deallocate(unsigned char*, __sanitizer::StackTrace*, __asan::AllocType)
     1.62%         DRT1  DRT1                        [.] __sanitizer::StackTrace::CompressStack(__sanitizer::StackTrace*, unsigned int*, unsign
     1.48%         DRT1  DRT1                        [.] WebCore::RenderStyle::diff(WebCore::RenderStyle const*, unsigned int&) const
     1.38%         DRT1  libfreetype.so.6.8.0        [.] 0x1f463
     1.30%         DRT1  DRT1                        [.] __asan::AsanChunkFifoList::Pop()

>> would it make sense implementing our own CFI-based unwinder, like the one valgrind has?
If someone gives us such an unwinder (under an appropriate license) we could use it.

But I see no reason to invest our time in it. 

Reported by konstantin.s.serebryany on 2013-01-02 18:31:06

ramosian-glider commented 9 years ago
LLVM r172397 switches the default to fast_unwind_on_fatal=0. 
This only affects linux on x86/x86_64. 

Alex, please check if we can do that on Mac (in which case, move common code to _posix.cc
files, when possible)

Reported by konstantin.s.serebryany on 2013-01-14 11:05:49

ramosian-glider commented 9 years ago
Moved the common code to sanitizer_unwind_posix.cc in Clang r216877.
We can't use the slow unwinder on OSX now, because Clang produces incorrect unwind
info for the ASan runtime functions on OSX (http://llvm.org/PR20800).

Reported by ramosian.glider on 2014-09-01 12:54:46

ramosian-glider commented 9 years ago

Reported by ramosian.glider on 2015-07-30 09:05:31

ramosian-glider commented 9 years ago
Adding Project:AddressSanitizer as part of GitHub migration.

Reported by ramosian.glider on 2015-07-30 09:06:55