Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

[ASan] Turn use-after-return checking on by default #17063

Open Quuxplusone opened 11 years ago

Quuxplusone commented 11 years ago
Bugzilla Link PR17064
Status CONFIRMED
Importance P enhancement
Reported by Chandler Carruth (chandlerc@gmail.com)
Reported on 2013-09-02 17:04:06 -0700
Last modified on 2018-10-19 06:00:22 -0700
Version unspecified
Hardware PC All
CC kcc@google.com, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, samsonov@google.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also PR32800, PR39355

Filing this as a tracking bug to track progress on turning ASan's use-after-return checking on by default.

The reason I want this on by default is that there is very little to indicate to a user that a bug in their code is related to use-after-return, and so it seems especially valuable to eagerly diagnose it. Also, it seems likely that by applying LLVM's capture analysis to the instrumentation the overhead associated with this mode could be significantly reduced.

Quuxplusone commented 11 years ago

I'd love this to happen, but I am not sure if this will ever be possible due to large and hard-to-predict extra overhead, both in CPU and RAM. Here is a rough description of how the UAR detection works: https://code.google.com/p/address-sanitizer/wiki/UseAfterReturn

What's "capture analysis"?

In most cases, if the stack frame (local variable) survives after all LLVM optimizations, it means that some of the variables potentially escape.

The exception is code like this: void foo(...) { // arr provably does not escape foo() because its address is never taken int arr[100];
... arr[i] ... } If we could use LLVM analysis to prove that arr does not escape, this indeed would help.

Another problem is compatibility. The UAR mode is less compatible with an arbitrary code than the default ASAN. For example, instrumenting V8 makes it reports false positives on Chromium (the V8 code uses some assumptions about the stack layout).

Quuxplusone commented 11 years ago

(In reply to comment #1)

I'd love this to happen, but I am not sure if this will ever be possible due to large and hard-to-predict extra overhead, both in CPU and RAM. Here is a rough description of how the UAR detection works: https://code.google.com/p/address-sanitizer/wiki/UseAfterReturn

What's "capture analysis"?

It's a subset of escape. Essentially, it is escape which outlives the function as opposed to escape which merely hands the address to unknown other code.

Nick is the expert here, but I'm also happy to explain in more detail... IRC might work better than a bug though.

The nice thing is that LLVM will aggressively track capture working bottom up across the call graph. So it should be possible to distinguish between a stack variable which is passed down to other functions and one which might have its address outlive the function body.

See include/llvm/Analysis/CaptureTracking.h for the interface to leverage this analysis in LLVM.

There also might be other ways to improve performance of this, for example by using something similar to a cactus stack rather than just heap allocations. Anyways, I would want to understand exactly what the performance impact was and where it came from. That was the point of filing a bug to track it.

If we could use LLVM analysis to prove that arr does not escape, this indeed would help.

Exactly. This is what 'nocapture' can help with. Try out the analysis pass I mentioned maybe? Should be easy to prototype, although it may need tuning to avoid compile time problems.

Another problem is compatibility. The UAR mode is less compatible with an arbitrary code than the default ASAN. For example, instrumenting V8 makes it reports false positives on Chromium (the V8 code uses some assumptions about the stack layout).

I'm pretty happy to start breaking code that makes assumptions about the stack layout. This needs to happen anyways IMO, or at the very least such code needs to explicitly disclaim that it is assuming things about the layout of the stack.

Quuxplusone commented 11 years ago
This is the current performance data: asan vs asan+UAR at -O2
483.xalancbmk will need a re-run (UAR failed due to too deep recursion)

458.sjeng is known to have a large stack array in the hotspot.
UAR mode has to poison the entire thing, which is costly.

Look at e.g. this function:
int search (int alpha, int beta, int depth, int is_null) {
  move_s moves[MOVE_BUFF];  // MOVE_BUFF == 512
  int num_moves, i, j;
  int score = -INF, move_ordering[MOVE_BUFF], see_values[MOVE_BUFF];

This function is recursive, has 500 LOCs and lots of calls.
I don't think it is possible to optimize UAR detection away w/o
whole program mode.

I haven't analyzed other cases yet.

       400.perlbench,      1328.00,      2087.00,         1.57  <<<<<<<<<
           401.bzip2,       847.00,       856.00,         1.01
             403.gcc,       618.00,       666.00,         1.08
             429.mcf,       581.00,       579.00,         1.00
           445.gobmk,       867.00,      1404.00,         1.62  <<<<<<<<<
           456.hmmer,       912.00,       894.00,         0.98
           458.sjeng,      1016.00,      1365.00,         1.34  <<<<<<<<<
      462.libquantum,       535.00,       545.00,         1.02
         464.h264ref,      1274.00,      1346.00,         1.06
         471.omnetpp,       563.00,       574.00,         1.02
           473.astar,       642.00,       640.00,         1.00
       483.xalancbmk,       466.00,        -1.00,        -0.00
            433.milc,       649.00,       653.00,         1.01
            444.namd,       596.00,       595.00,         1.00
          447.dealII,       620.00,       650.00,         1.05
          450.soplex,       361.00,       366.00,         1.01
          453.povray,       422.00,      1986.00,         4.71  <<<<<<<<<
             470.lbm,       377.00,       380.00,         1.01
         482.sphinx3,       956.00,       975.00,         1.02
Quuxplusone commented 11 years ago
483.xalancbmk,       466.00,       786.00,         1.69
This benchmark has insanely deep recursion (depth is over 30K).
Quuxplusone commented 11 years ago
Another issue which I understood just now is that the current implementation
is not signal-safe. This will be tracked separately in
https://code.google.com/p/address-sanitizer/issues/detail?id=217
Quuxplusone commented 11 years ago
I've completely re-implemented the use-after-return allocator,
it is now supposed to be faster and async-signal-safe(er).
Preliminary results on SPEC look better than before:
       400.perlbench,      1353.00,      1695.00,         1.25
           445.gobmk,       889.00,      1123.00,         1.26
           458.sjeng,      1036.00,      1283.00,         1.24
       483.xalancbmk,       496.00,       585.00,         1.18
          453.povray,       441.00,       826.00,         1.87
Testing it further.
Quuxplusone commented 11 years ago
r191186 enables the compile-time instrumentation for UAR detection by default
when asan is aneabled, while keeping the run-time flag off by default.
Now one needs to pass ASAN_OPTIONS=detect_stack_use_after_return=1 at run-time
to enable UAR detection
Quuxplusone commented 11 years ago
With -O1, the UAR slowdown is a bit worse.
this is quite expected: more stack objects survive optimizations
       400.perlbench,      1393.00,      1690.00,         1.21  <<<
           401.bzip2,       891.00,       944.00,         1.06
             403.gcc,       659.00,       694.00,         1.05
             429.mcf,       674.00,       692.00,         1.03
           445.gobmk,       922.00,      1134.00,         1.23  <<<
           456.hmmer,       975.00,       980.00,         1.01
           458.sjeng,      1133.00,      1441.00,         1.27  <<<
      462.libquantum,       576.00,       539.00,         0.94
         464.h264ref,      1424.00,      1411.00,         0.99
         471.omnetpp,       752.00,       700.00,         0.93
           473.astar,       845.00,       884.00,         1.05
       483.xalancbmk,      1198.00,      1975.00,         1.65  <<<
            433.milc,       729.00,       716.00,         0.98
            444.namd,       615.00,       614.00,         1.00
          447.dealII,      1503.00,      2028.00,         1.35  <<<
          450.soplex,       522.00,       513.00,         0.98
          453.povray,       518.00,      1191.00,         2.30  <<<
             470.lbm,       390.00,       401.00,         1.03
         482.sphinx3,      1031.00,       982.00,         0.95
Quuxplusone commented 11 years ago
This is at -O1: pure asan vs asan+UAR but with UAR disabled at run-time
i.e. we are measuring the overhead of instrumentation w/o actually using the
fake stack.

       400.perlbench,      1364.00,      1383.00,         1.01
           401.bzip2,       876.00,       877.00,         1.00
             403.gcc,       676.00,       675.00,         1.00
             429.mcf,       673.00,       674.00,         1.00
           445.gobmk,       909.00,       946.00,         1.04  <<
           456.hmmer,       970.00,       978.00,         1.01
           458.sjeng,      1138.00,      1139.00,         1.00
      462.libquantum,       676.00,       648.00,         0.96
         464.h264ref,      1404.00,      1378.00,         0.98
         471.omnetpp,       717.00,       726.00,         1.01
           473.astar,       785.00,       799.00,         1.02
       483.xalancbmk,      1145.00,      1209.00,         1.06  <<
            433.milc,       742.00,       736.00,         0.99
            444.namd,       613.00,       615.00,         1.00
          447.dealII,      1493.00,      1519.00,         1.02
          450.soplex,       519.00,       525.00,         1.01
          453.povray,       524.00,       573.00,         1.09  <<
             470.lbm,       492.00,       501.00,         1.02
         482.sphinx3,      1035.00,      1052.00,         1.02

Quite tolerable.
Quuxplusone commented 6 years ago
Any changes/roadmap/plan here?
https://github.com/google/sanitizers/issues/217 appears to be fixed, and
nothing else is referenced.

Just wasted an hour debugging quite simple code until realizing that stack-use-
after-return detection is still not on by default.
Quuxplusone commented 6 years ago
Vitaly, WDYT?
I guess we may try turning this on by default
Quuxplusone commented 6 years ago
Sure.
I need investigate some performance issues. I plan to do so in the next couple
of month.
It can be enabled with or without investigation, but if we find easy fix for
issues then less users will need to disable the check.
Quuxplusone commented 6 years ago

If the intent is to enable it, it might be better to do it rather sooner than later, to have more room for reverts, and not let it slip into next release++

Quuxplusone commented 6 years ago
Not sure if we can invest time into this right now. Would you like to try?
We need to make sure that asan build of LLVM doesn't regress too much.
Quuxplusone commented 6 years ago
You mean the time it takes to build stage-2 ?
With which build mode? Release?
Quuxplusone commented 6 years ago
(In reply to Roman Lebedev from comment #15)
> You mean the time it takes to build stage-2 ?
> With which build mode? Release?

Oh, no, I mean our (human) time to make the testing.
I would make the change, build LLVM/Release with asan, run the tests,
submit, look carefully at the bots.

We will also get an inflow of upset users for whom asan will stopp working
(due to incompatibility with use-after-return checking) or became slow, so we
need to budget for that as well.
Quuxplusone commented 6 years ago
(In reply to Kostya Serebryany from comment #16)
> (In reply to Roman Lebedev from comment #15)
> > You mean the time it takes to build stage-2 ?
> > With which build mode? Release?
>
> Oh, no, I mean our (human) time to make the testing.
Ah!

> I would make the change, build LLVM/Release with asan, run the tests,
> submit, look carefully at the bots.
That part is obvious.

> We will also get an inflow of upset users

> for whom asan will stopp working
> (due to incompatibility with use-after-return checking)
Here i'm not following.
This is about turning use-after-return checking on by default,
so you mean "(due to some code out there being incompatible with use-after-
return checking)"?

> or became slow, so
> we need to budget for that as well.
Quuxplusone commented 6 years ago
> > for whom asan will stopp working
> > (due to incompatibility with use-after-return checking)
> Here i'm not following.
> This is about turning use-after-return checking on by default,
> so you mean "(due to some code out there being incompatible with
> use-after-return checking)"?

Yes. Suppose an existing user of asan updates to the new version of LLVM
that now has use-after-return enabled.
The code that used to run fine under asan will not fail because
  * it's slower and it times out
  * it consumes more memory and OOMs.
  * it's incompatible with use-after-return (does dirty tricks with stack) and has false positives or crashes

It's still totally worth the risk.