llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[ASan] Turn use-after-return checking on by default #17438

Closed: chandlerc closed this issue 1 year ago

chandlerc commented 11 years ago
Bugzilla Link: 17064
Version: unspecified
OS: All
CC: @kcc, @LebedevRI

Extended Description

Filing this as a tracking bug to track progress on turning ASan's use-after-return checking on by default.

The reason I want this on by default is that there is very little to indicate to a user that a bug in their code is related to use-after-return, and so it seems especially valuable to eagerly diagnose it. Also, it seems likely that by applying LLVM's capture analysis to the instrumentation the overhead associated with this mode could be significantly reduced.
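
For illustration, a minimal (hypothetical, not taken from any report in this thread) example of the kind of bug this check diagnoses; built with -fsanitize=address and with the run-time flag discussed later in the thread, ASan reports a stack-use-after-return at the read:

    // Hypothetical example: a minimal stack-use-after-return bug.
    #include <cstdio>

    static int *dangling = nullptr;

    static void leak_local_address() {
      int local = 42;
      dangling = &local;  // the address of a stack variable escapes the function
    }                     // `local`'s frame is dead once this returns

    int main() {
      leak_local_address();
      std::printf("%d\n", *dangling);  // use after return: reads a dead stack frame
      return 0;
    }

Without the check, such a program typically prints a stale value and continues silently, which is exactly the "very little to indicate to a user" problem described above.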

kcc commented 6 years ago

> > for whom asan will stop working (due to incompatibility with use-after-return checking)
>
> Here I'm not following. This is about turning use-after-return checking on by default, so you mean "(due to some code out there being incompatible with use-after-return checking)"?

Yes. Suppose an existing user of asan updates to the new version of LLVM that now has use-after-return checking enabled. Code that used to run fine under asan will now fail because it is incompatible with the new check.

It's still totally worth the risk.

LebedevRI commented 6 years ago

> > You mean the time it takes to build stage-2? With which build mode? Release?
>
> Oh, no, I mean our (human) time to make the testing.

Ah!

> I would make the change, build LLVM/Release with asan, run the tests, submit, look carefully at the bots.

That part is obvious.

> We will also get an inflow of upset users for whom asan will stop working (due to incompatibility with use-after-return checking)

Here I'm not following. This is about turning use-after-return checking on by default, so you mean "(due to some code out there being incompatible with use-after-return checking)"?

> or become slow, so we need to budget for that as well.

kcc commented 6 years ago

> You mean the time it takes to build stage-2? With which build mode? Release?

Oh, no, I mean our (human) time to make the testing. I would make the change, build LLVM/Release with asan, run the tests, submit, look carefully at the bots.

We will also get an inflow of upset users for whom asan will stop working (due to incompatibility with use-after-return checking) or become slow, so we need to budget for that as well.

LebedevRI commented 6 years ago

You mean the time it takes to build stage-2? With which build mode? Release?

kcc commented 6 years ago

Not sure if we can invest time into this right now. Would you like to try? We need to make sure that the asan build of LLVM doesn't regress too much.

LebedevRI commented 6 years ago

If the intent is to enable it, it might be better to do it sooner rather than later, to leave more room for reverts and not let it slip into the next release++

vitalybuka commented 6 years ago

Sure. I need to investigate some performance issues; I plan to do so in the next couple of months. It can be enabled with or without that investigation, but if we find an easy fix for the issues, then fewer users will need to disable the check.

kcc commented 6 years ago

Vitaly, WDYT? I guess we may try turning this on by default

LebedevRI commented 6 years ago

Any changes/roadmap/plan here? https://github.com/google/sanitizers/issues/217 appears to be fixed, and nothing else is referenced.

Just wasted an hour debugging quite simple code until realizing that stack-use-after-return detection is still not on by default.

kcc commented 11 years ago

This is at -O1: pure asan vs asan+UAR, but with UAR disabled at run-time, i.e. we are measuring the overhead of the instrumentation without actually using the fake stack.

   400.perlbench,      1364.00,      1383.00,         1.01
       401.bzip2,       876.00,       877.00,         1.00
         403.gcc,       676.00,       675.00,         1.00
         429.mcf,       673.00,       674.00,         1.00
       445.gobmk,       909.00,       946.00,         1.04  << 
       456.hmmer,       970.00,       978.00,         1.01
       458.sjeng,      1138.00,      1139.00,         1.00
  462.libquantum,       676.00,       648.00,         0.96
     464.h264ref,      1404.00,      1378.00,         0.98
     471.omnetpp,       717.00,       726.00,         1.01
       473.astar,       785.00,       799.00,         1.02
   483.xalancbmk,      1145.00,      1209.00,         1.06  << 
        433.milc,       742.00,       736.00,         0.99
        444.namd,       613.00,       615.00,         1.00
      447.dealII,      1493.00,      1519.00,         1.02
      450.soplex,       519.00,       525.00,         1.01
      453.povray,       524.00,       573.00,         1.09  << 
         470.lbm,       492.00,       501.00,         1.02
     482.sphinx3,      1035.00,      1052.00,         1.02

Quite tolerable.

kcc commented 11 years ago

With -O1, the UAR slowdown is a bit worse. This is quite expected: more stack objects survive optimizations.

   400.perlbench,      1393.00,      1690.00,         1.21  <<<
       401.bzip2,       891.00,       944.00,         1.06
         403.gcc,       659.00,       694.00,         1.05
         429.mcf,       674.00,       692.00,         1.03
       445.gobmk,       922.00,      1134.00,         1.23  <<<
       456.hmmer,       975.00,       980.00,         1.01
       458.sjeng,      1133.00,      1441.00,         1.27  <<<
  462.libquantum,       576.00,       539.00,         0.94
     464.h264ref,      1424.00,      1411.00,         0.99
     471.omnetpp,       752.00,       700.00,         0.93
       473.astar,       845.00,       884.00,         1.05
   483.xalancbmk,      1198.00,      1975.00,         1.65  <<<
        433.milc,       729.00,       716.00,         0.98
        444.namd,       615.00,       614.00,         1.00
      447.dealII,      1503.00,      2028.00,         1.35  <<<
      450.soplex,       522.00,       513.00,         0.98
      453.povray,       518.00,      1191.00,         2.30  <<<
         470.lbm,       390.00,       401.00,         1.03
     482.sphinx3,      1031.00,       982.00,         0.95

kcc commented 11 years ago

r191186 enables the compile-time instrumentation for UAR detection by default when asan is enabled, while keeping the run-time flag off by default. Now one needs to pass ASAN_OPTIONS=detect_stack_use_after_return=1 at run-time to enable UAR detection.

kcc commented 11 years ago

I've completely re-implemented the use-after-return allocator; it is now supposed to be faster and async-signal-safe(er). Preliminary results on SPEC look better than before:

   400.perlbench,      1353.00,      1695.00,         1.25
       445.gobmk,       889.00,      1123.00,         1.26
       458.sjeng,      1036.00,      1283.00,         1.24
   483.xalancbmk,       496.00,       585.00,         1.18
      453.povray,       441.00,       826.00,         1.87

Testing it further.
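
As a rough illustration of the idea behind such an allocator, here is a toy model (illustrative only, not the real ASan runtime): frames come from a heap-backed pool and are quarantined after "return" instead of being reused immediately, so a dangling pointer into a returned frame stays recognizable for a while.

    // Toy model of the fake-stack idea; not the real ASan runtime.
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <deque>
    #include <vector>

    struct FakeFrame {
      std::vector<uint8_t> bytes;  // storage for the function's addressable locals
      bool returned;               // conceptually "poisoned" once the function returns
    };

    class ToyFakeStack {
     public:
      explicit ToyFakeStack(size_t quarantine_limit)
          : quarantine_limit_(quarantine_limit) {}
      ~ToyFakeStack() {
        for (FakeFrame *f : quarantine_) delete f;
      }

      FakeFrame *OnEntry(size_t frame_size) {
        return new FakeFrame{std::vector<uint8_t>(frame_size), /*returned=*/false};
      }

      // A returned frame is quarantined instead of being reused immediately, so
      // a dangling pointer into it remains recognizable for a while.
      void OnExit(FakeFrame *f) {
        f->returned = true;
        quarantine_.push_back(f);
        if (quarantine_.size() > quarantine_limit_) {
          delete quarantine_.front();
          quarantine_.pop_front();
        }
      }

      static bool IsUseAfterReturn(const FakeFrame *f) { return f->returned; }

     private:
      std::deque<FakeFrame *> quarantine_;
      size_t quarantine_limit_;
    };

    int main() {
      ToyFakeStack fs(/*quarantine_limit=*/16);
      FakeFrame *frame = fs.OnEntry(64);              // simulate a call
      fs.OnExit(frame);                               // simulate the return
      assert(ToyFakeStack::IsUseAfterReturn(frame));  // a later access is diagnosable
      return 0;
    }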

kcc commented 11 years ago

Another issue which I understood just now is that the current implementation is not signal-safe. This will be tracked separately in https://code.google.com/p/address-sanitizer/issues/detail?id=217

kcc commented 11 years ago
   483.xalancbmk,       466.00,       786.00,         1.69

This benchmark has insanely deep recursion (depth is over 30K).

kcc commented 11 years ago

This is the current performance data: asan vs asan+UAR at -O2 (full table at the end of this comment). 483.xalancbmk will need a re-run (UAR failed because the recursion is too deep).

458.sjeng is known to have a large stack array in the hotspot. UAR mode has to poison the entire thing, which is costly.

Look at e.g. this function:

    int search (int alpha, int beta, int depth, int is_null) {
      move_s moves[MOVE_BUFF];  // MOVE_BUFF == 512
      int num_moves, i, j;
      int score = -INF, move_ordering[MOVE_BUFF], see_values[MOVE_BUFF];
      ...

This function is recursive, has 500 LOC, and makes lots of calls. I don't think it is possible to optimize UAR detection away without whole-program mode.

I haven't analyzed other cases yet.

   400.perlbench,      1328.00,      2087.00,         1.57  <<<<<<<<<
       401.bzip2,       847.00,       856.00,         1.01
         403.gcc,       618.00,       666.00,         1.08
         429.mcf,       581.00,       579.00,         1.00
       445.gobmk,       867.00,      1404.00,         1.62  <<<<<<<<<
       456.hmmer,       912.00,       894.00,         0.98
       458.sjeng,      1016.00,      1365.00,         1.34  <<<<<<<<<
  462.libquantum,       535.00,       545.00,         1.02
     464.h264ref,      1274.00,      1346.00,         1.06
     471.omnetpp,       563.00,       574.00,         1.02
       473.astar,       642.00,       640.00,         1.00
   483.xalancbmk,       466.00,        -1.00,        -0.00
        433.milc,       649.00,       653.00,         1.01
        444.namd,       596.00,       595.00,         1.00
      447.dealII,       620.00,       650.00,         1.05
      450.soplex,       361.00,       366.00,         1.01
      453.povray,       422.00,      1986.00,         4.71  <<<<<<<<<
         470.lbm,       377.00,       380.00,         1.01
     482.sphinx3,       956.00,       975.00,         1.02
chandlerc commented 11 years ago

> I'd love this to happen, but I am not sure if this will ever be possible due to large and hard-to-predict extra overhead, both in CPU and RAM. Here is a rough description of how the UAR detection works: https://code.google.com/p/address-sanitizer/wiki/UseAfterReturn

What's "capture analysis"?

It's a subset of escape. Essentially, it is escape which outlives the function as opposed to escape which merely hands the address to unknown other code.

Nick is the expert here, but I'm also happy to explain in more detail... IRC might work better than a bug though.

The nice thing is that LLVM will aggressively track capture working bottom up across the call graph. So it should be possible to distinguish between a stack variable which is passed down to other functions and one which might have its address outlive the function body.

See include/llvm/Analysis/CaptureTracking.h for the interface to leverage this analysis in LLVM.

There also might be other ways to improve performance of this, for example by using something similar to a cactus stack rather than just heap allocations. Anyways, I would want to understand exactly what the performance impact was and where it came from. That was the point of filing a bug to track it.

> If we could use LLVM analysis to prove that arr does not escape, this indeed would help.

Exactly. This is what 'nocapture' can help with. Try out the analysis pass I mentioned maybe? Should be easy to prototype, although it may need tuning to avoid compile time problems.
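
A minimal sketch of what such a prototype could look like (illustrative only; it assumes the PointerMayBeCaptured helper declared in include/llvm/Analysis/CaptureTracking.h, whose exact signature varies across LLVM versions):

    // Illustrative sketch, not an actual LLVM pass: count allocas whose address
    // may be captured in a way that can outlive the function (returned or
    // stored), i.e. the ones that would still need UAR protection.
    #include "llvm/Analysis/CaptureTracking.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    static bool mayOutliveFunction(const AllocaInst &AI) {
      return PointerMayBeCaptured(&AI,
                                  /*ReturnCaptures=*/true,
                                  /*StoreCaptures=*/true);
    }

    static unsigned countUARCandidates(const Function &F) {
      unsigned N = 0;
      for (const Instruction &I : F.getEntryBlock())
        if (const auto *AI = dyn_cast<AllocaInst>(&I))
          if (mayOutliveFunction(*AI))
            ++N;
      return N;
    }

Allocas for which this returns false provably do not escape, so they would be the candidates for skipping the fake-stack treatment.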

> Another problem is compatibility. The UAR mode is less compatible with arbitrary code than the default ASan. For example, instrumenting V8 makes it report false positives on Chromium (the V8 code makes some assumptions about the stack layout).

I'm pretty happy to start breaking code that makes assumptions about the stack layout. This needs to happen anyways IMO, or at the very least such code needs to explicitly disclaim that it is assuming things about the layout of the stack.

kcc commented 11 years ago

I'd love this to happen, but I am not sure if this will ever be possible due to large and hard-to-predict extra overhead, both in CPU and RAM. Here is a rough description of how the UAR detection works: https://code.google.com/p/address-sanitizer/wiki/UseAfterReturn

What's "capture analysis"?

In most cases, if the stack frame (local variable) survives after all LLVM optimizations, it means that some of the variables potentially escape.

The exception is code like this:

    void foo(...) {
      // arr provably does not escape foo() because its address is never taken
      int arr[100];
      ... arr[i] ...
    }

If we could use LLVM analysis to prove that arr does not escape, this indeed would help.

Another problem is compatibility. The UAR mode is less compatible with arbitrary code than the default ASan. For example, instrumenting V8 makes it report false positives on Chromium (the V8 code makes some assumptions about the stack layout).
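
As a hypothetical illustration (not the actual V8 code) of the kind of stack-layout assumption that breaks: code that estimates stack usage from the addresses of local variables.

    // Hypothetical example of a stack-layout assumption (not the actual V8 code):
    // estimating stack usage from the addresses of locals.
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    static uintptr_t g_stack_base = 0;

    static size_t ApproxStackUsed() {
      char probe;
      // With UAR instrumentation, `probe` may live on the heap-backed fake
      // stack, so this difference is no longer a meaningful stack depth and
      // can trip "out of stack" heuristics in the application.
      return static_cast<size_t>(g_stack_base -
                                 reinterpret_cast<uintptr_t>(&probe));
    }

    int main() {
      char base;
      g_stack_base = reinterpret_cast<uintptr_t>(&base);
      std::printf("approx stack used: %zu bytes\n", ApproxStackUsed());
      return 0;
    }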

chandlerc commented 11 years ago

assigned to @vitalybuka

vitalybuka commented 1 year ago

Fixed with 4b4437c084e2b8a2643e97e7aef125c438635a4d