staticanalysis / address-sanitizer

Automatically exported from code.google.com/p/address-sanitizer

find unaligned partially OOB accesses #100

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, asan does not detect unaligned partially OOB accesses:
int *x = new int[2]; // 8 bytes: [0,7].
int *u = (int*)((char*)x + 6);
*u = 1;  // 4-byte store to bytes [6,9]; bytes 8 and 9 are out of bounds

rnk's idea: mark the last 8 bytes with the shadow value '8' instead of '0'. 

This will have two performance problems: 
 (minor) slow path will be taken more frequently for 1-, 2-, and 4-byte accesses
 (major) 8-byte accesses will need slow path too (same for 16- and 32-byte accesses)

If we use a larger shadow granularity (16:1 or 32:1 shadow), this will be less of 
a problem. 

Anyway, this is something to try and evaluate. 
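To make the idea concrete, here is a minimal compilable sketch of the 8:1 shadow check (fake_shadow, IsPoisoned and kGranularity are illustrative names, not ASan's real MemToShadow-based instrumentation): shadow value 0 keeps the fast path, a value k in 1..8 means "the first k bytes of the granule are addressable" and forces the slow-path compare, so storing 8 in the allocation's last granule flags any access that spills past it.

#include <stdint.h>
#include <stdio.h>

const uintptr_t kGranularity = 8;             // 8:1 shadow mapping
static int8_t fake_shadow[4] = {0, 0, 0, 0};  // stand-in for the real shadow memory

// Returns true if a size-byte access at `offset` should be reported.
bool IsPoisoned(uintptr_t offset, uintptr_t size) {
  int8_t k = fake_shadow[offset / kGranularity];
  if (k == 0) return false;                                        // fast path
  return (int8_t)((offset & (kGranularity - 1)) + size - 1) >= k;  // slow path
}

int main() {
  // new int[2] occupies bytes [0,7]; mark its granule with 8 instead of 0.
  fake_shadow[0] = 8;
  printf("%d\n", IsPoisoned(6, 4));  // 1: the partially OOB 4-byte store at offset 6 is caught
  printf("%d\n", IsPoisoned(4, 4));  // 0: a 4-byte store at offset 4 stays in bounds
  printf("%d\n", IsPoisoned(0, 8));  // 0: an aligned 8-byte access is still accepted,
                                     //    but now takes the slow path (the major cost above)
}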

Original issue reported on code.google.com by konstant...@gmail.com on 13 Aug 2012 at 5:49

GoogleCodeExporter commented 9 years ago
Issue 101 has been merged into this issue.

Original comment by konstant...@gmail.com on 13 Aug 2012 at 5:54

GoogleCodeExporter commented 9 years ago
Implemented part of it in r161937.
-mllvm --asan-always-slow-path uses slow path for all access sizes, not just 
1-, 2-, and 4-byte. 

The slowdown on SPEC (train) is moderate except for one test, but once we 
poison the shadow in a different way the slowdown may get bigger. 

            benchmark, w/o flag (s),  w/ flag (s),     slowdown
       400.perlbench,        57.20,        60.70,         1.06
           401.bzip2,        68.50,        68.70,         1.00
             403.gcc,         1.54,         1.84,         1.19
             429.mcf,        23.90,        25.90,         1.08
           445.gobmk,       174.00,       175.00,         1.01
           456.hmmer,       125.00,       126.00,         1.01
           458.sjeng,       196.00,       197.00,         1.01
      462.libquantum,         2.13,         2.63,         1.23
         464.h264ref,       154.00,       157.00,         1.02
         471.omnetpp,       129.00,       136.00,         1.05
           473.astar,       141.00,       142.00,         1.01
       483.xalancbmk,       122.00,       161.00,         1.32 << 
            433.milc,        23.20,        21.90,         0.94
            444.namd,        17.10,        16.40,         0.96
          447.dealII,        43.30,        43.50,         1.00
          450.soplex,         7.76,         8.06,         1.04
          453.povray,        14.40,        15.80,         1.10
             470.lbm,        38.00,        36.50,         0.96
         482.sphinx3,        15.00,        12.50,         0.83

The code size increase is 5%-10%.

Original comment by konstant...@gmail.com on 15 Aug 2012 at 9:03

GoogleCodeExporter commented 9 years ago
Now, a patch as simple as this will allow us to catch unaligned OOB accesses. 

--- asan_poisoning.cc   (revision 161755)
+++ asan_poisoning.cc   (working copy)
@@ -36,7 +36,7 @@
   u8 *shadow = (u8*)MemToShadow(addr);
   for (uptr i = 0; i < redzone_size;
        i += SHADOW_GRANULARITY, shadow++) {
-    if (i + SHADOW_GRANULARITY <= size) {
+    if (i + 2 * SHADOW_GRANULARITY <= size) {
       *shadow = 0;  // fully addressable
     } else if (i >= size) {
       *shadow = (SHADOW_GRANULARITY == 128) ? 0xff : value;  // unaddressable

However, it brings another problem: accesses wider than 8 bytes (16 bytes today 
and 32 bytes with AVX in the near future). 
If we have a 32-byte buffer, its shadow will look like 
  00 00 00 08 
and so, if we access 16 or 32 bytes, we get a false positive. 

One solution is to instrument every 16-byte access as two 8-byte accesses. 
That's simple, but will slow down the SSE code. 
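A rough sketch of that first option, under the same kind of illustrative fake-shadow setup (not the real instrumentation): a 16-byte access is checked as two 8-byte accesses, each of which understands partial shadow values, so the 08 granule no longer causes a false positive, at the price of doubling the checks on SSE loads and stores.

#include <stdint.h>
#include <stdio.h>

const uintptr_t kGranularity = 8;
// Shadow of a 32-byte buffer whose last granule is marked 8, followed by a
// redzone (negative values stand in for the unaddressable redzone magic).
static int8_t fake_shadow[6] = {0, 0, 0, 8, -1, -1};

bool IsPoisoned8(uintptr_t offset, uintptr_t size) {
  int8_t k = fake_shadow[offset / kGranularity];
  if (k == 0) return false;                                        // fast path
  return (int8_t)((offset & (kGranularity - 1)) + size - 1) >= k;  // slow path
}

bool IsPoisoned16(uintptr_t offset) {
  return IsPoisoned8(offset, 8) || IsPoisoned8(offset + 8, 8);     // two 8-byte checks
}

int main() {
  printf("%d\n", IsPoisoned16(16));  // 0: the in-bounds access to bytes [16,31] is accepted
  printf("%d\n", IsPoisoned16(24));  // 1: bytes [24,39] run past the 32-byte buffer
}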

Another solution is to use 32:1 shadow mapping. 
The reasons we are currently using 8:1 mapping are: 
   #1 simpler instrumentation for 8-byte and 16-byte accesses
   #2 less frequent slow path for 1-, 2-, and 4-byte accesses. 

If we want to handle unaligned accesses, #1 will be lost anyway, so it looks like 
the 32:1 shadow is the way to go for unaligned accesses. 

That's a bit of a revolution in asan and will not allow us to use 16-byte 
redzones ... 
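For reference, a hedged sketch of the two mappings (kShadowOffset is a placeholder, not the real platform-specific constant). With scale 3 an aligned 8-byte access owns exactly one shadow byte and a 16-byte access a byte pair (point #1 above), and redzones only need to be multiples of 8; with scale 5 one shadow byte describes 32 application bytes, so narrower accesses fall back to the slow-path compare whenever their granule is only partially addressable, and redzones cannot be smaller than 32 bytes, which is why the 16-byte redzones go away.

#include <stdint.h>
#include <stdio.h>

const uintptr_t kShadowOffset = 0;  // placeholder, not the real shadow offset

uintptr_t MemToShadow_8to1(uintptr_t a)  { return (a >> 3) + kShadowOffset; }  // scale 3
uintptr_t MemToShadow_32to1(uintptr_t a) { return (a >> 5) + kShadowOffset; }  // scale 5

int main() {
  uintptr_t a = 0x1000;
  // Under 8:1 the byte at a+16 has its own shadow byte; under 32:1 it shares
  // one with a, so a 16-byte redzone could not be poisoned independently.
  printf("%d %d\n", (int)(MemToShadow_8to1(a)  != MemToShadow_8to1(a + 16)),   // 1
                    (int)(MemToShadow_32to1(a) != MemToShadow_32to1(a + 16))); // 0
}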

Original comment by konstant...@gmail.com on 15 Aug 2012 at 10:07

GoogleCodeExporter commented 9 years ago
Why would you get a false positive for a wide access?  I would think there 
might be false negatives for a 16-byte access to byte >16 or 32-byte access to 
byte >0, because asan would map the address to an earlier shadow byte.

Original comment by rnk@google.com on 15 Aug 2012 at 3:32

GoogleCodeExporter commented 9 years ago
Currently, with the 8:1 shadow mapping, we load 2 bytes of shadow for a 16-byte access 
and report an error if those two bytes are non-zero.
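For illustration, a minimal sketch of that check with a fake shadow array (not the real instrumentation), showing how the patched 00 00 00 08 shadow of a 32-byte buffer from the previous comments turns an in-bounds 16-byte access to its upper half into a false positive.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Shadow of a 32-byte buffer poisoned as in the patch above: 00 00 00 08.
static uint8_t fake_shadow[4] = {0x00, 0x00, 0x00, 0x08};

bool IsPoisoned16(uintptr_t offset) {
  uint16_t pair;                                   // shadow of the two granules touched
  memcpy(&pair, &fake_shadow[offset / 8], sizeof(pair));
  return pair != 0;                                // no slow path for 16-byte accesses
}

int main() {
  printf("%d\n", IsPoisoned16(0));   // 0: bytes [0,15], shadow pair 00 00
  printf("%d\n", IsPoisoned16(16));  // 1: bytes [16,31] are in bounds, but the shadow
                                     //    pair 00 08 is non-zero -> false positive
}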

Original comment by konstant...@gmail.com on 15 Aug 2012 at 3:35

GoogleCodeExporter commented 9 years ago
Pasting in most of the original message for future reference.  Mostly I was 
just trying to remember what the actual shadow looked like.

"""
Couldn't you fix this by using more partial right redzone poisoning?

So in this code:
  for (uintptr_t i = 0; i < redzone_size;
       i += SHADOW_GRANULARITY, shadow++) {
    if (i + SHADOW_GRANULARITY <= size) {
      *shadow = 0;  // fully addressable
    } else if (i >= size) {
      *shadow = (SHADOW_GRANULARITY == 128) ? 0xff : value;  // unaddressable
    } else {
      *shadow = size - i;  // first size-i bytes are addressable
    }
  }

The first if condition would be:
    if (i + SHADOW_GRANULARITY + MAX_MEMORY_ACCESS <= size) { 

And the shadow for char a[8] would look like:
 f1 08 f3
instead of:
 f1 00 f3

Then on a 4-byte access to a[7], it will trip the slowpath, right?  This will 
probably hit the slowpath a bunch, especially if you consider some of the wider 
vector memory accesses.  AVX can do 32 bytes I think.  You could even go as far 
as this for a 32-byte allocation:
fa 20 18 10 08 fb

The performance impact would have to be measured.  You could also keep 
MAX_MEMORY_ACCESS at 8 or something reasonable to catch 99% of the bugs without 
hitting the slowpath as much.
"""

Original comment by rnk@google.com on 22 Feb 2013 at 3:52

GoogleCodeExporter commented 9 years ago
It's possible to encode the shadow as:
S = min(127, number_of_addressable_bytes_from_beginning_of_the_block)
127 because we want the value to be signed, to store special negative values

e.g. for ShadowScale=3 and block size 19:
S0 = 19
S1 = 11
S2 = 3

This requires slow-path comparison, but catches any possible unaligned and/or 
partial OOBs

This looks like preferred encoding for HW implementation:
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerInHardware
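A minimal sketch of the check this encoding implies (PoisonBlock and IsPoisoned are illustrative names; reading the S0/S1/S2 example, S for a granule is min(127, addressable bytes from that granule's start to the end of the block), and the "zero must still mean unpoisoned" caveat raised in the following comments is ignored here).

#include <stdint.h>
#include <stdio.h>

const uintptr_t kShadowScale = 3;               // 8-byte granules
const uintptr_t kGranule = 1 << kShadowScale;

static int8_t shadow[3];                        // shadow of a 19-byte block

void PoisonBlock(uintptr_t block_size) {
  for (uintptr_t g = 0; g * kGranule < block_size; g++) {
    uintptr_t left = block_size - g * kGranule; // addressable bytes from this granule on
    shadow[g] = (int8_t)(left > 127 ? 127 : left);
  }
}

// An n-byte access at `offset` is OK iff (offset % granule) + n <= S of its granule.
bool IsPoisoned(uintptr_t offset, uintptr_t n) {
  int8_t s = shadow[offset / kGranule];
  return (int8_t)((offset & (kGranule - 1)) + n) > s;   // the slow-path comparison
}

int main() {
  PoisonBlock(19);
  printf("%d %d %d\n", shadow[0], shadow[1], shadow[2]);  // 19 11 3
  printf("%d\n", IsPoisoned(14, 4));  // 0: the unaligned access to bytes [14,17] is in bounds
  printf("%d\n", IsPoisoned(16, 4));  // 1: bytes [16,19] run one byte past the block
}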

Original comment by dvyu...@google.com on 20 Aug 2013 at 1:38

GoogleCodeExporter commented 9 years ago
re comment #7: Some addressable memory will not come from malloc/stack/globals and 
its shadow will be zero. So, zero should still mean 'unpoisoned'.

Original comment by konstant...@gmail.com on 10 Sep 2013 at 5:51

GoogleCodeExporter commented 9 years ago
Agree. It needs a check against zero; or, probably, for hardware it may be better to 
offset the value by a constant.

Original comment by dvyu...@google.com on 11 Sep 2013 at 8:47