find unaligned partially OOB accesses

ramosian-glider commented 8 years ago

Originally reported on Google Code with ID 100

Currently, asan does not detect unaligned partially OOB accesses:
int *x = new int[2]; // 8 bytes: [0,7].
int *u = (int*)((char*)x + 6);
*u = 1;  // Access to range [6-9]

rnk's idea: mark the last 8 bytes with the shadow value '8' instead of '0'. 

This will have two performance problems: 
 (minor) slow path will be taken more frequently for 1-, 2-, and 4-byte accesses
 (major) 8-byte accesses will need slow path too (same for 16- and 32-byte accesses)

If we use larger shadow granularity (16:1 or 32:1 shadow) this will be less of a problem.

Anyway, this is something to try and evaluate.

Reported by konstantin.s.serebryany on 2012-08-13 17:49:58

ramosian-glider commented 8 years ago

Issue 101 has been merged into this issue.

Reported by konstantin.s.serebryany on 2012-08-13 17:54:53

ramosian-glider commented 8 years ago

Implemented part of it in r161937.
-mllvm --asan-always-slow-path uses slow path for all access sizes, not just 1-, 2-,
and 4-byte. 

The slowdown on spec (train) is moderate except for one test,  but once we poison the
shadow in a different way the slowdown may be bigger. 

       400.perlbench,        57.20,        60.70,         1.06
           401.bzip2,        68.50,        68.70,         1.00
             403.gcc,         1.54,         1.84,         1.19
             429.mcf,        23.90,        25.90,         1.08
           445.gobmk,       174.00,       175.00,         1.01
           456.hmmer,       125.00,       126.00,         1.01
           458.sjeng,       196.00,       197.00,         1.01
      462.libquantum,         2.13,         2.63,         1.23
         464.h264ref,       154.00,       157.00,         1.02
         471.omnetpp,       129.00,       136.00,         1.05
           473.astar,       141.00,       142.00,         1.01
       483.xalancbmk,       122.00,       161.00,         1.32 << 
            433.milc,        23.20,        21.90,         0.94
            444.namd,        17.10,        16.40,         0.96
          447.dealII,        43.30,        43.50,         1.00
          450.soplex,         7.76,         8.06,         1.04
          453.povray,        14.40,        15.80,         1.10
             470.lbm,        38.00,        36.50,         0.96
         482.sphinx3,        15.00,        12.50,         0.83

The code size increases is 5%-10%

Reported by konstantin.s.serebryany on 2012-08-15 09:03:10

ramosian-glider commented 8 years ago

Now, a patch as simple as this will allow to catch unaligned OOB accesses. 

--- asan_poisoning.cc   (revision 161755)
+++ asan_poisoning.cc   (working copy)
@@ -36,7 +36,7 @@
   u8 *shadow = (u8*)MemToShadow(addr);
   for (uptr i = 0; i < redzone_size;
        i += SHADOW_GRANULARITY, shadow++) {
-    if (i + SHADOW_GRANULARITY <= size) {
+    if (i + 2 * SHADOW_GRANULARITY <= size) {
       *shadow = 0;  // fully addressable
     } else if (i >= size) {
       *shadow = (SHADOW_GRANULARITY == 128) ? 0xff : value;  // unaddressable

However, it brings another problem: accesses wider than 8 bytes (16 bytes today and
32-byte with AVX in near future). 
If we have 32 byte buffer its shadow will look like 
  00 00 00 08 
and so, if we access 16 or 32 bytes we get a false positive. 

One solution is to instrument every 16-byte access as two 8-byte access. 
That's simple, but will slow down the SSE code. 

Another solution is to use 32:1 shadow mapping. 
The reasons we are currently using 8:1 mapping are: 
   #1 simpler instrumentation for 8-byte and 16-byte accesses
   #2 less frequent slow path for 1-, 2, and 4-byte accesses. 

If we want to handle unaligned accesses, #1 will be lost anyway, so looks like the
32:1 shadow is the way to go with unaligned accesses. 

That's a bit of a revolution in asan and will not allow us to use 16-byte redzones
...

Reported by konstantin.s.serebryany on 2012-08-15 10:07:42

ramosian-glider commented 8 years ago

Why would you get a false positive for a wide access?  I would think there might be
false negatives for a 16-byte access to byte >16 or 32-byte access to byte >0, because
asan would map the address to an earlier shadow byte.

Reported by rnk@google.com on 2012-08-15 15:32:31

ramosian-glider commented 8 years ago

Currently with 8:1 shadow mapping we load 2 bytes of shadow for a 16-byte access
and report an error if those two bytes are non-zero

Reported by konstantin.s.serebryany on 2012-08-15 15:35:19

ramosian-glider commented 8 years ago

Pasting in most of the original message for future reference.  Mostly I was just trying
to remember what the actual shadow looked like.

"""
Couldn't you fix this by using more partial right redzone poisoning?

So in this code:
  for (uintptr_t i = 0; i < redzone_size;
       i += SHADOW_GRANULARITY, shadow++) {
    if (i + SHADOW_GRANULARITY <= size) {
      *shadow = 0;  // fully addressable
    } else if (i >= size) {
      *shadow = (SHADOW_GRANULARITY == 128) ? 0xff : value;  // unaddressable
    } else {
      *shadow = size - i;  // first size-i bytes are addressable
    }
  }

The first if condition would be:
    if (i + SHADOW_GRANULARITY + MAX_MEMORY_ACCESS <= size) { 

And the shadow for char a[8] would look like:
 f1 08 f3
instead of:
 f1 00 f3

Then on a 4-byte access to a[7], it will trip the slowpath, right?  This will probably
hit the slowpath a bunch, especially if you consider some of the wider vector memory
accesses.  AVX can do 32 bytes I think.  You could even go as far as this for a 32-byte
allocation:
fa 20 18 10 08 fb

The performance impact would have to be measured.  You could also keep MAX_MEMORY_ACCESS
at 8 or something reasonable to catch 99% of the bugs without hitting the slowpath
as much.
"""

Reported by rnk@google.com on 2013-02-22 15:52:21

ramosian-glider commented 8 years ago

It's possible to encode shadow as:
S = max(127, number_of_addressable_bytes_from_beginning_of_the_block)
127 because we want the value to be signed, to store special negative values

e.g. for ShadowScale=3 and block size 19:
S0 = 19
S1 = 11
S2 = 3

This requires slow-path comparison, but catches any possible unaligned and/or partial
OOBs

This looks like preferred encoding for HW implementation:
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerInHardware

Reported by dvyukov@google.com on 2013-08-20 13:38:49

ramosian-glider commented 8 years ago

re comment #7: Some addressable memory will not come from malloc/stack/globals and 
its shadow will be zero. So, zero should still mean 'unpoisoned'.

Reported by konstantin.s.serebryany on 2013-09-10 05:51:51

ramosian-glider commented 8 years ago

Agree. It needs a check with zero, or probably for hardware may be better to offset
the value by a constant.

Reported by dvyukov@google.com on 2013-09-11 20:47:20

ramosian-glider commented 8 years ago

Reported by ramosian.glider on 2015-07-30 09:05:30

Labels added: ProjectAddressSanitizer

ramosian-glider commented 8 years ago

Adding Project:AddressSanitizer as part of GitHub migration.

Reported by ramosian.glider on 2015-07-30 09:06:55

ramosian-glider / sanitizer-issues

find unaligned partially OOB accesses #100