viralcode / memory-sanitizer

Automatically exported from code.google.com/p/memory-sanitizer

False aliasing between shadow, origin and app memory? #7

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The following code,

void f(int* p, int x) {
  *p = x;
}

compiled with -O2 -mllvm -msan-store-clean-origin=0 -mllvm -msan-track-origins=1
-mllvm -msan-check-access-address=0, becomes:

_Z1fPii():
   0:   48 b8 ff ff ff ff ff    mov    $0xffffbfffffffffff,%rax
   7:   bf ff ff 
   a:   48 21 f8                and    %rdi,%rax
   d:   48 8b 0d 00 00 00 00    mov    0x0(%rip),%rcx        # 14 <_Z1fPii+0x14>
            10: R_X86_64_GOTTPOFF   __msan_param_origin_tls-0x4
  14:   64 8b 49 08             mov    %fs:0x8(%rcx),%ecx
  18:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 1f <_Z1fPii+0x1f>
            1b: R_X86_64_GOTTPOFF   __msan_param_tls-0x4
  1f:   64 8b 52 08             mov    %fs:0x8(%rdx),%edx
  23:   89 10                   mov    %edx,(%rax)
  25:   85 d2                   test   %edx,%edx
  27:   74 13                   je     3c <_Z1fPii+0x3c>
  29:   48 ba 00 00 00 00 00    mov    $0x200000000000,%rdx
  30:   20 00 00 
  33:   48 01 d0                add    %rdx,%rax
  36:   48 83 e0 fc             and    $0xfffffffffffffffc,%rax
  3a:   89 08                   mov    %ecx,(%rax)
  3c:   89 37                   mov    %esi,(%rdi)
  3e:   c3                      retq   

Looks like the load from the origin TLS could be moved below the branch at
offset 27, but it is not, because it may alias the store of the shadow value
at offset 23.
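
For reference, here is a rough C++-level rendering of what the instrumented
store does, reconstructed from the disassembly above (the constants come
straight from the listing; the TLS slot names are illustrative stand-ins, not
the real runtime layout):

#include <cstdint>

// Illustrative stand-ins for the __msan_param_tls / __msan_param_origin_tls
// slots read at offsets 1f and 14 in the listing above.
extern thread_local std::uint32_t msan_param_shadow_slot;
extern thread_local std::uint32_t msan_param_origin_slot;

void f_instrumented(int* p, int x) {
  // Shadow address: clear bit 46 of the application address (mask at offset 0).
  auto* shadow_p = reinterpret_cast<std::uint32_t*>(
      reinterpret_cast<std::uintptr_t>(p) & 0xffffbfffffffffffULL);
  std::uint32_t origin = msan_param_origin_slot;  // load at offset 14
  std::uint32_t shadow = msan_param_shadow_slot;  // load at offset 1f
  *shadow_p = shadow;                             // store at offset 23
  if (shadow != 0) {                              // test/je at offsets 25/27
    // Origin address: shadow address + 0x200000000000, aligned down to 4.
    auto* origin_p = reinterpret_cast<std::uint32_t*>(
        (reinterpret_cast<std::uintptr_t>(shadow_p) + 0x200000000000ULL) & ~3ULL);
    *origin_p = origin;                           // store at offset 3a
  }
  *p = x;                                         // application store, offset 3c
}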

Does that sound right? Could we use TBAA to tell LLVM that these are
different memory locations?
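
If we go the TBAA route, one possible shape would be to hang shadow and
origin accesses off distinct nodes under a common root, so the alias analyzer
can prove they never alias. A minimal sketch, assuming LLVM's MDBuilder
scalar-TBAA helpers (createTBAARoot/createTBAANode; exact header paths and
API may differ by LLVM version), not code from the actual MSan pass:

#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Metadata.h"

// Sketch only: tag the shadow store and the origin load with sibling TBAA
// nodes; siblings under the same root are treated as non-aliasing.
static void tagShadowAndOrigin(llvm::StoreInst *ShadowStore,
                               llvm::LoadInst *OriginLoad,
                               llvm::LLVMContext &Ctx) {
  llvm::MDBuilder MDB(Ctx);
  llvm::MDNode *Root = MDB.createTBAARoot("msan TBAA");
  llvm::MDNode *ShadowTy = MDB.createTBAANode("msan shadow", Root);
  llvm::MDNode *OriginTy = MDB.createTBAANode("msan origin", Root);
  ShadowStore->setMetadata(llvm::LLVMContext::MD_tbaa, ShadowTy);
  OriginLoad->setMetadata(llvm::LLVMContext::MD_tbaa, OriginTy);
}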

Original issue reported on code.google.com by euge...@google.com on 19 Nov 2012 at 12:43

GoogleCodeExporter commented 9 years ago
+1 for using TBAA. I can handle it later, once the rest of the code is in. 

Original comment by kcc@chromium.org on 19 Nov 2012 at 12:48

GoogleCodeExporter commented 9 years ago
Hmm... what if you just explicitly move the load below the branch? That looks
like the way to go; then the compiler does not need to figure out anything.
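
In source-level terms that would mean emitting the origin TLS load inside the
cold branch rather than up front, along these lines (a sketch, using the same
illustrative TLS names and address constants as the rendering in the first
comment):

#include <cstdint>

extern thread_local std::uint32_t msan_param_shadow_slot;  // illustrative
extern thread_local std::uint32_t msan_param_origin_slot;  // illustrative

void f_instrumented_sunk(int* p, int x) {
  auto* shadow_p = reinterpret_cast<std::uint32_t*>(
      reinterpret_cast<std::uintptr_t>(p) & 0xffffbfffffffffffULL);
  std::uint32_t shadow = msan_param_shadow_slot;
  *shadow_p = shadow;
  if (shadow != 0) {
    // The origin TLS load is emitted here, so it never runs on the common
    // clean-shadow path.
    std::uint32_t origin = msan_param_origin_slot;
    auto* origin_p = reinterpret_cast<std::uint32_t*>(
        (reinterpret_cast<std::uintptr_t>(shadow_p) + 0x200000000000ULL) & ~3ULL);
    *origin_p = origin;
  }
  *p = x;
}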

Original comment by dvyu...@google.com on 19 Nov 2012 at 12:56

GoogleCodeExporter commented 9 years ago
TBAA is the general mechanism; if we can rely on it and save a bit of code in
our pass, we should.

Original comment by konstant...@gmail.com on 19 Nov 2012 at 1:04

GoogleCodeExporter commented 9 years ago
Yes, there are probably a lot of other optimization opportunities suppressed
by this issue. ASan may benefit from TBAA, too.

Original comment by euge...@google.com on 19 Nov 2012 at 1:14

GoogleCodeExporter commented 9 years ago
It seems that LLVM (and GCC, too) simply does not perform this optimization.
Here is an example:

int f(bool z, long long* y, long long* y1, int* p, int* p1) {
  long long x = *y;
  *p1 = 0;
  if (__builtin_expect(z, 0)) {
    *y1 = x;
  }
  return 2;
}

I could not make either compiler move the load of *y under the if().
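
For comparison, the transformed form one would hope for, with the load sunk
into the unlikely branch, would look like this (a sketch; the sink is only
legal if the compiler can prove that the store through p1 cannot modify *y,
which is exactly what strict-aliasing/TBAA information should let it conclude
here):

int f_sunk(bool z, long long* y, long long* y1, int* p, int* p1) {
  *p1 = 0;
  if (__builtin_expect(z, 0)) {
    long long x = *y;  // load of *y sunk into the cold branch
    *y1 = x;
  }
  return 2;
}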

Original comment by euge...@google.com on 22 Nov 2012 at 11:12