gopalshankar / address-sanitizer

Automatically exported from code.google.com/p/address-sanitizer
0 stars 0 forks source link

NS_IsMainThread hits "heap-buffer-overflow read of size 4" that is "located 0 bytes inside of 4-byte region" #204

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. 
https://developer.mozilla.org/en-US/docs/Developer_Guide/Build_Instructions/Linu
x_Prerequisites
2. hg clone -r c5ce065936fa https://hg.mozilla.org/mozilla-central/
3. MOZCONFIG=mozconfig-asan make -f client.mk
4. Try to run Firefox

What is the expected output? What do you see instead?

==21415==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x60200001cdb0 at pc 0x7f16c7c39a3d bp 0x7fff8884cc60 sp 0x7fff8884cc58
READ of size 4 at 0x60200001cdb0 thread T0
    #0 0x7f16c7c39a3c in _Z15NS_IsMainThreadv /home/jruderman/trees/mozilla-central/obj-firefox-asan/xpcom/ds/../../dist/include/nsThreadUtils.h:104
...
[callers vary from run to run]
...
0x60200001cdb0 is located 0 bytes inside of 4-byte region 
[0x60200001cdb0,0x60200001cdb4)
allocated by thread T0 here:
    #0 0x441f5b in __interceptor_memalign _asan_rtl_
    #1 0x7f16d34a1364 in allocate_and_init /build/buildd/eglibc-2.15/elf/dl-tls.c:526

=>0x0c047fffb9b0: fa fa 00 04 fa fa[04]fa fa fa 00 fa fa fa 00 fa

What version of the product are you using? On what operating system?

LLVM r185246 on Ubuntu Linux in a Mozilla datacenter

Please provide any additional information below.

decoder hit a similar problem when he tried to apply the patch from 
http://code.google.com/p/address-sanitizer/issues/detail?id=193 to the 3.3 
branch.

NS_IsMainThread compares a value in thread-local storage to an enum value:

http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsThreadUtils.h#102
http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsThreadIDs.h

If I build Firefox debug, I hit this in NS_IsCycleCollectorThread instead of 
NS_IsMainThread.

Original issue reported on code.google.com by jruder...@gmail.com on 29 Jun 2013 at 2:33

Attachments:

GoogleCodeExporter commented 9 years ago
Also happens with r183646, the revision before the patch for 
http://code.google.com/p/address-sanitizer/issues/detail?id=193.  So I guess 
it's not a regression from that.

Original comment by jruder...@gmail.com on 29 Jun 2013 at 8:45

GoogleCodeExporter commented 9 years ago
This keeps popping up with our builds using LLVM r185949:

==5329==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000400d0 
at pc 0x7fb4e2f4ba5d bp 0x7fff67ab42a0 sp 0x7fff67ab4298
READ of size 4 at 0x6020000400d0 thread T0
[...]
0x6020000400d0 is located 0 bytes inside of 4-byte region 
[0x6020000400d0,0x6020000400d4)
[...]
allocated by thread T0 here:
    #0 0x4461fb in __interceptor_memalign _asan_rtl_
    #1 0x7fb4f1743364 in allocate_and_init /build/buildd/eglibc-2.15/elf/dl-tls.c:526
[...]
=>0x0c0480000010: fa fa 00 04 fa fa 00 04 fa fa[04]fa fa fa 00 fa

I totally don't understand the trace though. Can you guys help out here?

Original comment by decoder...@googlemail.com on 18 Jul 2013 at 3:53

GoogleCodeExporter commented 9 years ago
Weird indeed. 
I can easily reproduce the failure. 
The disasm: 
Dump of assembler code for function 
nsObserverService::AddObserver(nsIObserver*, char const*, bool):
...
   0x00007fffec7466ba <+26>:    callq  0x7fffe80d4f90 <__tls_get_addr@plt>
   0x00007fffec7466bf <+31>:    mov    %rax,%rcx
   0x00007fffec7466c2 <+34>:    lea    0x0(%rcx),%rcx
   0x00007fffec7466c9 <+41>:    shr    $0x3,%rcx
   0x00007fffec7466cd <+45>:    mov    0x7fff8000(%rcx),%dl
   0x00007fffec7466d3 <+51>:    test   %dl,%dl
   0x00007fffec7466d5 <+53>:    je     0x7fffec7466f5 <nsObserverService::AddObserver(nsIObserver*, char const*, bool)+85>
   0x00007fffec7466d7 <+55>:    mov    %rax,%rcx
=> 0x00007fffec7466da <+58>:    movabs $0x8d0ec84,%rsi
   0x00007fffec7466e4 <+68>:    add    %ecx,%esi
   0x00007fffec7466e6 <+70>:    and    $0x7,%esi
   0x00007fffec7466e9 <+73>:    add    $0x3,%esi
   0x00007fffec7466ec <+76>:    cmp    %dl,%sil
   0x00007fffec7466ef <+79>:    jge    0x7fffec74680e <nsObserverService::AddObserver(nsIObserver*, char const*, bool)+366>

The instruction "movabs $0x8d0ec84,%rsi" is shocking! 
Will continue investigating. 

Original comment by konstant...@gmail.com on 19 Jul 2013 at 9:42

GoogleCodeExporter commented 9 years ago
LLVM IR looks sane: 
  %0 = load i8* inttoptr (i64 add (i64 lshr (i64 ptrtoint (i32* @gTLSThreadID to i64), i64 3), i64 2147450880) to i8*), !dbg !3298
  %1 = icmp ne i8 %0, 0, !dbg !3298
  br i1 %1, label %2, label %5

; <label>:2                                       ; preds = %entry
  %3 = icmp sge i8 trunc (i64 add (i64 and (i64 ptrtoint (i32* @gTLSThreadID to i64), i64 7), i64 3) to i8), %0
  br i1 %3, label %4, label %5

; <label>:4                                       ; preds = %2
  call void @__asan_report_load4(i64 ptrtoint (i32* @gTLSThreadID to i64)), !dbg !3298

Original comment by konstant...@gmail.com on 19 Jul 2013 at 10:18

GoogleCodeExporter commented 9 years ago
minimal repro: 
% head x.cc  main.cc 
==> x.cc <==
#pragma GCC visibility push(hidden)
extern __thread int x;
int foo() {
  return x;
}

==> main.cc <==
__thread int x;
int foo();
int main() {
  return foo();
}
% clang -O -fsanitize=address x.cc main.cc -fPIC 
% objdump -d a.out | less

000000000043ca40 <_Z3foov>:
  43ca40:       50                      push   %rax
  43ca41:       66 66 66 64 48 8b 04    data32 data32 data32 mov %fs:0x0,%rax
  43ca48:       25 00 00 00 00 
  43ca4d:       48 89 c1                mov    %rax,%rcx
  43ca50:       48 8d 89 fc ff ff ff    lea    -0x4(%rcx),%rcx
  43ca57:       48 c1 e9 03             shr    $0x3,%rcx
  43ca5b:       8a 91 00 80 ff 7f       mov    0x7fff8000(%rcx),%dl
  43ca61:       84 d2                   test   %dl,%dl
  43ca63:       74 1a                   je     43ca7f <_Z3foov+0x3f>
  43ca65:       48 89 c1                mov    %rax,%rcx
  43ca68:       48 be ac c3 64 00 00    movabs $0x64c3ac,%rsi

Original comment by konstant...@gmail.com on 19 Jul 2013 at 11:06

GoogleCodeExporter commented 9 years ago
Filed upstream bug: http://llvm.org/bugs/show_bug.cgi?id=16660

Original comment by konstant...@gmail.com on 19 Jul 2013 at 11:59

GoogleCodeExporter commented 9 years ago
The bug seems to be in the LLVM code gen.
While you are waiting for the fix, you may want to have a workaround. 
I managed to start firefox and browse a few pages using this patch: 

--- a/xpcom/glue/nsCycleCollectorUtils.h        Fri Jun 28 18:26:53 2013 -0700
+++ b/xpcom/glue/nsCycleCollectorUtils.h        Mon Jul 22 17:34:12 2013 +0400
@@ -14,7 +14,8 @@
 #elif defined(NS_TLS)
 // Defined in nsThreadManager.cpp.
 extern NS_TLS mozilla::threads::ID gTLSThreadID;
-inline bool NS_IsCycleCollectorThread()
+__attribute__((noinline,no_address_safety_analysis))
+static bool NS_IsCycleCollectorThread()
 {
   return gTLSThreadID == mozilla::threads::CycleCollector;
 }
diff -r c5ce065936fa xpcom/glue/nsThreadUtils.h
--- a/xpcom/glue/nsThreadUtils.h        Fri Jun 28 18:26:53 2013 -0700
+++ b/xpcom/glue/nsThreadUtils.h        Mon Jul 22 17:34:12 2013 +0400
@@ -99,7 +99,8 @@
 // This is defined in nsThreadManager.cpp and initialized to `Main` for the
 // main thread by nsThreadManager::Init.
 extern NS_TLS mozilla::threads::ID gTLSThreadID;
-inline bool NS_IsMainThread()
+__attribute__((noinline,no_address_safety_analysis))
+static bool NS_IsMainThread()
 {
   return gTLSThreadID == mozilla::threads::Main;
 }

Basically, make sure that the accesses to gTLSThreadID are not instrumented. 

Original comment by konstant...@gmail.com on 22 Jul 2013 at 1:37

GoogleCodeExporter commented 9 years ago
Is there any update on this? We plan to upgrade our Clang soon and it would be 
nice to include a fix for this.

Original comment by decoder...@googlemail.com on 9 Jan 2014 at 12:30

GoogleCodeExporter commented 9 years ago
Trying the minimal repro from http://llvm.org/bugs/show_bug.cgi?id=16660 : 
clang -O -fsanitize=address x.cc y.cc  -fPIC -shared -o x.so ; objdump -d x.so 
| less -pfoo
00000000000007b0 <_Z3foov>:
 7b0:   50                      push   %rax
 7b1:   48 8d 3d f8 07 20 00    lea    0x2007f8(%rip),%rdi        # 200fb0 <_DYNAMIC+0x1b0>
 7b8:   e8 f3 fe ff ff          callq  6b0 <__tls_get_addr@plt>
 7bd:   48 89 c1                mov    %rax,%rcx
 7c0:   48 8d 89 00 00 00 00    lea    0x0(%rcx),%rcx
 7c7:   48 c1 e9 03             shr    $0x3,%rcx
 7cb:   8a 91 00 80 ff 7f       mov    0x7fff8000(%rcx),%dl
 7d1:   84 d2                   test   %dl,%dl
 7d3:   74 19                   je     7ee <_Z3foov+0x3e>
 7d5:   48 c7 c6 00 00 00 00    mov    $0x0,%rsi
 7dc:   48 89 c1                mov    %rax,%rcx
 7df:   01 ce                   add    %ecx,%esi
 7e1:   83 e6 07                and    $0x7,%esi
 7e4:   83 c6 03                add    $0x3,%esi
 7e7:   0f be ca                movsbl %dl,%ecx
 7ea:   39 ce                   cmp    %ecx,%esi
 7ec:   7d 08                   jge    7f6 <_Z3foov+0x46>
 7ee:   8b 80 00 00 00 00       mov    0x0(%rax),%eax
 7f4:   5a                      pop    %rdx
 7f5:   c3                      retq   
 7f6:   48 8d 80 00 00 00 00    lea    0x0(%rax),%rax
 7fd:   48 89 c7                mov    %rax,%rdi
 800:   e8 9b fe ff ff          callq  6a0 <__asan_report_load4@plt>
 805:   66 66 2e 0f 1f 84 00    data32 nopw %cs:0x0(%rax,%rax,1)

The code looks sane now (much better than before). 
Can you try on Firefox? 

(I did nothing to fix this, probably got fixed independently)

Original comment by konstant...@gmail.com on 9 Jan 2014 at 7:35