DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

[AArch64] Crash with low shared_ibt_table_bb_init value #4665

Closed abhinav92003 closed 3 years ago

abhinav92003 commented 3 years ago

While working on PR #4638, I found that ibl-stress.c built with -DTEST_FAR_LINK_AARCH64 crashes on AArch64 with -shared_ibt_table_bb_init 16 and 18, but works with 20. Similar behaviour was observed on a proprietary app, where -shared_ibt_table_bb_init had to be raised to avoid a crash. This was observed only with large apps, which is why -DTEST_FAR_LINK_AARCH64 is required to reproduce the crash.

abhinav92003 commented 3 years ago
$ dynamorio/bin64/runstats "-s" "9000" "-killpg" "-silent" "-env" "LD_LIBRARY_PATH" "dynamorio/lib64/debug:dynamorio/ext/lib64/debug:" "-env" "DYNAMORIO_OPTIONS" "-dumpcore_mask 0 -disable_traces -shared_bb_ibt_tables -shared_ibt_table_bb_init 16  -checklevel 0  -code_api" "dynamorio/suite/tests/bin/api.ibl-stress"
<Starting application dynamorio/suite/tests/bin/api.ibl-stress (34983)>
<Initial options = -checklevel 0 -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -shared_bb_ibt_tables -shared_ibt_table_bb_init 16 -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<pre-building code cache from 1050001 tags>
<all_memory_areas is missing regions including 0x0000fffd9561a000-0x0000fffd986be000>
<Application tried to execute from unreadable memory 0xf9402381f9400b82.
This may be a result of an unsuccessful attack or a potential application vulnerability.>
<Application dynamorio/suite/tests/bin/api.ibl-stress (34983).  Application exception at PC 0xf9402381f9400b82.  
Signal 11 delivered to application as default action.
Callstack:
    0xf9402381f9400b82  
    0x0000ffff98c7a0a0   </lib/aarch64-linux-gnu/libpthread-2.24.so+0x70a0>
    0x0000ffff98788eac   </lib/aarch64-linux-gnu/libc-2.24.so+0xc7eac>
>
<Stopping application dynamorio/suite/tests/bin/api.ibl-stress (34983)>
abhinav92003 commented 3 years ago

The crash also reproduces with the 'regular' version of the ibl-stress test in the suite when no value is set for shared_ibt_table_bb_init.

abhinav92003 commented 3 years ago

At the point where the message "Application tried to execute from unreadable memory 0xf9402381f9400b82" is logged, the stack trace is:

#0  my_breakpoint (dcontext=0xfffdb8957600) at /home/abhinavas/dr/src/i4665-1/core/vmareas.c:654 // my own function added for breakpoint
#1  0x0000ffffb7c1a1c8 in check_thread_vm_area (dcontext=0xfffdb8957600, pc=0xf9402381f9400b82 <error: Cannot access memory at address 0xf9402381f9400b82>, 
    tag=0xf9402381f9400b82 <error: Cannot access memory at address 0xf9402381f9400b82>, vmlist=0xfffdb8aa0e10, flags=0xfffdb8aa0e08, stop=0xfffdb8aa0e58, xfer=false) at /home/abhinavas/dr/src/i4665-1/core/vmareas.c:7775
#2  0x0000ffffb7d65858 in check_new_page_start (dcontext=0xfffdb8957600, bb=0xfffdb8aa0dd0) at /home/abhinavas/dr/src/i4665-1/core/arch/interp.c:719
#3  0x0000ffffb7d6a19c in build_bb_ilist (dcontext=0xfffdb8957600, bb=0xfffdb8aa0dd0) at /home/abhinavas/dr/src/i4665-1/core/arch/interp.c:3315
#4  0x0000ffffb7d718e8 in build_basic_block_fragment (dcontext=0xfffdb8957600, start=0xf9402381f9400b82 <error: Cannot access memory at address 0xf9402381f9400b82>, initial_flags=0, link=true, visible=true, 
    for_trace=false, unmangled_ilist=0x0) at /home/abhinavas/dr/src/i4665-1/core/arch/interp.c:5128
#5  0x0000ffffb7b2ce44 in d_r_dispatch (dcontext=0xfffdb8957600) at /home/abhinavas/dr/src/i4665-1/core/dispatch.c:214
#6  0x0000aaaaaaaac96c in wait_cond_var (var=0xaaaaaaac8010) at /home/abhinavas/dr/src/i4665-1/suite/tests/condvar.h:135

At this point dcontext->next_tag is the bad address 0xf9402381f9400b82, and dcontext->last_exit->flags is 0x2000, which is the value of LINK_FAKE.

abhinav92003 commented 3 years ago

A couple of other observations:

  • the last_exit in the comment above is linkstub_syscall.
  • the unreadable memory seems to always be 0xf9402381f9400b82, even in another application.

derekbruening commented 3 years ago

Is the syscall sigreturn? Stating the obvious, I guess: something is corrupted, producing a non-canonical target PC. Or at least it would be non-canonical on x86; I'm not sure what A64 requires in the top bits.
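
To make the "top bits" question concrete, here is a minimal sketch of an address sanity check. It assumes a 48-bit virtual address configuration in which bits 63:48 must be all zeros or all ones, and it does not model top-byte-ignore or pointer tagging. The function name is hypothetical and not part of DynamoRIO.

```c
#include <stdint.h>

/* Hypothetical helper, not DynamoRIO code.
 * Assumption: 48-bit virtual addresses, so bits 63:48 are either all
 * zeros or all ones. Top-byte-ignore and pointer tagging are not modelled. */
static int is_plausible_va48(uint64_t addr)
{
    uint64_t top = addr >> 48; /* bits 63:48 */
    return top == 0x0000 || top == 0xffff;
}
```

Under that assumption the reported PC 0xf9402381f9400b82 is rejected (its top 16 bits are 0xf940), while a library address from the callstack such as 0x0000ffff98c7a0a0 passes.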

abhinav92003 commented 3 years ago

It is syscall 98, which is futex.

fcache_enter = 0x0000aaaa9d0a3b80, target = 0x0000aaaa9d1a36e0
Exit from F-113245152(0xabababab17fffff8).0xc8dffc2191041c81 (shared)
 (block ends with syscall)
Entry into do_syscall to execute a non-ignorable system call
system call 98
fcache_enter = 0x0000aaaa9d0a3b80, target = 0x0000aaaa9d0a4280
Exit from system call
post syscall: sysnum=0x0000000000000062, result=0xfffffffffffffff2 (-14)
finished handling system call

d_r_dispatch: target = 0xf9402381f9400b82
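
For reference, the raw result in the log above can be decoded with a few lines of C. This is a sketch that assumes the usual Linux convention of returning a negated errno in the range -4095..-1. The helper name is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, not DynamoRIO code.
 * Assumption: Linux raw syscall results in [-4095, -1] are negated errno
 * values. Returns the errno, or 0 if the result is not in the error range. */
static int raw_result_to_errno(uint64_t raw)
{
    int64_t s = (int64_t)raw; /* reinterpret as two's complement */
    if (s < 0 && s >= -4095)
        return (int)-s;
    return 0;
}
```

Applied to 0xfffffffffffffff2 this yields 14, which is EFAULT on Linux, so the futex call itself failed with a bad-address error before dispatch saw the bad target.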
abhinav92003 commented 3 years ago

Trying to trace back where the bad target comes from. Before handle_system_call, next_tag is set by the code at
https://github.com/DynamoRIO/dynamorio/blob/da71023e38154e2ad136c51c8945a3bde8d1ce9f/core/dispatch.c#L838
There, EXIT_TARGET_TAG sees linkstub_normal_direct=1 and returns ((direct_linkstub_t *)(l))->target_tag.

dcontext->last_fragment looks a bit odd, though: lastf=aaaa6aabcef4, lastf->tag=abababab17fffff8, lastf->flags=abababab, lastf->size=3968, lastf->start_pc=c8dffc2191032381. gdb cannot read the contents at either the tag location or the start_pc.

derekbruening commented 3 years ago

An 0xab repeated pattern is used in DR debug build to fill allocated (uninitialized) heap space. (Similarly 0xcd repeated indicates use-after-free.)
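
A quick way to spot these fill patterns in dumped values is sketched below. The enum and function names are hypothetical; only the 0xab and 0xcd byte patterns come from the comment above.

```c
#include <stdint.h>

/* Hypothetical names, not DynamoRIO code. */
typedef enum { FILL_NONE, FILL_UNINIT, FILL_FREED } fill_kind_t;

/* Classify one 32-bit word against the debug-build fill bytes. */
static fill_kind_t classify_word(uint32_t w)
{
    if (w == 0xababababu)
        return FILL_UNINIT; /* allocated but never written */
    if (w == 0xcdcdcdcdu)
        return FILL_FREED;  /* already freed */
    return FILL_NONE;
}

/* Classify the upper 32 bits of a 64-bit field. */
static fill_kind_t classify_high_half(uint64_t v)
{
    return classify_word((uint32_t)(v >> 32));
}
```

With the values from the previous comment, lastf->flags (0xabababab) and the upper half of lastf->tag (0xabababab17fffff8) both classify as uninitialized heap fill.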

abhinav92003 commented 3 years ago

I noticed that EXIT_TARGET_TAG (before the handle_system_call call) always sees dcontext->last_exit as 0xaaaa6aabd054. next_tag and last_fragment both are computed based on this value by that part of code. dcontext->last_exit should presumably be some linkstub, but it seems to be the address of the following gencode.

shared_delete_bb_ibl_indjmp:
  0x0000aaaa6aabd054  f9001785   str    %x5 -> +0x28(%x28)[8byte]
  0x0000aaaa6aabd058  f9401f85   ldr    +0x38(%x28)[8byte] -> %x5
  0x0000aaaa6aabd05c  f901aca2   str    %x2 -> +0x0358(%x5)[8byte]
  0x0000aaaa6aabd060  f9401785   ldr    +0x28(%x28)[8byte] -> %x5
  0x0000aaaa6aabd064  f9400b82   ldr    +0x10(%x28)[8byte] -> %x2
  0x0000aaaa6aabd068  f9402381   ldr    +0x40(%x28)[8byte] -> %x1
  0x0000aaaa6aabd06c  d61f0020   br     %x1
...
shared_bb_far_unlinked_indjmp:
  0x0000aaaa6aabd074  17fffff8   b      $0x0000aaaa6aabd054
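
The listing above also explains the constant bad PC. Reading 8 bytes at 0x0000aaaa6aabd064 (that is, last_exit + 0x10) on a little-endian machine combines the two instruction words f9400b82 and f9402381. The sketch below only reproduces that arithmetic; the helper name is hypothetical, and the idea that target_tag is read from that offset is an inference from the dump, not something confirmed in the thread.

```c
#include <stdint.h>

/* Hypothetical helper, not DynamoRIO code.
 * Combine two consecutive 32-bit words as a little-endian 64-bit load
 * would: the word at the lower address becomes the low half. */
static uint64_t combine_le_words(uint32_t low_addr_word, uint32_t high_addr_word)
{
    return ((uint64_t)high_addr_word << 32) | low_addr_word;
}
```

combine_le_words(0xf9400b82, 0xf9402381) gives 0xf9402381f9400b82, the address reported in every crash. The same arithmetic with 0x17fffff8 (the b instruction at 0x0000aaaa6aabd074) and an 0xabababab fill word gives the lastf->tag value seen earlier, although the dump does not show what follows that instruction.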
abhinav92003 commented 3 years ago

With a high shared_ibt_table_bb_init, the buggy path seems to be avoided.

With a low value of shared_ibt_table_bb_init, hashtable_ibl_resized_custom gets invoked to grow the IBT. During hashtable_ibl_resized_custom, safely_nullify_tables sets the payload of every entry to the target_delete routine to induce a cache exit. This routine is supposed to set dcontext->next_tag to the indirect branch target, but it does not seem to be working as intended.
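
The intended contract can be sketched with a toy model. Everything below is hypothetical: the types, the names, and the linear scan standing in for the real hash lookup are illustration only and do not reflect DynamoRIO's actual data structures or gencode.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these types or names are DynamoRIO's. */
typedef struct {
    uintptr_t next_tag; /* where dispatch resumes after a cache exit */
} toy_ctx_t;

typedef void (*toy_target_fn)(toy_ctx_t *ctx, uintptr_t branch_target);

typedef struct {
    uintptr_t tag;       /* application address used as the lookup key */
    toy_target_fn start; /* code to jump to on a hit (the "payload") */
} toy_entry_t;

/* Normal hit: execution stays in the cache, so next_tag is untouched. */
static void toy_fragment(toy_ctx_t *ctx, uintptr_t branch_target)
{
    (void)ctx;
    (void)branch_target;
}

/* Delete stub: forces a cache exit and must record the branch target so
 * that dispatch can look it up again in the resized table. */
static void toy_target_delete(toy_ctx_t *ctx, uintptr_t branch_target)
{
    ctx->next_tag = branch_target;
}

/* On resize, point every live entry of the old table at the delete stub. */
static void toy_nullify(toy_entry_t *table, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].tag != 0)
            table[i].start = toy_target_delete;
    }
}

/* Lookup as seen by a thread still using the old table.
 * Returns 1 on a hit, 0 on a miss. */
static int toy_lookup(toy_entry_t *table, size_t n, toy_ctx_t *ctx,
                      uintptr_t target)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].tag == target) {
            table[i].start(ctx, target);
            return 1;
        }
    }
    return 0;
}
```

In this model, a lookup that hits a nullified entry leaves ctx->next_tag equal to the branch target. The bug described here corresponds to the stub exiting the cache without next_tag holding that value.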

derekbruening commented 3 years ago

During hashtable_ibl_resized_custom, safely_nullify_tables sets the payload of all entries to the target_delete routine to induce a cache exit. This routine is supposed to set dcontext->next_tag to the indirect branch target. But it doesn't seem to be WAI.

Is this a regression due to PR #4638 and the original version of the IBL was correct?

abhinav92003 commented 3 years ago

Is this a regression due to PR #4638 and the original version of the IBL was correct?

No, it is not a regression. The issue existed even before that PR. Enabling ibl-stress in #4638 helped discover the issue.

derekbruening commented 3 years ago

Is this a regression due to PR #4638 and the original version of the IBL was correct?

No, it is not a regression. The issue existed even before that PR. Enabling ibl-stress in #4638 helped discover the issue.

I'm looking at the version before #4638 and it seems very similar to the first part of the patch here: i.e., this seems in some part to be restoring what was there before, at least at first glance. Have not had time to understand all the details yet though.

abhinav92003 commented 3 years ago

A quick summary of changes in the IBL hit path:

Before #4638:

ldp x0, x2, [x28]
str x0, [dr_reg_stolen, TLS_REG1_SLOT] # x0 restored later from TLS_REG1_SLOT by fragment prefix
ldr x0, [x1, #start_pc_fragment_offset]
mov x1, x2                             # restore x1's app value
ldr x2, [dr_reg_stolen, TLS_REG2_SLOT] # restore x2's app value
br x0

After #4638: the restore of x1 and the store of x0 to the TLS slot were removed. The fragment prefix restores those now.

ldr x2, [dr_reg_stolen, TLS_REG2_SLOT] # restore x2's app value
ldr x0, [x1, #start_pc_fragment_offset]
br x0

After #4699: only adds a save of the indirect branch target value to x1.

ldr x0, [x1, #start_pc_fragment_offset]
mov x1, x2                             # save next_tag to x1
ldr x2, [dr_reg_stolen, TLS_REG2_SLOT] # restore x2's app value
br x0

The mov x1, x2 instruction has come back, but it fulfils a different purpose than before.

abhinav92003 commented 3 years ago

Also, I ran ibl-stress on a checkout before #4638 (eb57ba95), and it fails without the -shared_ibt_table_bb_init 16 workaround, with the same error.

$ dynamorio/bin64/runstats "-s" "90" "-killpg" "-silent" "-env" "LD_LIBRARY_PATH" "dynamorio/lib64/debug:dynamorio/ext/lib64/debug:" "-env" "DYNAMORIO_OPTIONS" "-dumpcore_mask 0 -disable_traces -shared_bb_ibt_tables -checklevel 0 -code_api" "dynamorio/suite/tests/bin/api.ibl-stress"
<Starting application dynamorio/suite/tests/bin/api.ibl-stress (22868)>
<Initial options = -checklevel 0 -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -shared_bb_ibt_tables -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<pre-building code cache from 7001 tags>
<all_memory_areas is missing regions including 0x0000fffd964a6000-0x0000fffd964fa000>
<Application tried to execute from unreadable memory 0xf9402381f9400b82.
This may be a result of an unsuccessful attack or a potential application vulnerability.>
<Application dynamorio/suite/tests/bin/api.ibl-stress (22868).  Application exception at PC 0xf9402381f9400b82.  
Signal 11 delivered to application as default action.
Callstack:
    0xf9402381f9400b82  
    0x0000aaaabf937888   <dynamorio/suite/tests/bin/api.ibl-stress+0x3888>
    0x0000ffff96abb0a0   </lib/aarch64-linux-gnu/libpthread-2.24.so+0x70a0>
    0x0000ffff965c4eac   </lib/aarch64-linux-gnu/libc-2.24.so+0xc7eac>
>
<Stopping application dynamorio/suite/tests/bin/api.ibl-stress (22868)>