yjugl commented 3 weeks ago

ntdll

ntdll.dll 10.0.26100.2161 from Windows 11 24H2 26100.2161 differs from previous versions of ntdll.dll in a subtle way. [edit: In fact, all Windows 11 24H2 builds since 26100.1 have this difference. Thank you Gov Maharaj for figuring this out.]

Previously, RtlDispatchException would reach into the vectored exception handlers almost directly. Now two stack buffers, of sizes 0x50 and 0xD8 respectively, are memset to zero before the vectored exception handlers are called:

// 10.0.26100.2161
    ntdll!RtlDispatchException:
push    rbp
push    rsi
push    rdi
push    r12
push    r13
push    r14
push    r15
sub     rsp, 210h
lea     rbp, [rsp+60h]
mov     qword ptr [rbp+200h], rbx
mov     rax, qword ptr [ntdll!__security_cookie]
xor     rax, rbp
mov     qword ptr [rbp+1A0h], rax
xor     esi, esi
mov     r15, rdx
mov     rdi, rcx
mov     dword ptr [rbp+20h], esi
xor     edx, edx
lea     rcx, [rbp+50h]
lea     r8d, [rsi+50h]
  // memset(foo, 0, 0x50)
call    ntdll!memset$thunk$772440563353939046
xor     edx, edx
mov     byte ptr [rbp], sil
mov     r8d, 0D8h
mov     qword ptr [rbp+8], rsi
lea     rcx, [rbp+0C0h]
mov     qword ptr [rbp+10h], rsi
mov     qword ptr [rbp+18h], rsi
mov     qword ptr [rbp+48h], rsi
mov     qword ptr [rbp+28h], rsi
mov     qword ptr [rbp+40h], rsi
  // memset(bar, 0, 0xD8)
call    ntdll!memset$thunk$772440563353939046
mov     rax, qword ptr gs:[60h]
test    dword ptr [rax+0BCh], 800000h
je      ntdll!RtlDispatchException+0xa0
cmp     qword ptr [ntdll!RtlpExceptionLog2], rsi
mov     byte ptr [rbp], 1
jne     ntdll!RtlDispatchException+0x609
xor     r8d, r8d
mov     rdx, r15
mov     rcx, rdi
call    ntdll!RtlpCallVectoredHandlers

Now let me detail why we care here.

memset interception with ASAN

When compiler-rt ASAN instrumentation is in place, memset is replaced for instrumentation purposes. So any memset will go through:

#define ASAN_MEMSET_IMPL(ctx, block, c, size) \
  do {                                        \
    if (LIKELY(replace_intrin_cached)) {      \
      ASAN_WRITE_RANGE(ctx, block, size);     \
    } else if (UNLIKELY(!AsanInited())) {     \
      return internal_memset(block, c, size); \
    }                                         \
    return REAL(memset)(block, c, size);      \
  } while (0)

#define ASAN_WRITE_RANGE(ctx, offset, size) \
  ACCESS_MEMORY_RANGE(ctx, offset, size, true)

#define ACCESS_MEMORY_RANGE(ctx, offset, size, isWrite)                   \
  do {                                                                    \
    uptr __offset = (uptr)(offset);                                       \
    uptr __size = (uptr)(size);                                           \
    uptr __bad = 0;                                                       \
    if (UNLIKELY(__offset > __offset + __size)) {                         \
      GET_STACK_TRACE_FATAL_HERE;                                         \
      ReportStringFunctionSizeOverflow(__offset, __size, &stack);         \
    }                                                                     \
    if (UNLIKELY(!QuickCheckForUnpoisonedRegion(__offset, __size)) &&     \
        (__bad = __asan_region_is_poisoned(__offset, __size))) {          \
      AsanInterceptorContext *_ctx = (AsanInterceptorContext *)ctx;       \
      bool suppressed = false;                                            \
      if (_ctx) {                                                         \
        suppressed = IsInterceptorSuppressed(_ctx->interceptor_name);     \
        if (!suppressed && HaveStackTraceBasedSuppressions()) {           \
          GET_STACK_TRACE_FATAL_HERE;                                     \
          suppressed = IsStackTraceSuppressed(&stack);                    \
        }                                                                 \
      }                                                                   \
      if (!suppressed) {                                                  \
        GET_CURRENT_PC_BP_SP;                                             \
        ReportGenericError(pc, bp, sp, __bad, isWrite, __size, 0, false); \
      }                                                                   \
    }                                                                     \
  } while (0)

In particular, __asan_region_is_poisoned will access the shadow memory corresponding to the region that we memset, in order to check if the region is poisoned.
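To make the shadow lookup concrete, here is a minimal sketch of the address arithmetic, assuming ASAN's usual 8-to-1 mapping (shadow = (addr >> 3) + offset). The constant kAssumedShadowOffset and the helper names MemToShadow and ShadowRangeFor are made up for this illustration; the real offset is determined by the runtime.

```cpp
#include <cassert>
#include <cstdint>

// ASAN maps every 8 application bytes to 1 shadow byte.
const std::uint64_t kShadowScale = 3;
// Hypothetical value, chosen only so the example produces concrete numbers.
const std::uint64_t kAssumedShadowOffset = 0x100000000000ULL;

// Address of the shadow byte that describes application address `addr`.
inline std::uint64_t MemToShadow(std::uint64_t addr) {
  return (addr >> kShadowScale) + kAssumedShadowOffset;
}

struct ShadowRange {
  std::uint64_t first;  // shadow byte covering the first application byte
  std::uint64_t last;   // shadow byte covering the last application byte
};

// Shadow bytes that a poison check of [addr, addr + size) has to read.
inline ShadowRange ShadowRangeFor(std::uint64_t addr, std::uint64_t size) {
  assert(size > 0);
  ShadowRange r = {MemToShadow(addr), MemToShadow(addr + size - 1)};
  return r;
}
```

For a 0xD8-byte buffer that starts at an 8-byte-aligned address, ShadowRangeFor covers 27 shadow bytes (0xD8 / 8). If any of those bytes sits on a shadow page that is not yet committed, reading it raises an access violation.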

Shadow memory lazy commit on Win64

On Win64, shadow memory pages are first allocated as MEM_RESERVE and are turned into MEM_COMMIT on demand. This means that we rely on an exception handler, ShadowExceptionHandler, to commit a reserved shadow memory page the first time an access to it fails because the page is not yet committed. This change was pushed by this revision.
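The on-demand commit scheme can be sketched with a small portable model. This is not the compiler-rt code: no memory is actually reserved, a std::set stands in for the page state, and LazyShadowModel, Read, and HandleFault are hypothetical names. HandleFault plays the role of ShadowExceptionHandler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

const std::uint64_t kModelPageSize = 4096;

// Portable model of lazily committed shadow memory.
class LazyShadowModel {
 public:
  // Models reading one shadow byte. If its page is only "reserved", the
  // access faults, the handler commits the page, and the read is retried.
  std::uint8_t Read(std::uint64_t shadow_addr) {
    const std::uint64_t page = shadow_addr / kModelPageSize;
    if (committed_.count(page) == 0) {
      ++faults_;
      HandleFault(page);  // stands in for ShadowExceptionHandler
    }
    assert(committed_.count(page) == 1);
    return 0;  // freshly committed pages are zero-filled, i.e. unpoisoned
  }

  int faults() const { return faults_; }
  std::size_t committed_pages() const { return committed_.size(); }

 private:
  // Stands in for turning a MEM_RESERVE page into MEM_COMMIT.
  void HandleFault(std::uint64_t page) { committed_.insert(page); }

  std::set<std::uint64_t> committed_;
  int faults_ = 0;
};
```

In this model each shadow page faults at most once: the first read commits it and later reads of the same page proceed without involving the handler. The scheme only works if the handler can run without anything touching uncommitted shadow first.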

Putting it together

The Win64 memset interception in ASAN is incompatible with ntdll 10.0.26100.2161. As soon as a first access violation gets raised because a shadow memory page is reserved but not committed, we reach a call to memset in RtlDispatchException before we get a chance to reach the ShadowExceptionHandler. That call goes through the ASAN interceptor, which reads shadow memory that is not committed either, so it triggers a new access violation and a new call to memset, and so on. The cycle never ends until we eventually overflow the stack.

 # Child-SP          RetAddr               Call Site
00 0000003e`f3000fa0 00007ffb`03adf0da     clang_rt_asan_dynamic_x86_64!__asan_wrap_memset+0x18e [/builds/worker/fetches/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc @ 87] 
01 0000003e`f3001830 00007ffb`03c236de     ntdll!RtlDispatchException+0x4a
02 0000003e`f3001a80 00007ffa`8c4c8632     ntdll!KiUserExceptionDispatch+0x2e
03 (Inline Function) --------`--------     clang_rt_asan_dynamic_x86_64!__asan::AddressIsPoisoned+0xe [/builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_mapping.h @ 395] 
04 0000003e`f3002180 00007ffa`8c4c56a3     clang_rt_asan_dynamic_x86_64!__asan_region_is_poisoned+0xf2 [/builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_poisoning.cpp @ 189] 
05 0000003e`f30021e0 00007ffb`03adf0da     clang_rt_asan_dynamic_x86_64!__asan_wrap_memset+0x193 [/builds/worker/fetches/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc @ 87] 
06 0000003e`f3002a70 00007ffb`03c236de     ntdll!RtlDispatchException+0x4a
07 0000003e`f3002cc0 00007ffa`8c4c8632     ntdll!KiUserExceptionDispatch+0x2e
08 (Inline Function) --------`--------     clang_rt_asan_dynamic_x86_64!__asan::AddressIsPoisoned+0xe [/builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_mapping.h @ 395] 
09 0000003e`f30033c0 00007ffa`8c4c56a3     clang_rt_asan_dynamic_x86_64!__asan_region_is_poisoned+0xf2 [/builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_poisoning.cpp @ 189] 
// ...
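The cycle visible in the trace above can be reproduced in a small portable model. This is a sketch under simplifying assumptions, not the real dispatcher: a single flag represents the shadow page being committed, kDepthLimit stands in for running out of stack, and DispatchModel, ReadShadow, InterceptedMemset, and Dispatch are hypothetical names.

```cpp
#include <cassert>

const int kDepthLimit = 1000;  // stands in for exhausting the stack

struct DispatchModel {
  explicit DispatchModel(bool memset_first)
      : memset_before_handlers(memset_first) {}
  bool memset_before_handlers;   // true models the 24H2 RtlDispatchException
  bool shadow_committed = false; // state of the (single) shadow page
  bool overflowed = false;       // set once the depth limit is hit
  int depth = 0;                 // current nesting of Dispatch calls
  int max_depth_seen = 0;
};

void Dispatch(DispatchModel& m);

// Reading shadow faults for as long as the page is only reserved.
void ReadShadow(DispatchModel& m) {
  if (!m.shadow_committed && !m.overflowed) Dispatch(m);
}

// The interceptor checks shadow before forwarding to the real memset.
void InterceptedMemset(DispatchModel& m) { ReadShadow(m); }

// Models exception dispatch: optionally zero stack buffers first, then
// run the handler that commits the shadow page.
void Dispatch(DispatchModel& m) {
  if (m.depth == kDepthLimit) {
    m.overflowed = true;  // the real process would die here
    return;
  }
  ++m.depth;
  if (m.depth > m.max_depth_seen) m.max_depth_seen = m.depth;
  if (m.memset_before_handlers) InterceptedMemset(m);  // faults again
  if (!m.overflowed) m.shadow_committed = true;        // handler runs
  --m.depth;
  assert(m.depth >= 0);
}
```

With memset_before_handlers set to false, a call to ReadShadow dispatches once and the handler commits the page. With it set to true, every dispatch re-enters itself through InterceptedMemset before the handler can run, so the model hits kDepthLimit without ever committing the page.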

Related Firefox bug here.

yjugl commented 3 weeks ago

cc @rnk and @bergeret who reviewed the ShadowExceptionHandler patch

rnk commented 3 weeks ago

Hans (@zmodem) has been tackling a fair number of these issues recently.

Can we de-escalate the war of interception? What happens if we stop intercepting C string routines in ntdll? I'm sure we have some critical ntdll interceptors, but memset, str* and others are mostly bug-finding interceptors, not functional allocator rerouting interceptors. To me it is very clear that we should trade away some false negatives in third-party code to improve ASan reliability and robustness to Windows updates.

In general, this category of interceptor re-entrancy is a common sanitizer failure mode on all platforms, and it is why ASan avoids "libc" APIs on other platforms. It just so happens that on Windows, "libc" APIs are usually layered above win32 APIs, but that assumption has been violated after this change because the EH machinery (RtlDispatchException) calls memset.

mstorsjo commented 3 weeks ago

CC @barcharcraz who works on the sanitizers at MS, and who is upstreaming their currently downstream tweaks.

yjugl commented 3 weeks ago

The title and the first comment are misleading. I have only tested with 26100.2161, but Gov Maharaj from Microsoft correctly figured out that these changes are in fact present in all released versions of Windows 11 24H2 so far (starting with 26100.1). Updating accordingly.

zmodem commented 3 weeks ago

"What happens if we stop intercepting C string routines in ntdll?"

I'm not even sure that the current behavior was intentional? When intercepting a function, the asan runtime essentially looks for and intercepts all instances it can find. I'm guessing that it looks in ntdll because we want to intercept certain functions there, and ntdll's crt functions just got sucked into the general interception machinery.

If we stop intercepting those functions, I suppose we might lose some checks on win32 api functions though. For example, assuming that SetWindowText uses ntdll's strlen internally (I don't know if it does), today asan would flag if the string arg is poisoned memory. Not sure if this counts as a significant loss though.

Not intercepting the ntdll "minicrt" functions sounds reasonable to me, I'll take a look.

yjugl commented 3 weeks ago

Noting here that this patch, applied on top of the two patches from #111638, is enough to get an ASAN build of Firefox running. It works by making the memset instrumentation force-commit shadow memory before looking at it. This patch may not be the ideal solution to the problem, because any call to memset will now produce a call to VirtualAlloc even when the shadow memory is already committed. Still, I thought it would be worth sharing in case it helps someone get unstuck.

Edit: Link expired -- replaced by a link to Phabricator.
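The trade-off of the force-commit approach can be illustrated with a portable model. This is not the actual patch: ForceCommitModel, CheckedMemset, and ForceCommit are hypothetical names, shadow pages are plain indices, and ForceCommit stands in for an unconditional VirtualAlloc(MEM_COMMIT) call on the shadow range.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Portable model of a memset check that commits shadow before reading it.
class ForceCommitModel {
 public:
  // Models the check for a memset whose shadow spans the pages
  // [first_page, last_page] (inclusive page indices).
  void CheckedMemset(std::uint64_t first_page, std::uint64_t last_page) {
    assert(first_page <= last_page);
    ForceCommit(first_page, last_page);
    for (std::uint64_t p = first_page; p <= last_page; ++p) {
      // A missing page here would correspond to an access violation.
      if (committed_.count(p) == 0) ++faults_;
    }
  }

  int commit_calls() const { return commit_calls_; }
  int faults() const { return faults_; }

 private:
  // Called on every memset, whether or not the pages are already committed.
  void ForceCommit(std::uint64_t first_page, std::uint64_t last_page) {
    ++commit_calls_;
    for (std::uint64_t p = first_page; p <= last_page; ++p) {
      committed_.insert(p);
    }
  }

  std::set<std::uint64_t> committed_;
  int commit_calls_ = 0;
  int faults_ = 0;
};
```

In this model the shadow read never faults, so the exception dispatcher is never entered from memset. The cost is visible in commit_calls(), which grows by one per memset even when the same page is checked repeatedly.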

llvm / llvm-project

memset interception in compiler-rt asan is incompatible with ntdll.dll from Windows 11 24H2 #114793
