WriteXorExecute double mapping causes .NET apps to AV when native debuggers set breakpoints

noahfalk commented 2 months ago

Description

Using windbg (or VS) to set breakpoints on jitted code causes applications to hit access violations (AVs). The same app runs correctly under the debugger when no breakpoints are set. The underlying issue appears to be a failure in the double memory mapping used by WriteXorExecute when a debugger modifies the RX portion of the address space while setting breakpoints.

Reproduction Steps

1. Build a test app with this code.

using System;

internal class Program
{
    static void Main()
    {
        Action action = A;
        action += B;
        action();
    }

    static void A()
    {
        Console.WriteLine("A");
    }

    static void B()
    {
        Console.WriteLine("B");
    }
}

2. Run the app with windbg as the debugger.
3. When windbg initially breaks in, run the command "sxe ld coreclr" and continue.
4. When windbg stops again, run the command "!bpmd ConsoleApp23 Program.Main" and continue. Replace "ConsoleApp23" with whatever executable name your test app compiled as.
5. When the debugger stops at Program.Main, continue once more.

Expected behavior

The app should run to completion successfully.

Actual behavior

The app will crash with an AV. The AV callstack looks like this:

00 000000cd`4237e888 00007ffa`3551924c     0x00007ff9`db6b4720
01 000000cd`4237e890 00007ff9`db6b4b07     System_Private_CoreLib!System.MulticastDelegate.CombineImpl+0x5c
02 000000cd`4237e990 00007ffa`3b2959b3     ConsoleApp23!COM+_Entry_Point
...

The AV occurs because 0x00007ff9`db6b4720 points to zeroed memory.

00007ff9`db6b4720 0000            add     byte ptr [rax],al
00007ff9`db6b4722 0000            add     byte ptr [rax],al
00007ff9`db6b4724 0000            add     byte ptr [rax],al
00007ff9`db6b4726 0000            add     byte ptr [rax],al
00007ff9`db6b4728 0000            add     byte ptr [rax],al

Regression?

Unknown, but I wouldn't be surprised if the issue was introduced when WriteXorExecute started using double mapped memory.

Known Workarounds

I assume disabling the W^X feature would do it, but I haven't verified that specifically. Debugging is possible if you carefully avoid placing any breakpoints in the RX memory regions that are double mapped.

Configuration

I reproed this on 9.0.0-preview.7.24405.7, x64, Windows

Other information

I did some debugging into it and here is what I've seen so far.

If you don't set the breakpoint on Main(), the code that was zeroed and triggered the AV instead contains assembly that looks like this:

0000017f`fd6a4720 488bd1          mov     rdx,rcx
0000017f`fd6a4723 48b990228ce1f97f0000 mov rcx,7FF9E18C2290h
0000017f`fd6a472d e9b6cdf6ff      jmp     0000017f`fd6114e8

This assembly code is generated by DynamicHelpers::CreateHelperArgMove:

00 00000071`7d17e3b8 00007ffa`41460705     coreclr!DynamicHelpers::CreateHelperArgMove
01 00000071`7d17e3c0 00007ffa`4145fdb0     coreclr!DynamicHelperFixup+0x6f9
02 00000071`7d17e7c0 00007ffa`41536e7a     coreclr!DynamicHelperWorker+0x130
03 00000071`7d17e8c0 00007ffa`3551924c     coreclr!DelayLoad_Helper_Obj+0x7a
04 00000071`7d17e980 00007ff9`e1944b06     System_Private_CoreLib+0x29924c
05 00000071`7d17ea80 00007ffa`415359b3     ConsoleApp23!ConsoleApp23.Program.Main+0x126

PCODE DynamicHelpers::CreateHelperArgMove(LoaderAllocator * pAllocator, TADDR arg, PCODE target)
{
    BEGIN_DYNAMIC_HELPER_EMIT(18);

#ifdef UNIX_AMD64_ABI
    *p++ = 0x48; // mov rsi, rdi
    *(UINT16 *)p = 0xF78B;
#else
    *p++ = 0x48; // mov rdx, rcx
    *(UINT16 *)p = 0xD18B;
#endif
    p += 2;

#ifdef UNIX_AMD64_ABI
    *(UINT16 *)p = 0xBF48; // mov rdi, XXXXXX
#else
    *(UINT16 *)p = 0xB948; // mov rcx, XXXXXX
#endif
    p += 2;
    *(TADDR *)p = arg;
    p += 8;

    *p++ = X86_INSTR_JMP_REL32; // jmp rel32
    *(INT32 *)p = rel32UsingJumpStub((INT32 *)(p + rxOffset), target, NULL, pAllocator);
    p += 4;

    END_DYNAMIC_HELPER_EMIT();
}
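The byte layout emitted by the non-Unix path above can be sketched as a standalone function. This is only an illustration: `EmitArgMoveStub` and its parameters are hypothetical names, not runtime APIs, it assumes a little-endian host, and it simply fails when the target is out of rel32 range, whereas the runtime's `rel32UsingJumpStub` can fall back to a jump stub. The point it shows is that the bytes are written through one pointer while the jump displacement is computed against the address the stub will execute from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a runtime API): writes the 18-byte Windows x64 stub
// into `writeTo`, computing the jump displacement against `executeAddress`,
// the address at which the stub will actually run.
// Returns false if `target` is not reachable with a rel32 jump.
bool EmitArgMoveStub(uint8_t* writeTo, uint64_t executeAddress, uint64_t arg, uint64_t target)
{
    uint8_t* p = writeTo;

    // mov rdx, rcx
    *p++ = 0x48;
    *p++ = 0x8B;
    *p++ = 0xD1;

    // mov rcx, imm64 (memcpy avoids an unaligned store)
    *p++ = 0x48;
    *p++ = 0xB9;
    std::memcpy(p, &arg, sizeof(arg));
    p += sizeof(arg);

    // jmp rel32: the displacement is relative to the end of the instruction
    *p++ = 0xE9;
    const uint64_t nextInstruction = executeAddress + static_cast<uint64_t>(p - writeTo) + 4;
    const int64_t delta = static_cast<int64_t>(target - nextInstruction);
    if (delta < INT32_MIN || delta > INT32_MAX)
        return false;
    const int32_t rel32 = static_cast<int32_t>(delta);
    std::memcpy(p, &rel32, sizeof(rel32));
    p += sizeof(rel32);

    assert(p - writeTo == 18); // matches BEGIN_DYNAMIC_HELPER_EMIT(18)
    return true;
}
```

Calling it with the address, argument, and jump target from the disassembly above (0x0000017ffd6a4720, 0x7FF9E18C2290, 0x0000017ffd6114e8) yields the same 18 bytes, ending in `e9 b6 cd f6 ff`.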

After generating the code, execution returns back up to DelayLoad_Helper_Obj, which tailcalls the code it just created. If the breakpoint on Main() wasn't set, the memory will be correct; if the breakpoint was set, it will be zeroed instead.

When CreateHelperArgMove() writes the code bytes into memory, it writes them using a pointer into a memory region with ReadWrite permissions. This memory is supposed to be double mapped, causing the same assembly bytes to appear at a different address that has ReadExecute permissions. The tailcall invokes the RX address.
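For readers unfamiliar with double mapping, here is a minimal sketch of the idea using POSIX `mmap` rather than the Windows file-mapping APIs the runtime uses. Everything in it is an assumption for illustration: `DoubleMapping`, `CreateDoubleMapping`, and `WriteIsMirrored` are hypothetical names, the backing store is an unlinked temporary file under /tmp, and the mirror view is mapped read-only instead of read-execute so the sketch also works on mounts that forbid executable mappings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper type: two views of the same backing file.
struct DoubleMapping
{
    volatile uint8_t* rw = nullptr; // writable view, where code bytes are emitted
    volatile uint8_t* rx = nullptr; // mirror view (read-only here; RX in the runtime)
    size_t size = 0;
};

// Maps an unlinked temporary file twice with MAP_SHARED so that both views
// are backed by the same pages. Returns false on any failure.
bool CreateDoubleMapping(size_t size, DoubleMapping* out)
{
    char path[] = "/tmp/doublemapXXXXXX";
    const int fd = mkstemp(path);
    if (fd < 0)
        return false;
    unlink(path); // the file now lives only through the descriptor and the views

    if (ftruncate(fd, static_cast<off_t>(size)) != 0)
    {
        close(fd);
        return false;
    }

    void* rw = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* rx = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); // existing mappings stay valid after the descriptor is closed

    if (rw == MAP_FAILED || rx == MAP_FAILED)
    {
        if (rw != MAP_FAILED) munmap(rw, size);
        if (rx != MAP_FAILED) munmap(rx, size);
        return false;
    }

    out->rw = static_cast<volatile uint8_t*>(rw);
    out->rx = static_cast<volatile uint8_t*>(rx);
    out->size = size;
    return true;
}

// Writes one byte through the RW view and reports whether the mirror view sees it.
bool WriteIsMirrored(const DoubleMapping& m, size_t offset, uint8_t value)
{
    assert(offset < m.size);
    m.rw[offset] = value;
    return m.rx[offset] == value;
}
```

With an intact double mapping, `WriteIsMirrored` returns true even though the two views live at different addresses. The bug described here is the case where the equivalent check on the runtime's RW and RX views fails.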

I hypothesized that setting the debugger breakpoint somehow causes the RX page to become unmapped from the RW page and did the following experiment to validate that:

Run the repro again, but at step (4) use bpmd to set a breakpoint at CombineImpl instead:
```
!bpmd System.Private.CoreLib.dll System.MulticastDelegate.CombineImpl
```
Continue until the CombineImpl breakpoint is hit. Now set a breakpoint at coreclr!DynamicHelpers::CreateHelperArgMove and run to it.
```
bp coreclr!DynamicHelpers::CreateHelperArgMove
```

In CreateHelperArgMove, step through the code into FindRWBlock and note the RW <-> RX mapping:

00 (Inline Function) --------`--------     coreclr!ExecutableAllocator::FindRWBlock+0x28
01 00000032`e27bde50 00007ffa`4145dd54     coreclr!ExecutableAllocator::MapRW+0x7b
02 (Inline Function) --------`--------     coreclr!ExecutableWriterHolderNoLog<unsigned char>::{ctor}+0x1a
03 00000032`e27bdec0 00007ffa`41460705     coreclr!DynamicHelpers::CreateHelperArgMove+0x60

0:000> ?? pBlock
struct ExecutableAllocator::BlockRW * 0x00000216`b12b52d0
+0x000 next             : (null) 
+0x008 baseRW           : 0x00000216`b2bb0000 Void
+0x010 baseRX           : 0x00007ff9`e1940000 Void
+0x018 size             : 0x10000
+0x020 refCount         : 1

Continue stepping back out to DynamicHelpers::CreateHelperArgMove and step through a portion of generating the assembly.
Run !bpmd Program.Main to make windbg set the breakpoint at Main(). Notice that the address of Main() falls in the same RX region used for the assembly stub:
```
Setting breakpoint: bp 00007FF9E1944A0B [ConsoleApp23.Program.Main()]
```
Continue stepping in DynamicHelpers::CreateHelperArgMove to generate the rest of the assembly.
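The address arithmetic behind the observation in the steps above can be sketched as follows. The `BlockRW` struct here is a simplified stand-in holding only the three fields from the debugger dump, the helper names are hypothetical, and the page check assumes 4 KB pages.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the fields of ExecutableAllocator::BlockRW shown in the dump.
struct BlockRW
{
    uint64_t baseRW;
    uint64_t baseRX;
    uint64_t size;
};

// True if `address` lies inside the RX view of the block.
bool ContainsRX(const BlockRW& block, uint64_t address)
{
    return address >= block.baseRX && address - block.baseRX < block.size;
}

// Translates an address in the RW view to the matching address in the RX view.
uint64_t RWToRX(const BlockRW& block, uint64_t rwAddress)
{
    assert(rwAddress >= block.baseRW && rwAddress - block.baseRW < block.size);
    return block.baseRX + (rwAddress - block.baseRW);
}

// True if both addresses fall in the same page (4 KB pages assumed).
bool SamePage(uint64_t a, uint64_t b)
{
    const uint64_t pageMask = ~static_cast<uint64_t>(0xFFF);
    return (a & pageMask) == (b & pageMask);
}
```

With the values from this run (baseRW 0x00000216b2bb0000, baseRX 0x00007ff9e1940000, size 0x10000), the breakpoint address 0x00007FF9E1944A0B is inside the RX view, the stub written at RW address 0x00000216b2bb4720 maps to RX address 0x00007ff9e1944720, and the two RX addresses share a page.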

Now you can observe in the disassembly view that the RW portion of memory contains all three instructions:

00000216`b2bb4720 488bd1          mov     rdx,rcx
00000216`b2bb4723 48b9902259fbf97f0000 mov rcx,7FF9FB592290h
00000216`b2bb472d e9b6cdf6ff      jmp     000001be`cb4b14e8

But the RX view of the memory only contains the instructions that were written before the breakpoint at Main() was set:

0:000> u 0x00007ff9`fb614720
00007ff9`fb614720 488bd1          mov     rdx,rcx
00007ff9`fb614723 48b9902259fbf97f0000 mov rcx,7FF9FB592290h
00007ff9`fb61472d 0000            add     byte ptr [rax],al
00007ff9`fb61472f 0000            add     byte ptr [rax],al

noahfalk commented 2 months ago

@janvorli - have you seen this behavior before, or do you have any thoughts on what the underlying cause might be? I assume windbg (or VS) is calling WriteVirtualMemory to modify the memory region that is RX in-proc, but I would have expected the write to get mirrored through the mapping rather than breaking the mapping.

noahfalk commented 2 months ago

fyi @mikelle-rogers @dotnet/dotnet-diag

dotnet-policy-service[bot] commented 2 months ago

Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.

janvorli commented 2 months ago

Thanks Noah for the first level of investigation! I'll take a look. I remember that when we first enabled W^X, WinDbg was unable to install breakpoints in the double mapped executable memory at all, because their code refused to do that in file mapped memory (paging file backed, in our case). My gut feeling is that the cause of the current issue is that they somehow fixed it by changing the mapping so that it is no longer file mapped, which breaks the double mapping. But let me investigate it further.

janvorli commented 2 months ago

@noahfalk It seems the problem is fully on the WinDbg side. I have also disabled caching of RW mappings in the runtime, so each time a RW mapping is requested for an existing RX address, a new call to this function is made, so the OS provides a brand new mapping: https://github.com/dotnet/runtime/blob/54280788590d6012758302d4056aa45720133be2/src/coreclr/minipal/Windows/doublemapping.cpp#L200-L207 The offset it uses is the same offset that the RX mapping is using and yet the changes made through the new RW mapping are not visible through the RX one.

I am not sure what WinDbg uses to set the breakpoint, so I don't have any idea yet how it could break the double mapping. I wonder if that could somehow turn on copy-on-write, and if that would end up allocating and mapping a different physical page for the region where the breakpoint was placed.

noahfalk commented 2 months ago

> @noahfalk It seems the problem is fully on the WinDbg side

Interesting. @mikelle-rogers mentioned to me that she saw the same behavior repro when using VS debugger but when I tried just now I couldn't get it to repro. Originally I assumed this was something generic any debugger was hitting because of that, but now I've got conflicting info. Let me reach out to some windbg folks to get their thoughts.

mikelle-rogers commented 1 month ago

@noahfalk, which version of Visual Studio were you using?

noahfalk commented 1 month ago

VS Version 17.12.0 Preview 1.0

noahfalk commented 1 month ago

After some further discussion and investigation, it looks like Windbg is changing the VM page to copy-on-write as part of writing a breakpoint. This disconnects the RX view of the page from the underlying memory mapping. The Windbg team is planning to adjust their breakpoint setting logic to detect these pages and avoid making them copy-on-write.
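The effect of a copy-on-write page on a mirrored view can be illustrated with a POSIX analogue. This is a sketch under stated assumptions, not a reproduction of what Windbg does on Windows: a `MAP_PRIVATE` view stands in for the RX view after it has been made copy-on-write, `RunCowDemo` and `CowDemoResult` are hypothetical names, and visibility of shared writes before the first private write is Linux behavior that POSIX leaves unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <sys/mman.h>
#include <unistd.h>

struct CowDemoResult
{
    bool mirroredBeforePrivateWrite = false;
    bool mirroredAfterPrivateWrite = false;
};

// Maps one page of an unlinked temporary file twice: a shared RW view and a
// private (copy-on-write) view. Returns false if setup fails.
bool RunCowDemo(CowDemoResult* result)
{
    assert(result != nullptr);
    const size_t size = static_cast<size_t>(sysconf(_SC_PAGESIZE));

    char path[] = "/tmp/cowdemoXXXXXX";
    const int fd = mkstemp(path);
    if (fd < 0)
        return false;
    unlink(path);
    if (ftruncate(fd, static_cast<off_t>(size)) != 0)
    {
        close(fd);
        return false;
    }

    void* rwMem = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* cowMem = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (rwMem == MAP_FAILED || cowMem == MAP_FAILED)
    {
        if (rwMem != MAP_FAILED) munmap(rwMem, size);
        if (cowMem != MAP_FAILED) munmap(cowMem, size);
        return false;
    }

    volatile uint8_t* rw = static_cast<volatile uint8_t*>(rwMem);
    volatile uint8_t* view = static_cast<volatile uint8_t*>(cowMem);

    // First part of the stub is emitted before any breakpoint is set.
    rw[0] = 0x48;
    result->mirroredBeforePrivateWrite = (view[0] == 0x48);

    // Breakpoint-like write through the private view: the page is copied and
    // the view no longer shares storage with the file.
    view[0x80] = 0xCC;

    // The rest of the stub is emitted afterwards and never reaches the view.
    rw[1] = 0x8B;
    result->mirroredAfterPrivateWrite = (view[1] == 0x8B);

    munmap(rwMem, size);
    munmap(cowMem, size);
    return true;
}
```

The second flag comes back false, which mirrors the symptom seen earlier in the thread: bytes written before the breakpoint are visible through the executable view, and bytes written afterwards are not.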
