dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.9k stars 4.63k forks source link

ExecutionEngineException when using step-by-step #63199

Closed Kuinox closed 2 years ago

Kuinox commented 2 years ago

Description

When trying to find the source of an "CancelledOperationException" being thrown in background in my app, I stumbled on a weird bug.
When I step in a specific place, it will throw a ExecutionEngineException, and if I try to continue the program execution it will freeze VS2022 (and I must kill it with taskmanager).
As I don't know where the issue come from, I created the issue here.

Reproduction Steps

Clone my repo https://github.com/Kuinox/CK-LogViewer/commit/ac8a94534f510e83c52401cb267a46bd873c94e4 and checkout on the commit linked (it was the develop branch when this issue was opened).

Run LogSampleGenerator until it hit the Debugger.Break().
Now due to this issue: https://github.com/dotnet/sdk/issues/1458, 3 pdb must be loaded by hand in VS, they are located in the nuget cache:
%userprofile%\.nuget\packages\ck.mqtt.abstractions\0.10.0\lib\net6.0 %userprofile%\.nuget\packages\ck.mqtt.common\0.10.0\lib\net6.0 %userprofile%\.nuget\packages\ck.mqtt.client\0.10.0\lib\net6.0 Now navigate to

CK.MQTT.Client.dll!CK.MQTT.SmallOutgoingApplicationMessageExtensions.PublishAsync(CK.MQTT.IMqtt3Client client, CK.Core.IActivityMonitor m, string topic, CK.MQTT.QualityOfService qos, bool retain, System.ReadOnlyMemory<byte> payload) Line 13    C#
CK.Monitoring.MQTT.dll!CK.Monitoring.MQTT.MQTT.DoHandleAsync(CK.Core.IActivityMonitor m, CK.Monitoring.IMulticastLogEntry entry) Line 29    C#

image Now trying to step in SendPacketAsync throw the ExecutionEngineException.
Pressing F5/F10 here lead to VS freezing.

Expected behavior

Being able to debug my program.

Actual behavior

An ExecutionEngineException is throwed.

Regression?

I don't know, new app/code.

Known Workarounds

No response

Configuration

.NET 6.0.1 Windows 11/10

Other information

No response

ghost commented 2 years ago

Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.

Issue Details
### Description When trying to find the source of an "CancelledOperationException" being thrown in background in my app, I stumbled on a weird bug. When I step in a specific place, it will throw a `ExecutionEngineException`, and if I try to continue the program execution it will freeze VS2022 (and I must kill it with taskmanager). As I don't know where the issue come from, I created the issue here. ### Reproduction Steps Clone my repo https://github.com/Kuinox/CK-LogViewer/commit/ac8a94534f510e83c52401cb267a46bd873c94e4 and checkout on the commit linked (it was the develop branch when this issue was opened). Run `LogSampleGenerator` until it hit the `Debugger.Break()`. Now due to this issue: https://github.com/dotnet/sdk/issues/1458, 3 pdb must be loaded by hand in VS, they are located in the nuget cache: `%userprofile%\.nuget\packages\ck.mqtt.abstractions\0.10.0\lib\net6.0` `%userprofile%\.nuget\packages\ck.mqtt.common\0.10.0\lib\net6.0` `%userprofile%\.nuget\packages\ck.mqtt.client\0.10.0\lib\net6.0` Now navigate to ``` CK.MQTT.Client.dll!CK.MQTT.SmallOutgoingApplicationMessageExtensions.PublishAsync(CK.MQTT.IMqtt3Client client, CK.Core.IActivityMonitor m, string topic, CK.MQTT.QualityOfService qos, bool retain, System.ReadOnlyMemory payload) Line 13 C# CK.Monitoring.MQTT.dll!CK.Monitoring.MQTT.MQTT.DoHandleAsync(CK.Core.IActivityMonitor m, CK.Monitoring.IMulticastLogEntry entry) Line 29 C# ``` ![image](https://user-images.githubusercontent.com/18743295/147698390-12aa9fa9-a190-4ea0-91d7-8c6d0e5ffdae.png) Now trying to step in `SendPacketAsync` throw the `ExecutionEngineException`. Pressing F5/F10 here lead to VS freezing. ### Expected behavior Being able to debug my program. ### Actual behavior An ExecutionEngineException is throwed. ### Regression? I don't know, new app/code. ### Known Workarounds _No response_ ### Configuration .NET 6.0.1 Windows 11/10 ### Other information _No response_
Author: Kuinox
Assignees: -
Labels: `area-Diagnostics-coreclr`, `untriaged`
Milestone: -
Kuinox commented 2 years ago

Tell me if you successfully reproduced it or if you want that I make a more minimal reproduction.

tommcdon commented 2 years ago

HI @Kuinox I'm not able to reproduce the issue. Under Debug->Options, I have Just My Code disabled and Suppress JIT optimization on module load enabled. Without those settings I was not able to step into the SendPacketAsync function. I am seeing that an InvalidOperationException is being thrown from that function: image If you are still able to reproduce the issue, would you mind capturing a dump and uploading it through a VS Feedback Ticket (to protect your privacy)? Once the feedback ticket is created, please paste a link to in in this Github issue and we'll be able to find it.

tommcdon commented 2 years ago

@Kuinox I hit send too soon above. To collect a dump, please choose Debug->Save Dump As in VS. image

Kuinox commented 2 years ago

I am seeing that an InvalidOperationException is being thrown from that function:

Oh sorry ! I forgot that it had to connect to a mosquitto server running on localhost :(.
Looks like I can't create a dump. image image

Kuinox commented 2 years ago

I created a dump using dotnet-dump cli tool and made a ticket on VS Feedback, sadly it looks like the ticket & dump file is public and I can't find a way to make it not public :/ https://developercommunity.visualstudio.com/t/ExecutionEngineException-when-using-step/1626148

danmoseley commented 2 years ago

@Kuinox I've sent out an urgent request internally to try to get it deleted. I'm not sure why it's public -- if I remember a correctly there's a checkbox that makes it public.

Kuinox commented 2 years ago

@Kuinox I've sent out an urgent request internally to try to get it deleted. I'm not sure why it's public -- if I remember a correctly there's a checkbox that makes it public.

Thanks ! I don't think there is anything important as it's not my work machine, but it's better be safe.

I was not sure it would be public, and didn't found any checkbox to make it public or private, in fact there is even a "secured" (what does it mean ?) indicator next to the form: image

image

tommcdon commented 2 years ago

Thanks @danmoseley! @Kuinox I've downloaded the dump so it's safe to delete the uploaded files.

Kuinox commented 2 years ago

Thanks @danmoseley! @Kuinox I've downloaded the dump so it's safe to delete the uploaded files.

Ok, I edited the ticket to remove the link to the uploaded dump.

tommcdon commented 2 years ago

@Kuinox Thanks again for the dump. This issue will require more investigation. Here's my rough notes so far. It looks like the debugger hit a breakpoint and it working to determine where the stub leads to, and in that process it hit an Access Violation (AV). AV's in the runtime are translated to ExecutionEngineException's. Is this failing deterministically for you? For us to understand this better we can provide you with an instrumented runtime that will provide better logging which should help us understand the root cause better.

0:035> dx -r1 ((ntdll!_EXCEPTION_RECORD *)0x3dc1b7b7f0)
((ntdll!_EXCEPTION_RECORD *)0x3dc1b7b7f0)                 : 0x3dc1b7b7f0 [Type: _EXCEPTION_RECORD *]
    [+0x000] ExceptionCode    : -1073741819 [Type: long]   <--- *** 0xc0000005  STATUS_ACCESS_VIOLATION ***
    [+0x004] ExceptionFlags   : 0x0 [Type: unsigned long]
    [+0x008] ExceptionRecord  : 0x0 [Type: _EXCEPTION_RECORD *]
    [+0x010] ExceptionAddress : 0x7ffb48118e04 [Type: void *]
    [+0x018] NumberParameters : 0x2 [Type: unsigned long]
    [+0x020] ExceptionInformation [Type: unsigned __int64 [15]]
0:035> .cxr 0x0000003d`c1b7b300
rax=0000003dc1b7ba40 rbx=0000003dc1b7bbd0 rcx=000001b0e484f1d0
rdx=0000000000000130 rsi=00007ffae9038da0 rdi=0000000000000000
rip=00007ffb48118e04 rsp=0000003dc1b7ba10 rbp=0000003dc1b7ba70
 r8=000001aad4e19b10  r9=00007ffae91229e8 r10=00007ffae9038da0
r11=00007ffae8fe8950 r12=0000000000000002 r13=0000000000000000
r14=00007ffa4f977cc5 r15=000001b0e45c5ae0
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
coreclr!ILStubManager::GetCOMTarget+0x74:
00007ffb`48118e04 498b5608        mov     rdx,qword ptr [r14+8] ds:00007ffa`4f977ccd=????????????????
:035> k
  *** Stack trace for last set context - .thread/.cxr resets it
 # Child-SP          RetAddr               Call Site
00 0000003d`c1b7ba10 00007ffb`4811963d     coreclr!ILStubManager::GetCOMTarget+0x74 [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1654] 
01 0000003d`c1b7ba90 00007ffb`480e7bb1     coreclr!ILStubManager::TraceManager+0x11d [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1797] 
02 0000003d`c1b7bac0 00007ffb`482295b0     coreclr!EEDbgInterfaceImpl::TraceManager+0x91 [D:\a\_work\1\s\src\coreclr\vm\eedbginterfaceimpl.cpp @ 1261] 
03 0000003d`c1b7bb70 00007ffb`48227216     coreclr!DebuggerStepper::TriggerPatch+0x160 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 7135] 
04 0000003d`c1b7d9e0 00007ffb`48225381     coreclr!DebuggerController::ScanForTriggers+0x1ba [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 2662] 
05 0000003d`c1b7da60 00007ffb`480cea65     coreclr!DebuggerController::DispatchPatchOrSingleStep+0xd1 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 2957] 
06 0000003d`c1b7db40 00007ffb`4803831e     coreclr!DebuggerController::DispatchNativeException+0x96739 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 4299] 
07 0000003d`c1b7db90 00007ffb`48017998     coreclr!Debugger::FirstChanceNativeException+0x5e [D:\a\_work\1\s\src\coreclr\debug\ee\debugger.cpp @ 5574] 
08 0000003d`c1b7dbe0 00007ffb`4801788d     coreclr!IsDebuggerFault+0x6c [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 6592] 
09 0000003d`c1b7dc20 00007ffb`480177d6     coreclr!CLRVectoredExceptionHandlerPhase2+0x91 [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 7361] 
0a 0000003d`c1b7dc80 00007ffb`480176da     coreclr!CLRVectoredExceptionHandler+0xc2 [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 7277] 
0b 0000003d`c1b7dd40 00007ffc`2247bfea     coreclr!CLRVectoredExceptionHandlerShim+0xea [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 8016] 
0c 0000003d`c1b7dd90 00007ffc`22435812     ntdll!RtlpCallVectoredHandlers+0x112 [minkernel\ntdll\vectxcpt.c @ 204] 
0d (Inline Function) --------`--------     ntdll!RtlCallVectoredExceptionHandlers+0xe [minkernel\ntdll\vectxcpt.c @ 358] 
0e 0000003d`c1b7de30 00007ffc`224a730e     ntdll!RtlDispatchException+0x62 [minkernel\ntos\rtl\amd64\exdsptch.c @ 390] 
0f 0000003d`c1b7e080 00007ffa`e9038da0     ntdll!KiUserExceptionDispatch+0x2e [minkernel\ntos\rtl\amd64\trampoln.asm @ 751] 
10 0000003d`c1b7e7b8 00007ffa`e9037c33     CK_MQTT_Client!ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket)
11 0000003d`c1b7e7c0 00007ffa`e9036fef     CK_MQTT_Client!CK.MQTT.SmallOutgoingApplicationMessageExtensions+<PublishAsync>d__0.MoveNext()+0x163

image

0:035> .frame 1
01 0000003d`c1b7ba90 00007ffb`480e7bb1     coreclr!ILStubManager::TraceManager+0x11d [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1797] 
0:035> dv
               this = 0x00000000`00000000
             thread = 0x000001aa`d49f0df0
              trace = 0x0000003d`c1b7bbd0
           pContext = <value unavailable>
           pRetAddr = 0x0000003d`c1b7bbb8
            pStubMD = 0x00007ffa`e91229e8
                arg = 0x130  <--- *** This argument should be a methoddesc and looks incorrect ***
             target = 0x0000003d`c1b7ba40
              pThis = 0x000001aa`d4e19b10
             stubIP = 0x00007ffa`e9038da0
      delegateCount = <value unavailable>
pbDelInvocationList = <value unavailable>
00007ffa`e9037bf9 488b4d90        mov     rcx,qword ptr [rbp-70h]
00007ffa`e9037bfd 48ba78a0fae8fa7f0000 mov rdx,7FFAE8FAA078h (MT: CK.MQTT.IMqtt3Client)
00007ffa`e9037c07 49b8082012e9fa7f0000 mov r8,7FFAE9122008h (MD: CK.MQTT.IMqtt3Client.SendPacketAsync[[System.Object, System.Private.CoreLib]](CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket))
00007ffa`e9037c11 e86a4bee5e      call    coreclr!JIT_VirtualFunctionPointer (00007ffb`47f1c780)
00007ffa`e9037c16 48894580        mov     qword ptr [rbp-80h],rax
00007ffa`e9037c1a 488d9568ffffff  lea     rdx,[rbp-98h]
00007ffa`e9037c21 488b4d90        mov     rcx,qword ptr [rbp-70h]
00007ffa`e9037c25 4c8b4588        mov     r8,qword ptr [rbp-78h]
00007ffa`e9037c29 4c8b4d98        mov     r9,qword ptr [rbp-68h]
00007ffa`e9037c2d 488b4580        mov     rax,qword ptr [rbp-80h]
00007ffa`e9037c31 ffd0            call    rax
Kuinox commented 2 years ago

Is this failing deterministically for you?

Yes more or less, when VS doesn't simply freeze while stepping into this code, it reproduce the exception everytime.

Kuinox commented 2 years ago

we can provide you with an instrumented runtime that will provide better logging which should help us understand the root cause better.

Yes, what should I do ?

tommcdon commented 2 years ago

@Kuinox This item is on our backlog but we don't have time right now to work on this. In the meantime, please try working around the issue by manually setting breakpoints on target methods. instead of stepping into the method causes the app/VS to freeze.

Kuinox commented 2 years ago

Ok ok, thanks for the update.

noahfalk commented 2 years ago

@davidwrighton @AaronRobinsonMSFT @jkoritzinsky - this bug appears to be a long-standing issue with IL stubs, I'm not sure who the owner of that is these days? The short version is that ILStubManager::TraceManager doesn't appear to correctly handle all the ILStubs that the runtime creates. A few years ago we encountered a very similar problem with a different type of ILStub that this method didn't handle so this appears to be a repeating problem.

Investigating the dump that @Kuinox sent us, this is the long version of what I think likely has happened:

  1. The debugger is initially stopped in the PublishAsync() method shown above and then we do a step-in operation.
  2. In order to do the step operation the debugger repeatedly predicts forward in the disassembly to determine where execution will go, sets a breakpoint, and then runs to that point. If the scenario had worked properly the debugger would eventually determine that it had executed far enough and break out of the loop, but we don't get that far in this case.
  3. While still in that loop the debugger executes to IP 00007ffae9037c31 and tries to determine where execution will go next.
// this is part of the disassembly for the source code line "return await client.SendPacketAsync<object>(...)
0:035> u 00007ffa`e9037c11
CK_MQTT_Client!CK.MQTT.SmallOutgoingApplicationMessageExtensions+<PublishAsync>d__0.MoveNext()+0x141:
00007ffa`e9037c11 e86a4bee5e      call    coreclr!JIT_VirtualFunctionPointer (00007ffb`47f1c780)
00007ffa`e9037c16 48894580        mov     qword ptr [rbp-80h],rax
00007ffa`e9037c1a 488d9568ffffff  lea     rdx,[rbp-98h]
00007ffa`e9037c21 488b4d90        mov     rcx,qword ptr [rbp-70h]
00007ffa`e9037c25 4c8b4588        mov     r8,qword ptr [rbp-78h]
00007ffa`e9037c29 4c8b4d98        mov     r9,qword ptr [rbp-68h]
00007ffa`e9037c2d 488b4580        mov     rax,qword ptr [rbp-80h]
00007ffa`e9037c31 ffd0            call    rax                  // <--------------- execution reached here

That call at the end will lead to executing the jitted code for the IL instantiating stub for SendPacketAsync(...)

  1. The debugger observes the call opcode and knows the current value of rax, so it invokes TrapStepInHelper to try figuring out where execution will go to and set a breakpoint. This calls TrapStepInto, which calls into StubManager::TraceStub(). The IP in rax points to 00007ffae9038da0, an IL instantiating stub:

    0:035> !ip2md 00007ffa`e9038da0
    MethodDesc:   00007ffae91229e8
    Method Name:          ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket)
    Class:                00007ffae8fc03a8
    MethodTable:          00007ffae8fadca8
    mdToken:              0000000006000000
    Module:               00007ffae8fa8180
    IsJitted:             yes
    Current CodeAddr:     00007ffae9038da0
    Version History:
    ILCodeVersion:      0000000000000000
    ReJIT ID:           0
    IL Addr:            0000000000000000
     CodeAddr:           00007ffae9038da0  (MinOptJitted)
     NativeCodeVersion:  0000000000000000
  2. StubManager::TraceStub() determines that the target address belongs to the ILStubManager because ILStubManager::CheckIsStub_Internal claims all addresses that map to MethodDescs where MethodDesc::IsILStub() is TRUE.

  3. StubManager::TraceStub then asks ILStubManager to predict where the stub will execute to by calling ILStubManager::DoTraceStub(). The implementation of this method returns something called "ManagerPush" with the stub start address (00007ffae9038da0). This is intended for cases where the stub manager can't predict the destination of the stub given its IP alone. Instead the StubManager is requesting that the debugger set a breakpoint at the address it returned, then call back to the stub manager later once execution has reached that address. At that point the debugger will be able to provide a complete CONTEXT rather than the IP alone.

  4. StubManager code returns back to the debugger, the debugger sets the breakpoint at 00007ffae9038da0 as requested, and returns back through the vectored exception handler which ultimately resumes execution in the jitted user code at 00007ffae9037c31:

    00007ffa`e9037c31 ffd0            call    rax
  5. The processor executes the call instruction, immediately arriving at 00007ffae9038da0 and triggering the breakpoint placed by the debugger. This generates an exception that calls back into vectored exception handler and works it way back to the debugger code.

    0:035> u
    CK_MQTT_Client!ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket):
    00007ffa`e9038da0 cc              int     3     // <--------------- execution has now advanced to here
    00007ffa`e9038da1 4883ec30        sub     rsp,30h
    00007ffa`e9038da5 488d6c2430      lea     rbp,[rsp+30h]
    00007ffa`e9038daa 48894d10        mov     qword ptr [rbp+10h],rcx
    00007ffa`e9038dae 48895518        mov     qword ptr [rbp+18h],rdx
    00007ffa`e9038db2 4c894520        mov     qword ptr [rbp+20h],r8
    00007ffa`e9038db6 4c894d28        mov     qword ptr [rbp+28h],r9
    00007ffa`e9038dba 48b9787603e9fa7f0000 mov rcx,offset CLRStub[MethodDescPrestub]@7ffae9037678 (00007ffa`e9037678)
  6. The debugger recognizes that it placed this breakpoint at the request of the ILStubManager, and now that it has reached the requested address it invokes back to the ILStubManager again wanting a prediction on where execution will go. This time the debugger provides a complete CONTEXT:

0:035> k
...
22 0000003d`c1b7ba90 00007ffb`480e7bb1     coreclr!ILStubManager::TraceManager+0x11d [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1797] 
23 0000003d`c1b7bac0 00007ffb`482295b0     coreclr!EEDbgInterfaceImpl::TraceManager+0x91 [D:\a\_work\1\s\src\coreclr\vm\eedbginterfaceimpl.cpp @ 1261] 
24 0000003d`c1b7bb70 00007ffb`48227216     coreclr!DebuggerStepper::TriggerPatch+0x160 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 7135] 
25 0000003d`c1b7d9e0 00007ffb`48225381     coreclr!DebuggerController::ScanForTriggers+0x1ba [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 2662] 
26 0000003d`c1b7da60 00007ffb`480cea65     coreclr!DebuggerController::DispatchPatchOrSingleStep+0xd1 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 2957] 
27 0000003d`c1b7db40 00007ffb`4803831e     coreclr!DebuggerController::DispatchNativeException+0x96739 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 4299] 
28 0000003d`c1b7db90 00007ffb`48017998     coreclr!Debugger::FirstChanceNativeException+0x5e [D:\a\_work\1\s\src\coreclr\debug\ee\debugger.cpp @ 5574] 
29 0000003d`c1b7dbe0 00007ffb`4801788d     coreclr!IsDebuggerFault+0x6c [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 6592] 
2a 0000003d`c1b7dc20 00007ffb`480177d6     coreclr!CLRVectoredExceptionHandlerPhase2+0x91 [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 7361] 
2b 0000003d`c1b7dc80 00007ffb`480176da     coreclr!CLRVectoredExceptionHandler+0xc2 [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 7277] 
2c 0000003d`c1b7dd40 00007ffc`2247bfea     coreclr!CLRVectoredExceptionHandlerShim+0xea [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 8016] 
2d 0000003d`c1b7dd90 00007ffc`22435812     ntdll!RtlpCallVectoredHandlers+0x112 [minkernel\ntdll\vectxcpt.c @ 204] 
2e (Inline Function) --------`--------     ntdll!RtlCallVectoredExceptionHandlers+0xe [minkernel\ntdll\vectxcpt.c @ 358] 
2f 0000003d`c1b7de30 00007ffc`224a730e     ntdll!RtlDispatchException+0x62 [minkernel\ntos\rtl\amd64\exdsptch.c @ 390] 
30 0000003d`c1b7e080 00007ffa`e9038da0     ntdll!KiUserExceptionDispatch+0x2e [minkernel\ntos\rtl\amd64\trampoln.asm @ 751] 
31 0000003d`c1b7e7b8 00007ffa`e9037c33     CK_MQTT_Client!ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket)

// switching to frame 0x23 and showing the CONTEXT provided to the ILStubManager
0:035> dx Debugger.Sessions[0].Processes[28316].Threads[23536].Stack.Frames[35].SwitchTo();dv /t /v
0:035> ?? context
struct _CONTEXT * 0x0000003d`c1b7e080
   +0x000 P1Home           : 0x4545470e
   +0x008 P2Home           : 0x00007ffc`22435249
   +0x010 P3Home           : 0x000001aa`d4e18dc0
   +0x018 P4Home           : 0x0000003d`c1b80000
   +0x020 P5Home           : 0x0000003d`c1b7ea28
   +0x028 P6Home           : 0x0000003d`c1b7ead8
   +0x030 ContextFlags     : 0x10005f
   +0x034 MxCsr            : 0x1f80
   +0x038 SegCs            : 0x33
   +0x03a SegDs            : 0x2b
   +0x03c SegEs            : 0x2b
   +0x03e SegFs            : 0x53
   +0x040 SegGs            : 0x2b
   +0x042 SegSs            : 0x2b
   +0x044 EFlags           : 0x202
   +0x048 Dr0              : 0
   +0x050 Dr1              : 0
   +0x058 Dr2              : 0
   +0x060 Dr3              : 0
   +0x068 Dr6              : 0
   +0x070 Dr7              : 0
   +0x078 Rax              : 0x00007ffa`e9037688
   +0x080 Rcx              : 0x000001aa`d4e19b10
   +0x088 Rdx              : 0x0000003d`c1b7e858
   +0x090 Rbx              : 0x000001aa`d4e18dc0
   +0x098 Rsp              : 0x0000003d`c1b7e7b8
   +0x0a0 Rbp              : 0x0000003d`c1b7e8f0
   +0x0a8 Rsi              : 0x0000003d`c1b7ead8
   +0x0b0 Rdi              : 0x0000003d`c1b7ea28
   +0x0b8 R8               : 0x000001aa`d4e183e8
   +0x0c0 R9               : 0x000001ad`d4bc72e8
   +0x0c8 R10              : 0x000001b0`e462f170
   +0x0d0 R11              : 0x4f977cc5
   +0x0d8 R12              : 0x000001aa`d4e182e8
   +0x0e0 R13              : 0x00007ffa`e8474928
   +0x0e8 R14              : 1
   +0x0f0 R15              : 0x4545470e
   +0x0f8 Rip              : 0x00007ffa`e9038da0
   +0x100 FltSave          : _XSAVE_FORMAT
   +0x100 Header           : [2] _M128A
   +0x120 Legacy           : [8] _M128A
   +0x1a0 Xmm0             : _M128A
   +0x1b0 Xmm1             : _M128A
   +0x1c0 Xmm2             : _M128A
   +0x1d0 Xmm3             : _M128A
   +0x1e0 Xmm4             : _M128A
   +0x1f0 Xmm5             : _M128A
   +0x200 Xmm6             : _M128A
   +0x210 Xmm7             : _M128A
   +0x220 Xmm8             : _M128A
   +0x230 Xmm9             : _M128A
   +0x240 Xmm10            : _M128A
   +0x250 Xmm11            : _M128A
   +0x260 Xmm12            : _M128A
   +0x270 Xmm13            : _M128A
   +0x280 Xmm14            : _M128A
   +0x290 Xmm15            : _M128A
   +0x300 VectorRegister   : [26] _M128A
   +0x4a0 VectorControl    : 0x00000160`000000f0
   +0x4a8 DebugControl     : 0
   +0x4b0 LastBranchToRip  : 0
   +0x4b8 LastBranchFromRip : 0
   +0x4c0 LastExceptionToRip : 0
   +0x4c8 LastExceptionFromRip : 0
  1. ILStubManager::TraceManager incorrectly classifies the stub as being either a p/invoke or a CLR-to-COM call because it is the default case of a large if/else if chain:
    if(pStubMD->IsMulticastStub()) { ... }
    else if(pStubMD->IsReverseStub()) { ... }
    else if(pStubMD->IsDelegateStub()) { ... }
    else if(pStubMD->IsStubCALLIStub()) { ... }
    else if(pStubMD->IsStructMarshalStub() { ... }
    else 
    {
    // This case is only intended to handle p/invoke and CLR-to-COM call, however no earlier condition matched
    // IL instantiated stub so execution enters here.
    }

I'm pretty sure the bug is right here in the incorrect classification, but execution continues a little further doing calculations based on garbage data...

  1. ILStubManager::TraceManager calculates arg, the hidden argument, by reading from R10 in the CONTEXT above: 0x000001b0e462f170. pMD = arg pCMD = pMD pCMD->m_pComPlusCallInfo = 00007ffa4f977cc5 As best I understand R10 is not part of the expected calling convention for invoking an instantiating stub so these computations are based on whatever arbitrary value happened to be in the register at the time.
0:035> ?? (ComPlusCallMethodDesc*)0x000001b0e462f170
class ComPlusCallMethodDesc * 0x000001b0`e462f170
   +0x000 m_wFlags3AndTokenRemainder : 0
   +0x002 m_chunkIndex     : 0 ''
   +0x003 m_bFlags2        : 0 ''
   +0x004 m_wSlotNumber    : 0
   +0x006 m_wFlags         : 0
   =00007ffb`482cb6c0 s_ClassificationSizeTable : [0]  "???"
   +0x008 m_pComPlusCallInfo : 0x00007ffa`4f977cc5 ComPlusCallInfo
  1. Inside ILStubManager::GetCOMTarget the runtime evaluates the expression pComPlusCallInfo->m_pInterfaceMT which is what ultimately triggers the AV that surfaces back to the developer as the ExecutionEngineException
0:035> ?? (ComPlusCallInfo*)0x00007ffa4f977cc5
struct ComPlusCallInfo * 0x00007ffa`4f977cc5
   +0x000 m_pILStub        : ??
   +0x000 m_pEventProviderMD : ???? 
   +0x008 m_pInterfaceMT   : ???? 
   +0x010 m_flags          : ??
   +0x012 m_cachedComSlot  : ??
   +0x018 m_pStubMD        : PlainPointer<MethodDesc *>

The AV occurs at 00007ffb48118e04 which is dereferencing r14+8 = 0x00007ffa4f977cc5+8 = 0x00007ffa4f977ccd

0:035> u
coreclr!ILStubManager::GetCOMTarget+0x74 [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1654]:
00007ffb`48118e04 498b5608        mov     rdx,qword ptr [r14+8]
00007ffb`48118e08 488d4d20        lea     rcx,[rbp+20h]
00007ffb`48118e0c e8c7380c00      call    coreclr!ComObject::GetComIPFromRCWThrowing (00007ffb`481dc6d8)
00007ffb`48118e11 488bf8          mov     rdi,rax
00007ffb`48118e14 4885c0          test    rax,rax
00007ffb`48118e17 750a            jne     coreclr!ILStubManager::GetCOMTarget+0x93 (00007ffb`48118e23)
00007ffb`48118e19 33db            xor     ebx,ebx
00007ffb`48118e1b 48895dc0        mov     qword ptr [rbp-40h],rbx

0:035> dx Debugger.Sessions[0].Processes[28316].Threads[23536].Stack.Frames[24].SwitchTo()
0:035> ?? pExceptionInfo->ExceptionRecord->ExceptionInformation[1]
unsigned int64 0x00007ffa`4f977ccd
jkotas commented 2 years ago

Duplicate of https://github.com/dotnet/runtime/issues/38777 ? The proposed fix: https://github.com/dotnet/runtime/issues/38777#issuecomment-655488029

AaronRobinsonMSFT commented 2 years ago

/cc @dotnet/interop-contrib

noahfalk commented 2 years ago

Ah thanks! I had some recollection talking about this issue before, but I missed #38777 and found only #36248. My take is that each of these issues represents a distinct type of ILStub so they aren't quite duplicates, but a common solution that would cover all of them would be to ensure that StubManager::TraceManager handles all ILStub types produced by the runtime.

noahfalk commented 2 years ago

Here is a simplified repro case:

  1. Create a new console app project with this code in Visual Studio targetting .NET 6
    
    IClient client = new Client();
    client.SendPacketAsync<object>(1, 2, 3, 4);
    client.PublishAsync();

static class Extensions { public static void PublishAsync( this IClient client) { client.SendPacketAsync(1,2,3,4); // <----- set breakpoint here } }

interface IClient { public void SendPacketAsync(int arg1, int arg2, int arg3, int arg4); }

class Client : IClient { public void SendPacketAsync(int arg1, int arg2, int arg3, int arg4) { Console.WriteLine(typeof(T).Name); } }


2. Set breakpoint at the commented line
3. Turn off JMC
4. Run the program under the debugger (F5)
5. When the code reaches the breakpoint, step in

Expected result: execution pauses inside SendPacketAsync()
Actual result: VS shows a dialog for an unhandled ExecutionEngineException
Kuinox commented 2 years ago

Thanks !