dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.94k stars 4.64k forks source link

Access Violation on x86 #12113

Closed ayende closed 4 years ago

ayende commented 5 years ago

We have a scenario in which we are getting an access violation exception during a particular load in our system. I have a full dump of the process when the crash happened, available here: https://drive.google.com/file/d/11oqZaegxKcoNT8Xj1u9YDIcH7LcmMBsW/view?usp=sharing

The scenario we have is a few servers running and communicating with one another on the same process. This is part of our test setup. We recently started seeing hard failures, such as this one: (d2c.3390): Access violation - code c0000005 (first chance)

The event log reports:

Application: dotnet.exe
CoreCLR Version: 4.6.27317.3
Description: The process was terminated due to an internal error in the .NET Runtime at IP 5A29AA9B (5A230000) with exit code 80131506.

Faulting application name: dotnet.exe, version: 2.1.27130.1, time stamp: 0x5c007ea0
Faulting module name: coreclr.dll, version: 4.6.27317.3, time stamp: 0x5c40c18e
Exception code: 0xc0000005
Fault offset: 0x0006aa9b
Faulting process id: 0x52d4
Faulting application start time: 0x01d4cc3f7d83fa59
Faulting application path: C:\Program Files (x86)\dotnet\dotnet.exe
Faulting module path: C:\Users\ayende\.nuget\packages\runtime.win-x86.microsoft.netcore.app\2.1.8\runtimes\win-x86\native\coreclr.dll
Report Id: ff99edfe-baf1-431a-9ce3-06e987219e6c
Faulting package full name: 
Faulting package-relative application ID: 

This machine has the following hot fixes applies:

PS C:\Windows\SysWOW64> Get-HotFix

Source        Description      HotFixID      InstalledBy          InstalledOn
------        -----------      --------      -----------          -----------
OREN-PC       Update           KB4100347     NT AUTHORITY\SYSTEM  2/18/2019 12:00:00 AM
OREN-PC       Update           KB4343669     NT AUTHORITY\SYSTEM  7/11/2018 12:00:00 AM
OREN-PC       Update           KB4456655     NT AUTHORITY\SYSTEM  9/13/2018 12:00:00 AM
OREN-PC       Security Update  KB4465663     NT AUTHORITY\SYSTEM  11/14/2018 12:00:00 AM
OREN-PC       Security Update  KB4471331     NT AUTHORITY\SYSTEM  12/6/2018 12:00:00 AM
OREN-PC       Security Update  KB4477137     NT AUTHORITY\SYSTEM  12/12/2018 12:00:00 AM
OREN-PC       Security Update  KB4480979     NT AUTHORITY\SYSTEM  1/9/2019 12:00:00 AM
OREN-PC       Security Update  KB4485449     NT AUTHORITY\SYSTEM  2/19/2019 12:00:00 AM
OREN-PC       Security Update  KB4487038     NT AUTHORITY\SYSTEM  2/19/2019 12:00:00 AM
OREN-PC       Security Update  KB4480966     HRHINOS\Ayende       2/21/2019 12:00:00 AM
OREN-PC       Security Update  KB4487017

The actual stack we are seeing is always something similar to:

0:043> kp
 # ChildEBP RetAddr  
00 (Inline) -------- coreclr!VolatileLoad+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 153] 
01 (Inline) -------- coreclr!Volatile<unsigned long>::Load+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 292] 
02 (Inline) -------- coreclr!Volatile<unsigned long>::operator unsigned long+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 346] 
03 (Inline) -------- coreclr!RelativePointer<Module *>::GetValue+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 68] 
04 (Inline) -------- coreclr!RelativePointer<Module *>::GetValueAtPtr+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 85] 
05 (Inline) -------- coreclr!ReadPointer+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 954] 
06 (Inline) -------- coreclr!ReadPointer+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 954] 
07 (Inline) -------- coreclr!MethodTable::GetLoaderModule+0x3 [e:\a\_work\335\s\src\vm\methodtable.inl @ 176] 
08 (Inline) -------- coreclr!MethodTable::GetLoaderAllocator+0x3 [e:\a\_work\335\s\src\vm\methodtable.inl @ 182] 
09 0e60d0a0 5a299c42 coreclr!VirtualCallStubManager::ResolveWorker(struct StubCallSite * pCallSite = 0x0e60d13c, class Object ** protectedObj = 0x0e60d180, struct DispatchToken token = struct DispatchToken, VirtualCallStubManager::StubKind stubKind = SK_DISPATCH (0n2))+0x6b [e:\a\_work\335\s\src\vm\virtualcallstub.cpp @ 1719] 
0a 0e60d168 5a34a26b coreclr!VSD_ResolveWorker(struct TransitionBlock * pTransitionBlock = 0x0e60d17c, unsigned long siteAddrForRegisterIndirect = 0, unsigned int token = 0xa0000)+0x24f [e:\a\_work\335\s\src\vm\virtualcallstub.cpp @ 1611] 
0b 0e60d190 0e995fca coreclr!ResolveWorkerAsmStub(void)+0x1b [e:\a\_work\335\s\src\vm\i386\virtualcallstubcpu.hpp @ 525] 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0c 0e60d194 5a33afeb 0xe995fca
0d 0e60d1a8 5a47fc7f coreclr!CallJitEHFinallyHelper(void)+0x28 [E:\A\_work\335\s\src\vm\i386\asmhelpers.asm @ 390] 
0e 0e60d200 5a3a2acc coreclr!CallJitEHFinally(class CrawlFrame * pCf = 0x43a07fdc, unsigned char * startPC = <Value unavailable error>, struct EE_ILEXCEPTION_CLAUSE * EHClausePtr = 0x0e60d260, unsigned long nestingLevel = 0)+0xb8 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 3385] 
0f 0e60d2d0 5a249cf5 coreclr!COMPlusUnwindCallback+0x15a70c [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 2996] 
10 (Inline) -------- coreclr!Thread::MakeStackwalkerCallback+0x151 [e:\a\_work\335\s\src\vm\stackwalk.cpp @ 877] 
11 0e60d59c 5a2525c1 coreclr!Thread::StackWalkFramesEx(struct REGDISPLAY * pRD = 0x0e60d5c8, <function> * pCallback = 0x0ddb401c, void * pData = 0x0e60d954, unsigned int flags = 4, class Frame * pStartFrame = 0x00000000)+0x1d4 [e:\a\_work\335\s\src\vm\stackwalk.cpp @ 958] 
12 0e60d8d0 5a251f60 coreclr!Thread::StackWalkFrames(<function> * pCallback = 0x5a2483c0, void * pData = 0x0e60d954, unsigned int flags = 4, class Frame * pStartFrame = 0x00000000)+0xa1 [e:\a\_work\335\s\src\vm\stackwalk.cpp @ 1042] 
13 0e60d8f0 5a252a52 coreclr!UnwindFrames(class Thread * pThread = 0x0ddb3ea8, struct ThrowCallbackType * tct = 0x0e60d954)+0x3e [e:\a\_work\335\s\src\vm\excep.cpp @ 2228] 
14 (Inline) -------- coreclr!COMPlusAfterUnwind+0x98 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 482] 
15 0e60db48 5a2512fa coreclr!CPFH_RealFirstPassHandler(struct _EXCEPTION_RECORD * pExceptionRecord = 0x0e60dcb0, struct _EXCEPTION_REGISTRATION_RECORD * pEstablisherFrame = <Value unavailable error>, struct _CONTEXT * pContext = 0x0e60dd00, int bAsynchronousThreadStop = 0n0)+0x459 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 1263] 
16 0e60db88 5a250ef3 coreclr!CPFH_FirstPassHandler(struct _EXCEPTION_RECORD * pExceptionRecord = 0x0e60dcb0, struct _EXCEPTION_REGISTRATION_RECORD * pEstablisherFrame = 0x0e60e650, struct _CONTEXT * pContext = 0x0e60dd00)+0xc3 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 1401] 
17 0e60dbac 7772f1a2 coreclr!COMPlusFrameHandler(struct _EXCEPTION_RECORD * pExceptionRecord = 0x0e60dcb0, struct _EXCEPTION_REGISTRATION_RECORD * pEstablisherFrame = 0x0e60e650, struct _CONTEXT * pContext = 0x0e60dd00, struct _DISPATCHER_CONTEXT * pDispatcherContext = 0x0e60dc38)+0x83 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 1821] 
18 0e60dbd0 7772f174 ntdll!ExecuteHandler2+0x26
19 0e60dc98 7771cd86 ntdll!ExecuteHandler+0x24
1a 0e60dc98 76f21812 ntdll!KiUserExceptionDispatcher+0x26
1b 0e60e1bc 5a252d94 KERNELBASE!RaiseException+0x62
1c 0e60e264 5a3248fb coreclr!RaiseTheExceptionInternalOnly(class Object * throwable = <Value unavailable error>, int rethrow = <Value unavailable error>, int fForStackOverflow = 0n0)+0x11d [e:\a\_work\335\s\src\vm\excep.cpp @ 3039] 
1d 0e60e32c 2ed2b3e4 coreclr!IL_Throw(class Object * obj = <Value unavailable error>)+0x11b [e:\a\_work\335\s\src\vm\jithelpers.cpp @ 4860] 
1e 0e60e33c 0e3bd604 0x2ed2b3e4
1f 0e60e354 0e99a747 0xe3bd604
20 0e60e384 0e999f92 0xe99a747
21 0e60e3b0 0eecd3c9 0xe999f92
22 0e60e400 0e995ad1 0xeecd3c9
23 0e60e578 0e9934e6 0xe995ad1
24 0e60e5b0 0c3c7770 0xe9934e6
25 0e60e5d8 593868dd 0xc3c7770
26 0e60e5f4 59cfa31d System_Threading_Thread+0x68dd
27 0e60e624 59cfc0cc System_Private_CoreLib+0x4ca31d
28 0e60e638 5a33b0ef System_Private_CoreLib+0x4cc0cc
29 0e60e644 5a26fbf1 coreclr!CallDescrWorkerInternal(unsigned long pParams = 0xe60e6b8)+0x34 [E:\A\_work\335\s\src\vm\i386\asmhelpers.asm @ 618] 
2a (Inline) -------- coreclr!CallDescrWorkerWithHandler+0x52 [e:\a\_work\335\s\src\vm\callhelpers.cpp @ 78] 
2b 0e60e6e4 5a32d4e4 coreclr!MethodDescCallSite::CallTargetWorker(unsigned int64 * pArguments = 0x0e60e730, unsigned int64 * pReturnValue = 0x00000000, int cbReturnValue = 0n0)+0x235 [e:\a\_work\335\s\src\vm\callhelpers.cpp @ 620] 
2c 0e60e7bc 5a342b06 coreclr!ThreadNative::KickOffThread_Worker(void * ptr = 0x0e60e948)+0x104 [e:\a\_work\335\s\src\vm\comsynchronizable.cpp @ 260] 
2d 0e60e7d4 5a26f86a coreclr!ManagedThreadBase_DispatchInner(struct ManagedThreadCallState * pCallState = <Value unavailable error>)+0x70 [e:\a\_work\335\s\src\vm\threads.cpp @ 8852] 
2e 0e60e880 5a26f7bb coreclr!ManagedThreadBase_DispatchMiddle(struct ManagedThreadCallState * pCallState = <Value unavailable error>)+0x65 [e:\a\_work\335\s\src\vm\threads.cpp @ 8902] 
2f 0e60e8e4 5a3352b9 coreclr!ManagedThreadBase_DispatchOuter(struct ManagedThreadCallState * pCallState = 0x0e60e8ec)+0x78 [e:\a\_work\335\s\src\vm\threads.cpp @ 9161] 
30 0e60e908 5a2d9e7c coreclr!ManagedThreadBase_FullTransitionWithAD(struct ADID pAppDomain = struct ADID, <function> * pTarget = <Value unavailable error>, void * args = <Value unavailable error>, UnhandledExceptionLocation filterType = ManagedThread (0n2))+0x2f [e:\a\_work\335\s\src\vm\threads.cpp @ 9200] 
31 (Inline) -------- coreclr!ManagedThreadBase::KickOff+0x15 [e:\a\_work\335\s\src\vm\threads.cpp @ 9234] 
32 0e60e984 5a2d9d90 coreclr!ThreadNative::KickOffThread(void * pass = 0x0ad364d0)+0xcc [e:\a\_work\335\s\src\vm\comsynchronizable.cpp @ 380] 
33 0e60f824 76388484 coreclr!Thread::intermediateThreadProc(void * arg = 0x0abac020)+0x50 [e:\a\_work\335\s\src\vm\threads.cpp @ 2255] 
34 0e60f838 77713ab8 KERNEL32!BaseThreadInitThunk+0x24
35 0e60f880 77713a88 ntdll!__RtlUserThreadStart+0x2f
36 0e60f890 00000000 ntdll!_RtlUserThreadStart+0x1b

The managed stack, FWIW, is:

0:043> !clrstack
OS Thread Id: 0x3390 (43)
Child SP       IP Call Site
0e60d0d8 5a29aa9b [GCFrame: 0e60d0d8] 
0e60d118 5a29aa9b [StubDispatchFrame: 0e60d118] System.IDisposable.Dispose()
0e60d198 0e995fca Raven.Server.Rachis.FollowerAmbassador.Run()
0e60e580 0e9934e6 Raven.Server.Rachis.FollowerAmbassador.b__58_0(System.Object)
0e60e584 0c3c8000 Raven.Server.Utils.PoolOfThreads+PooledThread.DoWork()
0e60e5b8 0c3c7770 Raven.Server.Utils.PoolOfThreads+PooledThread.Run()
0e60e5e0 593868dd System.Threading.Thread.ThreadMain_ThreadStart() [E:\A\_work\321\s\corefx\src\System.Threading.Thread\src\System\Threading\Thread.cs @ 93]
0e60e5e8 59cfc00c System.Threading.ThreadHelper.ThreadStart_Context(System.Object) [E:\A\_work\335\s\src\mscorlib\src\System\Threading\Thread.cs @ 62]
0e60e5fc 59cfa31d System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [E:\A\_work\335\s\src\mscorlib\shared\System\Threading\ExecutionContext.cs @ 167]
0e60e630 59cfc0cc System.Threading.ThreadHelper.ThreadStart() [E:\A\_work\335\s\src\mscorlib\src\System\Threading\Thread.cs @ 91]
0e60e718 5a33b0ef [GCFrame: 0e60e718] 
0e60e8a8 5a33b0ef [DebuggerU2MCatchHandlerFrame: 0e60e8a8] 

We are using unsafe code, but we are pretty sure that we aren't corrupting the heap in any manner (lots of tests cover that) and if we were, I would expect to see the failure in different locations.

From trying to figure out what is going on, a few really strange things seem to be happening here:

Here is the actual failure:

FAULTING_IP: 
KERNELBASE!RaiseException+62
76f21812 8b4c2454        mov     ecx,dword ptr [esp+54h]

And the full register usage is:

(d2c.3390): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00e527a0 ecx=00000010 edx=15554140 esi=5a339280 edi=00e527a0
eip=5a29aa9b esp=0e60ce44 ebp=0e60d0a0 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246

As you can see, the esp value has a non null value, but checking the memory location with the offset provided to the instruction shows just zeros.

While troubleshooting this, we found a NullReferenceException in our code. We fixed it, but that made the problem go away. We suspect that this is some issue related to error handling inside the CoreCLR during JIT generation. We have run into a different issue with KB4487017 (See: https://github.com/dotnet/coreclr/issues/22597), but we are reproducing this on different versions of Windows and without the KB in question.

We aren't able to reproduce this issue in 64 bits.

ayende commented 5 years ago

We are also seeing something similar with this stack trace:

15ddd310 503a91a4 15ddd3ac 15ddd3f0 000a0000 coreclr!coreclr_shutdown_2+0x347bb
15ddd3d8 5043057b 15ddd3ec 00000000 000a0000 coreclr!coreclr_shutdown_2+0x33924
15ddd400 0fac80c5 504207bb 03469698 503cee20 coreclr!MetaDataGetDispenser+0x4753b
15dde8f8 0fac49e6 0d58d430 00000000 05715594 0xfac80c5
15dde930 0d58ceb0 05715594 00000000 00000000 0xfac49e6
15dde958 544e68dd 0efc6344 30aadc5c 0efc6344 0xd58ceb0
15dde974 30aabf6d 00000000 00000000 05715640 System_Threading_Thread!System.Threading.Thread.ThreadMain_ThreadStart()$##600001C+0x35
15dde9a4 30aadd1c 05715668 15dde9d0 12781560 System_Private_CoreLib!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)$##600288D+0x65
15dde9b8 504208bf 15ddeac8 15ddea64 5033a6d0 System_Private_CoreLib!System.Threading.ThreadHelper.ThreadStart()$##6002636+0x2c
15dde9c4 5033a6d0 15ddea38 15ddebf4 50387b70 coreclr!MetaDataGetDispenser+0x3787f
15ddea64 504cbcd2 15ddeab0 00000000 00000000 coreclr+0x2a6d0
15ddeb3c 5042826d 15ddecc8 104f5b50 15ddec6c coreclr!MetaDataGetDispenser+0xe2c92
15ddeb54 5033a3a5 f762de38 00000001 15ddec6c coreclr!MetaDataGetDispenser+0x3f22d
15ddec00 5033a2f6 f762de5c 00000001 12781560 coreclr+0x2a3a5
15ddec64 5040b09b 00000001 00000000 00000001 coreclr+0x2a2f6
15ddec88 50375c0c 00000001 00000002 f762df3c coreclr!MetaDataGetDispenser+0x2205b
15dded04 50375b20 127fab48 00000000 00000000 coreclr!coreclr_shutdown_2+0x38c
15ddfc9c 74ee8484 1273fc78 74ee8460 42821e2e coreclr!coreclr_shutdown_2+0x2a0
15ddfcb0 77ec3ab8 1273fc78 2d5961f5 00000000 KERNEL32!BaseThreadInitThunk+0x24
15ddfcf8 77ec3a88 ffffffff 77edf31d 00000000 ntdll!__RtlUserThreadStart+0x2f
15ddfd08 00000000 50375ad0 1273fc78 00000000 ntdll!_RtlUserThreadStart+0x1b
filipnavara commented 5 years ago

I see quite a different managed stack trace in the dump you provided:

00b7e400 7771ae8c [HelperMethodFrame_1OBJ: 00b7e400] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)
00b7e480 59db9604 System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken)
00b7e4d4 59d9062a System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken)
00b7e510 59d90500 System.Threading.Tasks.Task.InternalWaitCore(Int32, System.Threading.CancellationToken)
00b7e53c 59dc76d9 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
00b7e548 07011159 Tryouts.Program.(System.String[])

The unmanaged one is also quite different too and ends in:

 # ChildEBP RetAddr  Args to Child              
00 00b7e03c 76f15983 00000001 00e7ec30 00000001 ntdll!NtWaitForMultipleObjects+0xc
01 00b7e1d0 5a254f2f 00000001 00e7ec30 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x133
02 (Inline) -------- -------- -------- -------- coreclr!WaitForMultipleObjectsEx_SO_TOLERANT+0x13 [e:\a\_work\335\s\src\vm\threads.cpp @ 3462] 
03 00b7e1f0 5a254ca8 00000001 00e7ec30 00000000 coreclr!Thread::DoAppropriateAptStateWait+0x39 [e:\a\_work\335\s\src\vm\threads.cpp @ 3496] 
04 00b7e280 5a254a07 00000001 00e7ec30 00000000 coreclr!Thread::DoAppropriateWaitWorker+0x248 [e:\a\_work\335\s\src\vm\threads.cpp @ 3633] 
05 00b7e300 5a25497c 00000001 00e7ec30 00000000 coreclr!Thread::DoAppropriateWait+0x79 [e:\a\_work\335\s\src\vm\threads.cpp @ 3307] 
06 00b7e324 5a2d8efb ffffffff 00000001 00b7e3a8 coreclr!CLREventBase::WaitEx+0x39 [e:\a\_work\335\s\src\vm\synch.cpp @ 486] 
07 00b7e33c 5a254226 ffffffff 00000001 00b7e3a8 coreclr!CLREventBase::Wait+0x1a [e:\a\_work\335\s\src\vm\synch.cpp @ 427] 
08 (Inline) -------- -------- -------- -------- coreclr!Thread::Wait+0x19 [e:\a\_work\335\s\src\vm\threads.cpp @ 4113] 
09 (Inline) -------- -------- -------- -------- coreclr!Thread::Block+0x22 [e:\a\_work\335\s\src\vm\threads.cpp @ 4070] 
0a 00b7e3d8 5a254079 ffffffff 00e7eb18 2c559a49 coreclr!SyncBlock::Wait+0x173 [e:\a\_work\335\s\src\vm\syncblk.cpp @ 3376] 
0b (Inline) -------- -------- -------- -------- coreclr!ObjHeader::Wait+0x23 [e:\a\_work\335\s\src\vm\syncblk.cpp @ 2789] 
0c (Inline) -------- -------- -------- -------- coreclr!Object::Wait+0x23 [e:\a\_work\335\s\src\vm\object.h @ 391] 
0d 00b7e474 59db9604 0f28f38c 051867d0 051867bc coreclr!ObjectNative::WaitTimeout+0x89 [e:\a\_work\335\s\src\classlibnative\bcltype\objectnative.cpp @ 297] 
0e 00b7e4c8 59d9062a 00000000 0f28f378 051867bc System_Private_CoreLib+0x589604
0f 00b7e504 59d90500 00000000 00b7e5c0 04e38b60 System_Private_CoreLib+0x56062a
10 00b7e530 59dc76d9 00000000 00b7e560 00b7e548 System_Private_CoreLib+0x560500
11 00b7e540 07011159 00b7e554 5a33b0ef 00b7e610 System_Private_CoreLib+0x5976d9
ayende commented 5 years ago

Check thread 35, which is what WinDBG stopped on when I was doing live debugging for this.

0:000> ~35s
eax=00000000 ebx=00e527a0 ecx=00000010 edx=15554140 esi=5a339280 edi=00e527a0
eip=5a29aa9b esp=0e60ce44 ebp=0e60d0a0 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
coreclr!VolatileLoad+0x3 [inlined in coreclr!VirtualCallStubManager::ResolveWorker+0x6b]:
5a29aa9b 8b4114          mov     eax,dword ptr [ecx+14h] ds:002b:00000024=????????
0:035> k
 # ChildEBP RetAddr  
00 (Inline) -------- coreclr!VolatileLoad+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 153] 
01 (Inline) -------- coreclr!Volatile<unsigned long>::Load+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 292] 
02 (Inline) -------- coreclr!Volatile<unsigned long>::operator unsigned long+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 346] 
03 (Inline) -------- coreclr!RelativePointer<Module *>::GetValue+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 68] 
04 (Inline) -------- coreclr!RelativePointer<Module *>::GetValueAtPtr+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 85] 
05 (Inline) -------- coreclr!ReadPointer+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 954] 
06 (Inline) -------- coreclr!ReadPointer+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 954] 
07 (Inline) -------- coreclr!MethodTable::GetLoaderModule+0x3 [e:\a\_work\335\s\src\vm\methodtable.inl @ 176] 
08 (Inline) -------- coreclr!MethodTable::GetLoaderAllocator+0x3 [e:\a\_work\335\s\src\vm\methodtable.inl @ 182] 
09 0e60d0a0 5a299c42 coreclr!VirtualCallStubManager::ResolveWorker+0x6b [e:\a\_work\335\s\src\vm\virtualcallstub.cpp @ 1719] 
0a 0e60d168 5a34a26b coreclr!VSD_ResolveWorker+0x24f [e:\a\_work\335\s\src\vm\virtualcallstub.cpp @ 1611] 
0b 0e60d190 0e995fca coreclr!ResolveWorkerAsmStub+0x1b [e:\a\_work\335\s\src\vm\i386\virtualcallstubcpu.hpp @ 525] 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0c 0e60d194 5a33afeb 0xe995fca
0d 0e60d1a8 5a47fc7f coreclr!CallJitEHFinallyHelper+0x28 [E:\A\_work\335\s\src\vm\i386\asmhelpers.asm @ 390] 
0e 0e60d200 5a3a2acc coreclr!CallJitEHFinally+0xb8 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 3385] 
0f 0e60d2d0 5a249cf5 coreclr!COMPlusUnwindCallback+0x15a70c [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 2996] 
10 (Inline) -------- coreclr!Thread::MakeStackwalkerCallback+0x151 [e:\a\_work\335\s\src\vm\stackwalk.cpp @ 877] 
11 0e60d59c 5a2525c1 coreclr!Thread::StackWalkFramesEx+0x1d4 [e:\a\_work\335\s\src\vm\stackwalk.cpp @ 958] 
12 0e60d8d0 5a251f60 coreclr!Thread::StackWalkFrames+0xa1 [e:\a\_work\335\s\src\vm\stackwalk.cpp @ 1042] 
13 0e60d8f0 5a252a52 coreclr!UnwindFrames+0x3e [e:\a\_work\335\s\src\vm\excep.cpp @ 2228] 
14 (Inline) -------- coreclr!COMPlusAfterUnwind+0x98 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 482] 
15 0e60db48 5a2512fa coreclr!CPFH_RealFirstPassHandler+0x459 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 1263] 
16 0e60db88 5a250ef3 coreclr!CPFH_FirstPassHandler+0xc3 [e:\a\_work\335\s\src\vm\i386\excepx86.cpp @ 1401] 
filipnavara commented 5 years ago

Weird, usually the debugger switches me to correct thread after !analyze -v. Oh well, at least I see the same thing now.

benaadams commented 5 years ago

/cc @kouvel @janvorli

janvorli commented 5 years ago

I am taking a look.

janvorli commented 5 years ago

@ayende I assume you have used 64 bit windbg to open the dump, right? It you open it with the wow64 version of windbg, then the call stack shown is quite different. The frame with unknown IP address is gone and there is no EH. While it is strange that the call stack differs this way, the one showed with Wow64 (which is the right one to use for debugging 32 bit code) matches your managed stack dump, as you can see below. Looking at locals in VSD_ResolveWorker, I can see that the ECX in the TransitionBlock, which should be the "this" reference, points to a garbage. The first 4 bytes should be a MethodTable pointer, but it contains 0x10. And that is causing the crash, as we try to extract LoaderAllocator from that MethodTable. The caller of VSD_ResolveWorker, the ResolveWorkerAsmStub just stores the ECX passed in by the managed caller into the TransitionBlock. So it looks like a GC hole to me. I am looking into it further.

 # ChildEBP RetAddr  
00 (Inline) -------- coreclr!VolatileLoad+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 153] 
01 (Inline) -------- coreclr!Volatile<unsigned long>::Load+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 292] 
02 (Inline) -------- coreclr!Volatile<unsigned long>::operator unsigned long+0x3 [e:\a\_work\335\s\src\inc\volatile.h @ 346] 
03 (Inline) -------- coreclr!RelativePointer<Module *>::GetValue+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 68] 
04 (Inline) -------- coreclr!RelativePointer<Module *>::GetValueAtPtr+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 85] 
05 (Inline) -------- coreclr!ReadPointer+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 954] 
06 (Inline) -------- coreclr!ReadPointer+0x3 [e:\a\_work\335\s\src\inc\fixuppointer.h @ 954] 
07 (Inline) -------- coreclr!MethodTable::GetLoaderModule+0x3 [e:\a\_work\335\s\src\vm\methodtable.inl @ 176] 
08 (Inline) -------- coreclr!MethodTable::GetLoaderAllocator+0x3 [e:\a\_work\335\s\src\vm\methodtable.inl @ 182] 
09 0e60d0a0 5a299c42 coreclr!VirtualCallStubManager::ResolveWorker+0x6b [e:\a\_work\335\s\src\vm\virtualcallstub.cpp @ 1719] 
0a 0e60d168 5a34a26b coreclr!VSD_ResolveWorker+0x24f [e:\a\_work\335\s\src\vm\virtualcallstub.cpp @ 1611] 
0b 0e60d190 0e995fca coreclr!ResolveWorkerAsmStub+0x1b [e:\a\_work\335\s\src\vm\i386\virtualcallstubcpu.hpp @ 525] 
0c 0e60d190 0e9934e6 Raven_Server!Raven.Server.Rachis.FollowerAmbassador.Run()+0x11da
0d 0e60e5b0 0c3c8000 Raven_Server!Raven.Server.Rachis.FollowerAmbassador.<Start>b__58_0(System.Object)+0x6
0e 0e60e5b0 0c3c7770 Raven_Server!Raven.Server.Utils.PoolOfThreads+PooledThread.DoWork()+0xb8
0f 0e60e5d8 593868dd Raven_Server!Raven.Server.Utils.PoolOfThreads+PooledThread.Run()+0xc8
10 0e60e5f4 59cfc00c System_Threading_Thread!System.Threading.Thread.ThreadMain_ThreadStart()+0x35
11 0e60e5f4 59cfa31d System_Private_CoreLib!System.Threading.ThreadHelper.ThreadStart_Context(System.Object)+0x5c
12 0e60e624 59cfc0cc System_Private_CoreLib!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)+0x65
13 0e60e638 5a33b0ef System_Private_CoreLib!System.Threading.ThreadHelper.ThreadStart()+0x2c
14 0e60e644 5a26fbf1 coreclr!CallDescrWorkerInternal+0x34 [E:\A\_work\335\s\src\vm\i386\asmhelpers.asm @ 618] 
15 (Inline) -------- coreclr!CallDescrWorkerWithHandler+0x52 [e:\a\_work\335\s\src\vm\callhelpers.cpp @ 78] 
16 0e60e6e4 5a32d4e4 coreclr!MethodDescCallSite::CallTargetWorker+0x235 [e:\a\_work\335\s\src\vm\callhelpers.cpp @ 620] 
17 0e60e7bc 5a342b06 coreclr!ThreadNative::KickOffThread_Worker+0x104 [e:\a\_work\335\s\src\vm\comsynchronizable.cpp @ 260] 
18 0e60e7d4 5a26f86a coreclr!ManagedThreadBase_DispatchInner+0x70 [e:\a\_work\335\s\src\vm\threads.cpp @ 8852] 
19 0e60e880 5a26f7bb coreclr!ManagedThreadBase_DispatchMiddle+0x65 [e:\a\_work\335\s\src\vm\threads.cpp @ 8902] 
1a 0e60e8e4 5a3352b9 coreclr!ManagedThreadBase_DispatchOuter+0x78 [e:\a\_work\335\s\src\vm\threads.cpp @ 9161] 
1b 0e60e908 5a2d9e7c coreclr!ManagedThreadBase_FullTransitionWithAD+0x2f [e:\a\_work\335\s\src\vm\threads.cpp @ 9200] 
1c (Inline) -------- coreclr!ManagedThreadBase::KickOff+0x15 [e:\a\_work\335\s\src\vm\threads.cpp @ 9234] 
1d 0e60e984 5a2d9d90 coreclr!ThreadNative::KickOffThread+0xcc [e:\a\_work\335\s\src\vm\comsynchronizable.cpp @ 380] 
1e 0e60f824 76388484 coreclr!Thread::intermediateThreadProc+0x50 [e:\a\_work\335\s\src\vm\threads.cpp @ 2255] 
1f 0e60f838 77713ab8 kernel32!BaseThreadInitThunk+0x24 [base\win32\client\thread.c @ 64] 
20 0e60f880 77713a88 ntdll!__RtlUserThreadStart+0x2f [minkernel\ntdll\rtlstrt.c @ 998] 
21 0e60f890 00000000 ntdll!_RtlUserThreadStart+0x1b [minkernel\ntdll\rtlstrt.c @ 915]
ayende commented 5 years ago

If this helps, it is fairly easy to reproduce. The code is here: https://github.com/ravendb/ravendb with commit hash 6aa5833c5c6470f714d5432b10299a02d8d11639.

You can go to tests/Tryout folder and just execute dotnet run in x86 and it will fail in the first 100 iterations or so in most cases

janvorli commented 5 years ago

That's great, I was just about to ask if the code is open source.

janvorli commented 5 years ago

I've collected the following additional details:

The call at offset 0x11D4 - address 0e995fc4 (return address at offset 0x11DA - 0e995fca) is passed an object reference in ECX which is extracted from [ebp-134h] right before the call. The call site is in a catch handler and the [ebp-134h] is initialized before the respective try.

janvorli commented 5 years ago

@ayende I've tried to repro the issue, but the commit doesn't exist (I've found another one that was result of merging in the other commit) and so that's what I've checked out: https://github.com/ravendb/ravendb/pull/8784.

The dotnet run fails to build with:

obj\Debug\netcoreapp2.1\Raven.Server.AssemblyInfo.cs(10,47): error CS0234: The type or namespace name 'UserSecrets' does not exist in the namespace 'Microsoft.Extensions.Configuration' (are you missing an assembly reference?) [D:\git\ravendb\src\Raven.Server\Raven.Server.csproj]

The build failed. Please fix the build errors and run again.
ayende commented 5 years ago

Can you try f04980d83c3cf83e5b8ac4306ea1c15765df0237 ? The command I run is: & 'C:\Program Files (x86)\dotnet\dotnet.exe' run inside the test\Tryouts folder. You might need to run the build.ps1 folder from the root folder

janvorli commented 5 years ago

Somehow the 3.0 SDK was failing to build it even though it should be able to target older SDKs. Now I was able to build it successfuly, but running it fails:

Unhandled Exception: System.AggregateException: One or more errors occurred. (Failed to read license from 'D:\git\ravendb\test\Tryouts\bin\Debug\netcoreapp2.1\license.json' path.) ---> Raven.Client.Exceptions.Commercial.LicenseActivationException: Failed to read license from 'D:\git\ravendb\test\Tryouts\bin\Debug\netcoreapp2.1\license.json' path. ---> System.IO.FileNotFoundException: Could not find file 'D:\git\ravendb\test\Tryouts\bin\Debug\netcoreapp2.1\license.json'.
ayende commented 5 years ago

Sorry about that. Forgot that you need it. You can fix this by first running this command:

$env:RAVEN_License='{"Id":"933260e7-a7f0-47da-981d-bdd5662ce3fb","Name":"Testing McTester","Keys":["A+8f8x3pAiwfk2xmk4F7DBqMa","B6hyAebFwYY1PmFMG5UXlINTd","WY/rIk9ex/eTQUpqo4xMCP8SE","1Axf9wywplVih6qPyYHewBw8v","PU+l4DTpVcqHhM3qKOWO+mfvQ","NrjBU6ahBt9xjOU9tog1HDs51","xYhq2MAVWD/Gm6E/DJZShABUE","oBSYoSQMqKywtLi8wJzFDJEQJ","Yl1Q"]}'

Please note that I'm running this on CoreCLR 2.1.8, not 3.0

We haven't verified this issue on other versions of the runtime.

janvorli commented 5 years ago

Please note that I'm running this on CoreCLR 2.1.8, not 3.0

I know. But it should be possible to build using 3.0 targetting older versions - it should just honor the versions specified in the project files.

janvorli commented 5 years ago

Ok, I was able to start the test now, but it is getting one exception at the beginning and then some timeout exceptions:

0
        To attach debugger to test process (x86), use proc-id: 6996.
Max number of concurrent tests is: 4

Unhandled Exception: System.AggregateException: One or more errors occurred. (Tried to send 'DeleteDatabaseCommand' request via `DELETE /admin/databases` to all configured nodes in the topology, none of the attempt succeeded.
I was able to fetch  topology from http://127.0.0.1:51592.
Topology:
[Url: http://127.0.0.1:51592, ClusterTag: A, ServerRole: None, Exception: System.TimeoutException: Waited for 00:00:15 but didn't get index notification for 36. Last commit index is: 34. Number of errors is: 0.
Index: 2. Type: PutLicenseCommand. ExecutionTime: 00:00:00.0048047. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 464. Exception:
Index: 3. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.1255138. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 634. Exception:
Index: 5. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0029094. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 990. Exception:
Index: 8. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0011608. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 1350. Exception:
Index: 11. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0026476. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 1486. Exception:
Index: 14. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0168839. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 1701. Exception:
Index: 16. Type: AddDatabaseCommand. ExecutionTime: 00:00:00.0157038. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 2234. Exception:
Index: 18. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0054394. Term: 3. LeaderErrorCount: 0. LeaderShipDuration: 546. Exception:
Index: 20. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0011850. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 23. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0012422. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 24. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0014008. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 27. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0012933. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 29. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0014345. Term: 11. LeaderErrorCount: 0. LeaderShipDuration: 250. Exception:
Index: 30. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0015241. Term: 11. LeaderErrorCount: 0. LeaderShipDuration: 474. Exception:
Index: 32. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0013954. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 34. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0012890. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 36. Type: DeleteDatabaseCommand. ExecutionTime: 00:00:00.0016575. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:

   at Raven.Server.ServerWide.RachisLogIndexNotifications.ThrowTimeoutException(TimeSpan value, Int64 index, Int64 lastModifiedIndex) in D:\git\ravendb\src\Raven.Server\ServerWide\ClusterStateMachine.cs:line 1867
   at Raven.Server.ServerWide.RachisLogIndexNotifications.WaitForIndexNotification(Int64 index, TimeSpan timeout) in D:\git\ravendb\src\Raven.Server\ServerWide\ClusterStateMachine.cs:line 1851
   at Raven.Server.ServerWide.ClusterStateMachine.WaitForIndexNotification(Int64 index, Nullable`1 timeout) in D:\git\ravendb\src\Raven.Server\ServerWide\ClusterStateMachine.cs:line 677
   at Raven.Server.Web.System.AdminDatabasesHandler.Delete() in D:\git\ravendb\src\Raven.Server\Web\System\AdminDatabasesHandler.cs:line 628
   at Raven.Server.Routing.RequestRouter.HandlePath(RequestHandlerContext reqCtx) in D:\git\ravendb\src\Raven.Server\Routing\RequestRouter.cs:line 129
   at Raven.Server.RavenServerStartup.RequestHandler(HttpContext context) in D:\git\ravendb\src\Raven.Server\RavenServerStartup.cs:line 173
The server at /admin/databases responded with status code: RequestTimeout. -> System.TimeoutException: Waited for 00:00:15 but didn't get index notification for 36. Last commit index is: 34. Number of errors is: 0.
Index: 2. Type: PutLicenseCommand. ExecutionTime: 00:00:00.0048047. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 464. Exception:
Index: 3. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.1255138. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 634. Exception:
Index: 5. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0029094. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 990. Exception:
Index: 8. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0011608. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 1350. Exception:
Index: 11. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0026476. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 1486. Exception:
Index: 14. Type: PutLicenseLimitsCommand. ExecutionTime: 00:00:00.0168839. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 1701. Exception:
Index: 16. Type: AddDatabaseCommand. ExecutionTime: 00:00:00.0157038. Term: 1. LeaderErrorCount: 0. LeaderShipDuration: 2234. Exception:
Index: 18. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0054394. Term: 3. LeaderErrorCount: 0. LeaderShipDuration: 546. Exception:
Index: 20. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0011850. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 23. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0012422. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 24. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0014008. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 27. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0012933. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 29. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0014345. Term: 11. LeaderErrorCount: 0. LeaderShipDuration: 250. Exception:
Index: 30. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0015241. Term: 11. LeaderErrorCount: 0. LeaderShipDuration: 474. Exception:
Index: 32. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0013954. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 34. Type: ClusterTransactionCommand. ExecutionTime: 00:00:00.0012890. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:
Index: 36. Type: DeleteDatabaseCommand. ExecutionTime: 00:00:00.0016575. Term: . LeaderErrorCount: . LeaderShipDuration: . Exception:

   at Raven.Server.ServerWide.RachisLogIndexNotifications.ThrowTimeoutException(TimeSpan value, Int64 index, Int64 lastModifiedIndex) in D:\git\ravendb\src\Raven.Server\ServerWide\ClusterStateMachine.cs:line 1867
   at Raven.Server.ServerWide.RachisLogIndexNotifications.WaitForIndexNotification(Int64 index, TimeSpan timeout) in D:\git\ravendb\src\Raven.Server\ServerWide\ClusterStateMachine.cs:line 1851
   at Raven.Server.ServerWide.ClusterStateMachine.WaitForIndexNotification(Int64 index, Nullable`1 timeout) in D:\git\ravendb\src\Raven.Server\ServerWide\ClusterStateMachine.cs:line 677
   at Raven.Server.Web.System.AdminDatabasesHandler.Delete() in D:\git\ravendb\src\Raven.Server\Web\System\AdminDatabasesHandler.cs:line 628
   at Raven.Server.Routing.RequestRouter.HandlePath(RequestHandlerContext reqCtx) in D:\git\ravendb\src\Raven.Server\Routing\RequestRouter.cs:line 129
   at Raven.Server.RavenServerStartup.RequestHandler(HttpContext context) in D:\git\ravendb\src\Raven.Server\RavenServerStartup.cs:line 173
The server at /admin/databases responded with status code: RequestTimeout.]

.... etc
ayende commented 5 years ago

Are you running this inside a debugger, or on some really slow I/O?

janvorli commented 5 years ago

No, without a debugger on my main dev box (Intel XEON E5530, 8 cores, 24GB RAM). Only the disk is not a SSD. Let me try to run it from a SSD drive.

janvorli commented 5 years ago

Oh, I've found that I was running some heavy tests in the background, I've forgotten about it. Now it doesn't timeout anymore. I am sorry for the confusion.

ayende commented 5 years ago

This test is spinning 5 servers internally which may hit the disk. If you have high disk activity and non SSD disk, that can certainly explain it. Great that this works now.

On my end, in 64 bits, it just works. In 32 bits, it fails after about 50 runs

janvorli commented 5 years ago

Hmm, it seems I was just lucky before. That test run has crashed after 4 iterations with stack overflow though. Attempt to re-run the test got the timeout again. So it seems I am really at an edge. Is there a way to increase that timeout? I would like to run it with GC stress enabled to trigger the issue more reliably and that will slow everything down a lot.

ayende commented 5 years ago

Try this:

$env:Raven.Cluster.OperationTimeoutInSec=300
$env:Raven.Cluster.ElectionTimeoutInMs=5000

By the way, nothing in this test should cause StackOverlow, I also run into that error (but couldn't reproduce in WinDBG so can't pull a stack trace).

I'm assuming that this is another instance of: https://github.com/dotnet/coreclr/issues/22597 but this this when you have the access violation on a stack pointer, which result in a different error.

janvorli commented 5 years ago

I have forgotten to ask - are you hitting this issue with both Debug and Release configurations?

benaadams commented 5 years ago

By the way, nothing in this test should cause StackOverlow, I also run into that error (but couldn't reproduce in WinDBG so can't pull a stack trace).

If you are throwing exceptions in (reasonably deep) async it could also be https://github.com/dotnet/roslyn/issues/26567

AndyAyersMS commented 5 years ago

My impression is that the issue behind dotnet/runtime#12038 will only cause spurious AVs in a short window just after calling VirtualAlloc (as the 2.1 jit does when it needs more memory) on that particular osver/kb combination. I would not expect it to lead to a stack overflow.

ayende commented 5 years ago

@janvorli I'm sorry, I forgot to mention that. I'm hitting that in Release configuration. Now that I think about it, the stack overflow I got was when I run it in Debug configuration,not Release

@benaadams We do have a lot of async calls, yes.

@AndyAyersMS I'm not sure what the root cause is for this, but wouldn't calling VirtualAlloc to commit additional stack space and then getting AV on that cause SO?

AndyAyersMS commented 5 years ago

wouldn't calling VirtualAlloc to commit additional stack space and then getting AV on that cause SO?

VirtualAlloc'd memory and stack are in different parts of virtual address space. Stack overflow is usually triggered by accessing a reserved guard page at end of stack. So, no, not likely they'd interfere with one another.

janvorli commented 5 years ago

@ayende with COMPlus_GCStress=4, I've hit an interesting exception that is probably unrelated to the issue we are hunting, but that might indicate a bug in your code. So I am sharing it in case you'd like to double check that:

0
        To attach debugger to test process (x86), use proc-id: 25716.
Max number of concurrent tests is: 4

Unhandled Exception: System.AggregateException: One or more errors occurred. (Cannot access a disposed object.
Object name: 'Leader'.) ---> Sparrow.Utils.LockAlreadyDisposedException: Cannot access a disposed object.
Object name: 'Leader'.
   at Sparrow.Utils.DisposeLock.ThrowDisposed() in E:\issues\ayende\ravendb\src\Sparrow\Utils\DisposeLock.cs:line 75
   at Sparrow.Utils.DisposeLock.EnsureNotDisposed() in E:\issues\ayende\ravendb\src\Sparrow\Utils\DisposeLock.cs:line 60
   at Raven.Server.Rachis.Leader.TryModifyTopology(String nodeTag, String nodeUrl, TopologyModification modification, Task& task, Boolean validateNotInTopology, Action`1 beforeCommit) in E:\issues\ayende\ravendb\src\Raven.Server\Rachis\Leader.cs:line 890
   at Raven.Server.Rachis.RachisConsensus.ModifyTopologyAsync(String nodeTag, String nodeUrl, TopologyModification modification, Boolean validateNotInTopology) in E:\issues\ayende\ravendb\src\Raven.Server\Rachis\RachisConsensus.cs:line 1761
   at Raven.Client.Extensions.TaskExtensions.<>c.<WithCancellation>b__2_0(Task t) in E:\issues\ayende\ravendb\src\Raven.Client\Extensions\TaskExtensions.cs:line 28
   at System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__278_1(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of stack trace from previous location where exception was thrown ---
   at Raven.Server.ServerWide.ServerStore.AddNodeToClusterAsync(String nodeUrl, String nodeTag, Boolean validateNotInTopology, Boolean asWatcher) in E:\issues\ayende\ravendb\src\Raven.Server\ServerWide\ServerStore.cs:line 418
   at Tests.Infrastructure.ClusterTestBase.CreateRaftCluster(Int32 numberOfNodes, Boolean shouldRunInMemory, Nullable`1 leaderIndex, Boolean useSsl, IDictionary`2 customSettings) in E:\issues\ayende\ravendb\test\Tests.Infrastructure\ClusterTestBase.cs:line 479
   at Tests.Infrastructure.ClusterTestBase.CreateRaftClusterAndGetLeader(Int32 numberOfNodes, Boolean shouldRunInMemory, Nullable`1 leaderIndex, Boolean useSsl, IDictionary`2 customSettings) in E:\issues\ayende\ravendb\test\Tests.Infrastructure\ClusterTestBase.cs:line 495
   at SlowTests.Cluster.ClusterTransactionTests.CanPreformSeveralClusterTransactions(Int32 numberOfNodes) in E:\issues\ayende\ravendb\test\SlowTests\Cluster\ClusterTransactionTests.cs:line 114
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at Tryouts.Program.Main(String[] args) in E:\issues\ayende\ravendb\test\Tryouts\Program.cs:line 32
janvorli commented 5 years ago

I keep getting the stack overflow even in Release builds. So I've tried to run it under WinDbg and I can see it getting constant stream of exceptions (I guess at least 10 per second or even more):

Exception type:   System.Net.Sockets.SocketException
Message:          A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
InnerException:   <none>

At the following call stack:
 # ChildEBP RetAddr  
00 1c0feafc 6da02d94 KERNELBASE!RaiseException+0x62 [minkernel\kernelbase\xcpt.c @ 938] 
01 1c0feba4 6dad48fb coreclr!RaiseTheExceptionInternalOnly+0x11d [e:\a\_work\335\s\src\vm\excep.cpp @ 3039] 
02 1c0fec6c 656c5954 coreclr!IL_Throw+0x11b [e:\a\_work\335\s\src\vm\jithelpers.cpp @ 4860] 
03 1c0fec9c 0c92a181 System_Net_Sockets!System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32)+0xb0
04 1c0fed28 0c929a5e Sparrow!Sparrow.Json.JsonOperationContext.ParseToMemory(System.IO.Stream, System.String, UsageMode, ManagedPinnedBuffer, Sparrow.Json.IBlittableDocumentModifier)+0x1c1
05 1c0fed6c 0ca02b30 Sparrow!Sparrow.Json.JsonOperationContext.ParseToMemory(System.IO.Stream, System.String, UsageMode, Sparrow.Json.IBlittableDocumentModifier)+0x46
06 1c0fee40 0ca0262b Raven_Server!Raven.Server.ServerWide.Maintenance.ClusterMaintenanceSupervisor+ClusterNode.ListenToMaintenanceWorker()+0x490
07 1c0fee6c 0ca27390 Raven_Server!Raven.Server.ServerWide.Maintenance.ClusterMaintenanceSupervisor+ClusterNode.<Start>b__18_0(System.Object)+0x23
08 1c0feea0 0ca26e10 Raven_Server!Raven.Server.Utils.PoolOfThreads+PooledThread.DoWork()+0xb8
09 1c0feec8 6ae768dd Raven_Server!Raven.Server.Utils.PoolOfThreads+PooledThread.Run()+0xc8
0a 1c0feee4 6cedc00c System_Threading_Thread!System.Threading.Thread.ThreadMain_ThreadStart()+0x35
0b 1c0feee4 6ceda31d System_Private_CoreLib!System.Threading.ThreadHelper.ThreadStart_Context(System.Object)+0x5c
0c 1c0fef14 6cedc0cc System_Private_CoreLib!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)+0x65
0d 1c0fef28 6daeb0ef System_Private_CoreLib!System.Threading.ThreadHelper.ThreadStart()+0x2c
0e 1c0fef34 6da1fbf1 coreclr!CallDescrWorkerInternal+0x34 [E:\A\_work\335\s\src\vm\i386\asmhelpers.asm @ 618] 
0f (Inline) -------- coreclr!CallDescrWorkerWithHandler+0x52 [e:\a\_work\335\s\src\vm\callhelpers.cpp @ 78] 
10 1c0fefd4 6dadd4e4 coreclr!MethodDescCallSite::CallTargetWorker+0x235 [e:\a\_work\335\s\src\vm\callhelpers.cpp @ 620] 
11 1c0ff0ac 6daf2b06 coreclr!ThreadNative::KickOffThread_Worker+0x104 [e:\a\_work\335\s\src\vm\comsynchronizable.cpp @ 260] 
12 1c0ff0c4 6da1f86a coreclr!ManagedThreadBase_DispatchInner+0x70 [e:\a\_work\335\s\src\vm\threads.cpp @ 8852] 
13 1c0ff170 6da1f7bb coreclr!ManagedThreadBase_DispatchMiddle+0x65 [e:\a\_work\335\s\src\vm\threads.cpp @ 8902] 
14 1c0ff1d4 6dae52b9 coreclr!ManagedThreadBase_DispatchOuter+0x78 [e:\a\_work\335\s\src\vm\threads.cpp @ 9161] 
15 1c0ff1f8 6da89e7c coreclr!ManagedThreadBase_FullTransitionWithAD+0x2f [e:\a\_work\335\s\src\vm\threads.cpp @ 9200] 
16 (Inline) -------- coreclr!ManagedThreadBase::KickOff+0x15 [e:\a\_work\335\s\src\vm\threads.cpp @ 9234] 
17 1c0ff274 6da89d90 coreclr!ThreadNative::KickOffThread+0xcc [e:\a\_work\335\s\src\vm\comsynchronizable.cpp @ 380] 
18 1c0ffe8c 76f90179 coreclr!Thread::intermediateThreadProc+0x50 [e:\a\_work\335\s\src\vm\threads.cpp @ 2255] 
19 1c0ffe9c 771c662d KERNEL32!BaseThreadInitThunk+0x19 [base\win32\client\thread.c @ 64] 
1a 1c0ffef8 771c65fd ntdll!__RtlUserThreadStart+0x2f [minkernel\ntdll\rtlstrt.c @ 1163] 
1b 1c0fff08 00000000 ntdll!_RtlUserThreadStart+0x1b [minkernel\ntdll\rtlstrt.c @ 1080] 
janvorli commented 5 years ago

I have tried to run it many times and I keep getting the stack overflow. Sometimes the test can make 10 or 20 iterations and then it crashes with the stack overflow.

ayende commented 5 years ago

@janvorli The lock dispose issue is fine. We probably timed out in creating the cluster and then timedout, but still had actions scheduled to do work on that. The exception is expected and in real code (not this test) is handled.

The network exception is also expected, this is something that happen when we spin up / down nodes.

ayende commented 5 years ago

Given the difficulties in reproducing this on your end. Is there anyway I can gather more information on my side?

janvorli commented 5 years ago

There is one thing. I would need to enable stresslog so that information on what was moved by GC and when is recorded in the dump. Can you please try to repro it with the following env variables set?

COMPlus_LogLevel=7 
COMPlus_LogFacility=80103 
COMPlus_StressLogSize=2000000 
COMPlus_TotalStressLogSize=40000000 
CORECLR_CONCURRENT_GC=0 
CORECLR_SERVER_GC=0

The last two variables don't have to be set if you cannot repro it with those. It would just make looking into the results easier.

If you could repro it and share the full dump with me, I would be able to get more understanding on the issue.

ayende commented 5 years ago

About the stack overflow thing, I got it as well.

(6e08.5d64): Stack overflow - code c00000fd (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0c963190 ebx=0c9633f0 ecx=0c963da0 edx=00000002 esi=0c963da0 edi=0c96341c
eip=0f8324ae esp=0c962f6c ebp=0c9633d0 iopl=0         nv up ei pl nz ac pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010216
coreclr!_EH_prolog3_GS+0x10:
0f8324ae 53              push    ebx

I think this is related to something we did. Investigating this now

janvorli commented 5 years ago

Btw, I was able to make it build with 3.0 SDK (still targetting 2.1). I just needed to add the following package reference to Raven.Server.csproj:

    <PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="2.1.0" />

After that I've tried to update the Tryouts.csproj to target 3.0 and after fixing an issue with SqlCommandTimeout causing crash in ConfigurationCategory.cs at line 253 (the defaultValue was empty string for some reason on 3.0) and follow up crash that was somehow caused by the DefaultValue attribute being defined as typeof(string), null (changed that to an integer as a quick fix), I was able to run it with 3.0. Still getting the stack overflow there. The reason for this exercise was to check if the issue (GC hole) still happens in 3.0 or if it was fixed already. Btw, when the DefaultValue is set to typeof(string), null, the second failure I've mentioned was

Unhandled Exception: System.AggregateException: One or more errors occurred. (Object of type 'System.String' cannot be converted to type 'System.Nullable`1[Raven.Server.Config.Settings.TimeSetting]'.) ---> System.ArgumentException: Object of type 'System.String' cannot be converted to type 'System.Nullable`1[Raven.Server.Config.Settings.TimeSetting]'.

Not sure if .NET Core 3.0 has become stricter here or if both issues I've seen with the SqlCommandTimeout are a bug in 3.0. My expertise is mostly in the native parts of the runtime.

janvorli commented 5 years ago

@ayende I've added Console.WriteLine($"HandleUnsuccessfulResponse {response.StatusCode}"); at the beginning of the HandleUnsuccessfulResponse. I can see that when the stack overflow happens, there are thousands of HandleUnsuccessfulResponse Gone printed to the console. In the iterations that succeeded before the stack overflow, I see other HandleUnsuccessfulResponse, but never with the "Gone".

ayende commented 5 years ago

We found the cause for the SO, yes. I'll have a branch ready with just this fix for you in 15 minutes

ayende commented 5 years ago

Here is the branch without the SO issue: https://github.com/ayende/ravendb/tree/av-repro

janvorli commented 5 years ago

Unfortunately I got stack overflow with this version too. Although it has taken much longer than before (72 iterations), the call stack (at least the repeating part looks the same as before.)

But, the good news is that with COMPlus_GCStress=4, I was able to repro the issue quite quickly. However, the Sparrow.Utils.LockAlreadyDisposedException: Cannot access a disposed object. keeps kicking in quite often. Could you please recommend me a good place in your code where I can swallow it so that it doesn't get in the way of my testing?

janvorli commented 5 years ago

@ayende I've forgotten to mention one more env vars that needs to be set to get the stresslog:

COMPlus_StressLog=1

However, since I can repro it now, I don't need a crash dump from you anymore. So I just mention it for completeness and your future reference should you need it.

ayende commented 5 years ago

I'm afraid that LockAlreadyDisposedException is actually expected in this scenario. I changed my code to just try/catch around the test run in Program.cs and keep running it. That allow me to ignore test failures / errors while searching for the actual issue. Really happy that you were able to reproduce this on your side. It is a GC hole, right?

janvorli commented 5 years ago

Yes, it is a GC hole. And I've just verified it is still present in 3.0. I've ended up placing try catch around the test too and so I can repro the issue reliably in the first successful iteration (with COMPlus_GCStress= 4 set). Now I need to investigate whether the issue causing the hole is in JIT or in EH code.

ayende commented 5 years ago

Is this limited to x86, or should we worry about this on 64 bits as well?

janvorli commented 5 years ago

I will try to run it with the GC stress enabled on x64 to figure that out.

arekpalinski commented 5 years ago

It seems we're experiencing this issue on x64 machine as well:

(3448.3530): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
coreclr!VirtualCallStubManager::ResolveWorker+0x6c:
00007ffe`6789600c 488b4618        mov     rax,qword ptr [rsi+18h] ds:00027d3c`d612b6a5=????????????????
0:031> k
 # Child-SP          RetAddr           Call Site
00 000000ad`0863d4c0 00007ffe`67895c36 coreclr!VirtualCallStubManager::ResolveWorker+0x6c [e:\a\_work\104\s\src\vm\virtualcallstub.cpp @ 1712] 
01 000000ad`0863d9e0 00007ffe`679e105d coreclr!VSD_ResolveWorker+0x2f6 [e:\a\_work\104\s\src\vm\virtualcallstub.cpp @ 1611] 
Unable to read dynamic function table entries
Unable to read dynamic function table entries
02 000000ad`0863db50 00007ffe`08f8f64f coreclr!ResolveWorkerAsmStub+0x6d [E:\A\_work\104\s\src\vm\amd64\VirtualCallStubAMD64.asm @ 40] 
Unable to read dynamic function table entries
Unable to read dynamic function table entries
Unable to read dynamic function table entries
03 000000ad`0863dc00 000001df`bb0f0078 0x00007ffe`08f8f64f
Unable to read dynamic function table entries
04 000000ad`0863dc08 000001e2`39800040 0x000001df`bb0f0078
05 000000ad`0863dc10 00000000`00003cd8 0x000001e2`39800040
06 000000ad`0863dc18 fffffffd`818f1d08 0x3cd8
07 000000ad`0863dc20 000000ad`0863dc68 0xfffffffd`818f1d08
08 000000ad`0863dc28 000000ad`0863dc60 0x000000ad`0863dc68
09 000000ad`0863dc30 000000ad`0863dc00 0x000000ad`0863dc60
0a 000000ad`0863dc38 000001e2`07834ca0 0x000000ad`0863dc00
0b 000000ad`0863dc40 000001df`bb0e6360 0x000001e2`07834ca0
0c 000000ad`0863dc48 000001df`bb0f0078 0x000001df`bb0e6360
0d 000000ad`0863dc50 000001e2`388916e7 0x000001df`bb0f0078
0e 000000ad`0863dc58 00000000`00000000 0x000001e2`388916e7
0:031> !clrstack
No export clrstack found
0:031> .load C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.1.8\sos.dll
0:031> !clrstack
OS Thread Id: 0x3530 (31)
        Child SP               IP Call Site
000000ad0863dac0 00007ffe6789600c [GCFrame: 000000ad0863dac0] 
000000ad0863da28 00007ffe6789600c [StubDispatchFrame: 000000ad0863da28] System.IDisposable.Dispose()
000000ad0863dc00 00007ffe08f8f64f Voron.Data.Fixed.FixedSizeTree.AddEmbeddedEntry(Int64, Boolean ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Fixed\FixedSizeTree.cs @ 666]
000000ad0863dca0 00007ffe08f6fbc3 Voron.Data.Fixed.FixedSizeTree.DirectAdd(Int64, Boolean ByRef, Byte* ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Fixed\FixedSizeTree.cs @ 250]
000000ad0863dcf0 00007ffe08f6f984 Voron.Data.Fixed.FixedSizeTree.Add(Int64, Voron.Slice) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Fixed\FixedSizeTree.cs @ 218]
000000ad0863dd60 00007ffe08f9221a Voron.Data.Tables.Table.InsertIndexValuesFor(Int64, Voron.Data.Tables.TableValueReader ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Tables\Table.cs @ 650]
000000ad0863de60 00007ffe0905c39d Voron.Data.Tables.Table.Insert(Voron.Data.Tables.TableValueBuilder) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Tables\Table.cs @ 496]
000000ad0863df30 00007ffe0a41cbdc Raven.Server.Documents.Revisions.RevisionsStorage.Put(Raven.Server.ServerWide.Context.DocumentsOperationContext, System.String, Sparrow.Json.BlittableJsonReaderObject, Raven.Server.Documents.DocumentFlags, Raven.Server.Documents.NonPersistentDocumentFlags, System.String, Int64, Raven.Client.Documents.Operations.Revisions.RevisionsCollectionConfiguration, Raven.Server.Documents.CollectionName) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\Revisions\RevisionsStorage.cs @ 381]
000000ad0863e490 00007ffe0a413a3a Raven.Server.Documents.Replication.IncomingReplicationHandler+MergedDocumentReplicationCommand.ExecuteCmd(Raven.Server.ServerWide.Context.DocumentsOperationContext) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\Replication\IncomingReplicationHandler.cs @ 1292]
000000ad0863e7a0 00007ffe0a406513 Raven.Server.Documents.TransactionOperationsMerger.ExecutePendingOperationsInTransaction(System.Collections.Generic.List`1, Raven.Server.ServerWide.Context.DocumentsOperationContext, System.Threading.Tasks.Task, DurationMeasurement ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\TransactionOperationsMerger.cs @ 827]
000000ad0863e820 00007ffe0a4057cf Raven.Server.Documents.TransactionOperationsMerger.MergeTransactionsOnce() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\TransactionOperationsMerger.cs @ 506]
000000ad0863e930 00007ffe08fc1046 Raven.Server.Documents.TransactionOperationsMerger.MergeOperationThreadProc() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\TransactionOperationsMerger.cs @ 320]
000000ad0863e9e0 00007ffe08c8385f Raven.Server.Utils.PoolOfThreads+PooledThread.DoWork() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Utils\PoolOfThreads.cs @ 207]
000000ad0863ea40 00007ffe08c82beb Raven.Server.Utils.PoolOfThreads+PooledThread.Run() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Utils\PoolOfThreads.cs @ 178]
000000ad0863ea90 00007ffe80597080 System.Threading.Thread.ThreadMain_ThreadStart()
000000ad0863eac0 00007ffe5dcdd419 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
000000ad0863ed28 00007ffe679e0473 [GCFrame: 000000ad0863ed28] 
000000ad0863efa0 00007ffe679e0473 [DebuggerU2MCatchHandlerFrame: 000000ad0863efa0] 

This is on Windows Server 2019 Datacenter (version 1809):

PS C:\Users\Administrator> [System.Environment]::OSVersion.Version

Major  Minor  Build  Revision
-----  -----  -----  --------
10     0      17763  0

PS C:\Users\Administrator> Get-HotFix

Source        Description      HotFixID      InstalledBy          InstalledOn
------        -----------      --------      -----------          -----------
EC2AMAZ-I7... Update           KB4483452     NT AUTHORITY\SYSTEM  2/13/2019 12:00:00 AM
EC2AMAZ-I7... Update           KB4462930     NT AUTHORITY\SYSTEM  12/12/2018 12:00:00 AM
EC2AMAZ-I7... Update           KB4470502     NT AUTHORITY\SYSTEM  12/12/2018 12:00:00 AM
EC2AMAZ-I7... Security Update  KB4470788     NT AUTHORITY\SYSTEM  12/12/2018 12:00:00 AM
EC2AMAZ-I7... Security Update  KB4471331     NT AUTHORITY\SYSTEM  12/12/2018 12:00:00 AM
EC2AMAZ-I7... Update           KB4480056     NT AUTHORITY\SYSTEM  1/9/2019 12:00:00 AM
EC2AMAZ-I7... Security Update  KB4480979     NT AUTHORITY\SYSTEM  1/9/2019 12:00:00 AM
EC2AMAZ-I7... Security Update  KB4487038     NT AUTHORITY\SYSTEM  2/13/2019 12:00:00 AM
EC2AMAZ-I7... Security Update  KB4487044     NT AUTHORITY\SYSTEM  2/26/2019 12:00:00 AM
janvorli commented 5 years ago

@arekpalinski yes, it may be the same issue. Would you be able to share the crash dump?

ayende commented 5 years ago

@janvorli The advantage of this one is that it is very quickly reproducable. The disadvantage is that it is a production system (we deployed a new server). Is there a way to send you the dump file privately?

janvorli commented 5 years ago

In that case, I will rather ask you to extract some stuff for me. Can you please run the following sos commands and share the output?

!u -gcinfo 00007ffe6789600c
!gcinfo 00007ffe6789600c
!ehinfo 00007ffe6789600c
!ip2md 00007ffe6789600c

The last command will dump MethodDesc value. With that value, please run

!dumpil the_method_desc_value
arekpalinski commented 5 years ago

I reproduced it once again and took the dump of process which we can share privately if necessary. It was reproduced with the following env variables set:

COMPlus_LogLevel=7 
COMPlus_LogFacility=80103 
COMPlus_StressLogSize=2000000 
COMPlus_TotalStressLogSize=40000000 
CORECLR_CONCURRENT_GC=0 
CORECLR_SERVER_GC=0

Now WinDbg output is as follow:

fa4.30f0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
coreclr!VirtualCallStubManager::ResolveWorker+0x6c:
00007ffe`6d37600c 488b4618        mov     rax,qword ptr [rsi+18h] ds:00027d3c`d612b6a5=????????????????
0:037> .load C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.1.8\sos.dll
0:037> !clrstack
OS Thread Id: 0x30f0 (37)
        Child SP               IP Call Site
000000707f1bd790 00007ffe6d37600c [GCFrame: 000000707f1bd790] 
000000707f1bd6f8 00007ffe6d37600c [StubDispatchFrame: 000000707f1bd6f8] System.IDisposable.Dispose()
000000707f1bd8d0 00007ffe0ea8cbaf Voron.Data.Fixed.FixedSizeTree.AddEmbeddedEntry(Int64, Boolean ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Fixed\FixedSizeTree.cs @ 666]
000000707f1bd970 00007ffe0ea410d3 Voron.Data.Fixed.FixedSizeTree.DirectAdd(Int64, Boolean ByRef, Byte* ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Fixed\FixedSizeTree.cs @ 250]
000000707f1bd9c0 00007ffe0ea40e94 Voron.Data.Fixed.FixedSizeTree.Add(Int64, Voron.Slice) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Fixed\FixedSizeTree.cs @ 218]
000000707f1bda30 00007ffe0ea8faea Voron.Data.Tables.Table.InsertIndexValuesFor(Int64, Voron.Data.Tables.TableValueReader ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Tables\Table.cs @ 650]
000000707f1bdb30 00007ffe0f8f4d8d Voron.Data.Tables.Table.Insert(Voron.Data.Tables.TableValueBuilder) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Voron\Data\Tables\Table.cs @ 496]
000000707f1bdc00 00007ffe0feffc1c Raven.Server.Documents.Revisions.RevisionsStorage.Put(Raven.Server.ServerWide.Context.DocumentsOperationContext, System.String, Sparrow.Json.BlittableJsonReaderObject, Raven.Server.Documents.DocumentFlags, Raven.Server.Documents.NonPersistentDocumentFlags, System.String, Int64, Raven.Client.Documents.Operations.Revisions.RevisionsCollectionConfiguration, Raven.Server.Documents.CollectionName) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\Revisions\RevisionsStorage.cs @ 381]
000000707f1be160 00007ffe0ff2a99a Raven.Server.Documents.Replication.IncomingReplicationHandler+MergedDocumentReplicationCommand.ExecuteCmd(Raven.Server.ServerWide.Context.DocumentsOperationContext) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\Replication\IncomingReplicationHandler.cs @ 1292]
000000707f1be470 00007ffe0fef0643 Raven.Server.Documents.TransactionOperationsMerger.ExecutePendingOperationsInTransaction(System.Collections.Generic.List`1, Raven.Server.ServerWide.Context.DocumentsOperationContext, System.Threading.Tasks.Task, DurationMeasurement ByRef) [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\TransactionOperationsMerger.cs @ 827]
000000707f1be4f0 00007ffe0feef98f Raven.Server.Documents.TransactionOperationsMerger.MergeTransactionsOnce() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\TransactionOperationsMerger.cs @ 506]
000000707f1be600 00007ffe0ea914a6 Raven.Server.Documents.TransactionOperationsMerger.MergeOperationThreadProc() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Documents\TransactionOperationsMerger.cs @ 320]
000000707f1be6b0 00007ffe0e76344f Raven.Server.Utils.PoolOfThreads+PooledThread.DoWork() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Utils\PoolOfThreads.cs @ 207]
000000707f1be710 00007ffe0e762feb Raven.Server.Utils.PoolOfThreads+PooledThread.Run() [C:\Builds\RavenDB-4.1-Nightly\20190228-0530\src\Raven.Server\Utils\PoolOfThreads.cs @ 178]
000000707f1be760 00007ffe7d7c7080 System.Threading.Thread.ThreadMain_ThreadStart()
000000707f1be790 00007ffe6bcad419 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
000000707f1be9f8 00007ffe6d4c0473 [GCFrame: 000000707f1be9f8] 
000000707f1bec70 00007ffe6d4c0473 [DebuggerU2MCatchHandlerFrame: 000000707f1bec70] 

0:037> k
 # Child-SP          RetAddr           Call Site
00 00000070`7f1bd190 00007ffe`6d375c36 coreclr!VirtualCallStubManager::ResolveWorker+0x6c [e:\a\_work\104\s\src\vm\virtualcallstub.cpp @ 1712] 
01 00000070`7f1bd6b0 00007ffe`6d4c105d coreclr!VSD_ResolveWorker+0x2f6 [e:\a\_work\104\s\src\vm\virtualcallstub.cpp @ 1611] 
02 00000070`7f1bd820 00007ffe`0ea8cbaf coreclr!ResolveWorkerAsmStub+0x6d [E:\A\_work\104\s\src\vm\amd64\VirtualCallStubAMD64.asm @ 40] 
03 00000070`7f1bd8d0 00007ffe`0ea410d3 0x00007ffe`0ea8cbaf
04 00000070`7f1bd970 00007ffe`0ea40e94 0x00007ffe`0ea410d3
05 00000070`7f1bd9c0 00007ffe`0ea8faea 0x00007ffe`0ea40e94
06 00000070`7f1bda30 00007ffe`0f8f4d8d 0x00007ffe`0ea8faea
07 00000070`7f1bdb30 00007ffe`0feffc1c 0x00007ffe`0f8f4d8d
08 00000070`7f1bdc00 00007ffe`0ff2a99a 0x00007ffe`0feffc1c
09 00000070`7f1be160 00007ffe`0fef0643 0x00007ffe`0ff2a99a
0a 00000070`7f1be470 00007ffe`0feef98f 0x00007ffe`0fef0643
0b 00000070`7f1be4f0 00007ffe`0ea914a6 0x00007ffe`0feef98f
0c 00000070`7f1be600 00007ffe`0e76344f 0x00007ffe`0ea914a6
0d 00000070`7f1be6b0 00007ffe`0e762feb 0x00007ffe`0e76344f
0e 00000070`7f1be710 00007ffe`7d7c7080 0x00007ffe`0e762feb
0f 00000070`7f1be760 00007ffe`6bcad419 System_Threading_Thread!System.Threading.Thread.ThreadMain_ThreadStart()$##600001C+0x40
10 00000070`7f1be790 00007ffe`6d4c0473 System_Private_CoreLib!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)$##60027C0+0x89
11 00000070`7f1be810 00007ffe`6d3b0db9 coreclr!CallDescrWorkerInternal+0x83 [E:\A\_work\104\s\src\vm\amd64\CallDescrWorkerAMD64.asm @ 101] 
12 (Inline Function) --------`-------- coreclr!CallDescrWorkerWithHandler+0x53 [e:\a\_work\104\s\src\vm\callhelpers.cpp @ 78] 
13 00000070`7f1be850 00007ffe`6d4ae343 coreclr!MethodDescCallSite::CallTargetWorker+0x2b5 [e:\a\_work\104\s\src\vm\callhelpers.cpp @ 628] 
14 (Inline Function) --------`-------- coreclr!StressLog::InlinedStressLogOn+0x5 [e:\a\_work\104\s\src\utilcode\stresslog.cpp @ 594] 
15 (Inline Function) --------`-------- coreclr!StressLog::LogOn+0x5 [e:\a\_work\104\s\src\utilcode\stresslog.cpp @ 628] 
16 00000070`7f1be9a0 00007ffe`6d3b0951 coreclr!ThreadNative::KickOffThread_Worker+0x143 [e:\a\_work\104\s\src\vm\comsynchronizable.cpp @ 260] 
17 (Inline Function) --------`-------- coreclr!ManagedThreadBase_DispatchInner+0xf0 [e:\a\_work\104\s\src\vm\threads.cpp @ 8850] 
18 00000070`7f1beb10 00007ffe`6d3b0743 coreclr!ManagedThreadBase_DispatchMiddle+0x179 [e:\a\_work\104\s\src\vm\threads.cpp @ 8901] 
19 00000070`7f1bec40 00007ffe`6d4b863f coreclr!ManagedThreadBase_DispatchOuter+0xaf [e:\a\_work\104\s\src\vm\threads.cpp @ 9140] 
1a 00000070`7f1bece0 00007ffe`6d3fc47a coreclr!ManagedThreadBase_FullTransitionWithAD+0x2f [e:\a\_work\104\s\src\vm\threads.cpp @ 9200] 
1b (Inline Function) --------`-------- coreclr!ManagedThreadBase::KickOff+0x20 [e:\a\_work\104\s\src\vm\threads.cpp @ 9234] 
1c 00000070`7f1bed40 00007ffe`6d3fc34a coreclr!ThreadNative::KickOffThread+0x10a [e:\a\_work\104\s\src\vm\comsynchronizable.cpp @ 380] 
1d 00000070`7f1bee20 00007ffe`913381f4 coreclr!Thread::intermediateThreadProc+0x8a [e:\a\_work\104\s\src\vm\threads.cpp @ 2255] 
1e 00000070`7f1bf860 00007ffe`93d3a251 KERNEL32!BaseThreadInitThunk+0x14
1f 00000070`7f1bf890 00000000`00000000 ntdll!RtlUserThreadStart+0x21
0:037> !u -gcinfo 00007ffe6d37600c
Unmanaged code
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1658
00007ffe`6d375fa0 4c894c2420      mov     qword ptr [rsp+20h],r9
00007ffe`6d375fa5 4c89442418      mov     qword ptr [rsp+18h],r8
00007ffe`6d375faa 4889542410      mov     qword ptr [rsp+10h],rdx
00007ffe`6d375faf 48894c2408      mov     qword ptr [rsp+8],rcx
00007ffe`6d375fb4 53              push    rbx
00007ffe`6d375fb5 56              push    rsi
00007ffe`6d375fb6 57              push    rdi
00007ffe`6d375fb7 4154            push    r12
00007ffe`6d375fb9 4155            push    r13
00007ffe`6d375fbb 4156            push    r14
00007ffe`6d375fbd 4157            push    r15
00007ffe`6d375fbf 4881ece0040000  sub     rsp,4E0h
00007ffe`6d375fc6 48c78424f0030000feffffff mov qword ptr [rsp+3F0h],0FFFFFFFFFFFFFFFEh
00007ffe`6d375fd2 498bf9          mov     rdi,r9
00007ffe`6d375fd5 488bd9          mov     rbx,rcx
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1669
00007ffe`6d375fd8 498b00          mov     rax,qword ptr [r8]
00007ffe`6d375fdb 488b30          mov     rsi,qword ptr [rax]
00007ffe`6d375fde 4889742438      mov     qword ptr [rsp+38h],rsi
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1705
00007ffe`6d375fe3 4533ed          xor     r13d,r13d
00007ffe`6d375fe6 4c89ac2480000000 mov     qword ptr [rsp+80h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1707
00007ffe`6d375fee 458bf5          mov     r14d,r13d
00007ffe`6d375ff1 44896c2444      mov     dword ptr [rsp+44h],r13d
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1712
00007ffe`6d375ff6 488b8990040000  mov     rcx,qword ptr [rcx+490h]
00007ffe`6d375ffd 488b01          mov     rax,qword ptr [rcx]
00007ffe`6d376000 488b4018        mov     rax,qword ptr [rax+18h]
00007ffe`6d376004 ff15ee8b4100    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d37600a 85c0            test    eax,eax
00007ffe`6d37600c 488b4618        mov     rax,qword ptr [rsi+18h]
00007ffe`6d376010 488b4c3048      mov     rcx,qword ptr [rax+rsi+48h]
00007ffe`6d376015 0f85c20d0000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0xe3d (00007ffe`6d376ddd)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1715
00007ffe`6d376ddd 488b4178        mov     rax,qword ptr [rcx+78h]
00007ffe`6d376de1 4c8b80d8020000  mov     r8,qword ptr [rax+2D8h]
00007ffe`6d376de8 4c89842480000000 mov     qword ptr [rsp+80h],r8
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1719
00007ffe`6d376df0 4c3bc3          cmp     r8,rbx
00007ffe`6d376df3 410f95c6        setne   r14b
00007ffe`6d376df7 4489742444      mov     dword ptr [rsp+44h],r14d
00007ffe`6d376dfc e944f2ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xa5 (00007ffe`6d376045)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1722
00007ffe`6d4f6e8a 488b4618        mov     rax,qword ptr [rsi+18h]
00007ffe`6d4f6e8e 488b4c3048      mov     rcx,qword ptr [rax+rsi+48h]
00007ffe`6d4f6e93 488b4178        mov     rax,qword ptr [rcx+78h]
00007ffe`6d4f6e97 4c8b80d8020000  mov     r8,qword ptr [rax+2D8h]
00007ffe`6d4f6e9e 4c89842480000000 mov     qword ptr [rsp+80h],r8
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1723
00007ffe`6d4f6ea6 4c3bc3          cmp     r8,rbx
00007ffe`6d4f6ea9 7410            je      coreclr!VirtualCallStubManager::ResolveWorker+0x180f1b (00007ffe`6d4f6ebb)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1725
00007ffe`6d4f6eab 41be01000000    mov     r14d,1
00007ffe`6d4f6eb1 4489742444      mov     dword ptr [rsp+44h],r14d
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1727
00007ffe`6d4f6eb6 e98af1e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xa5 (00007ffe`6d376045)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1729
00007ffe`6d4f6ebb 4c89ac2480000000 mov     qword ptr [rsp+80h],r13
00007ffe`6d4f6ec3 e97df1e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xa5 (00007ffe`6d376045)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1733
00007ffe`6d376045 4489b42488000000 mov     dword ptr [rsp+88h],r14d
00007ffe`6d37604d 4489b42494000000 mov     dword ptr [rsp+94h],r14d
00007ffe`6d376055 ff837c050000    inc     dword ptr [rbx+57Ch]
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1740
00007ffe`6d37605b 4c896c2448      mov     qword ptr [rsp+48h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1741
00007ffe`6d376060 4c896c2450      mov     qword ptr [rsp+50h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1742
00007ffe`6d376065 44896c2440      mov     dword ptr [rsp+40h],r13d
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1752
00007ffe`6d3770ff 4c89ac2448010000 mov     qword ptr [rsp+148h],r13
00007ffe`6d377107 4489ac2450010000 mov     dword ptr [rsp+150h],r13d
00007ffe`6d37710f e9b1efffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x125 (00007ffe`6d3760c5)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1758
00007ffe`6d3760e1 488d0580304800  lea     rax,[coreclr!DispatchEntry::`vftable' (00007ffe`6d7f9168)]
00007ffe`6d3760e8 4889442458      mov     qword ptr [rsp+58h],rax
00007ffe`6d3760ed 4c896c2460      mov     qword ptr [rsp+60h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1759
00007ffe`6d3760f2 4c8d442458      lea     r8,[rsp+58h]
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1760
00007ffe`6d3770ba 488b8c24e0000000 mov     rcx,qword ptr [rsp+0E0h]
00007ffe`6d3770c2 4885c9          test    rcx,rcx
00007ffe`6d3770c5 0f842bfeffff    je      coreclr!VirtualCallStubManager::ResolveWorker+0xf56 (00007ffe`6d376ef6)
00007ffe`6d3770cb ba08000000      mov     edx,8
00007ffe`6d3770d0 e807ac1400      call    coreclr!operator delete (00007ffe`6d4c1cdc)
00007ffe`6d3770d5 e91cfeffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xf56 (00007ffe`6d376ef6)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1762
00007ffe`6d4f6ec8 418bc5          mov     eax,r13d
00007ffe`6d4f6ecb e985f3e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x2b5 (00007ffe`6d376255)
00007ffe`6d4f6ed0 b801000000      mov     eax,1
00007ffe`6d4f6ed5 e986f3e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x2c0 (00007ffe`6d376260)
00007ffe`6d4f6eda 488b4a1d        mov     rcx,qword ptr [rdx+1Dh]
00007ffe`6d4f6ede 48898c2400030000 mov     qword ptr [rsp+300h],rcx
00007ffe`6d4f6ee6 48898c2460020000 mov     qword ptr [rsp+260h],rcx
00007ffe`6d4f6eee e990f3e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x2e3 (00007ffe`6d376283)
00007ffe`6d4f6ef3 488d4c2458      lea     rcx,[rsp+58h]
00007ffe`6d4f6ef8 ff15fa7c2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f6efe 4c8b442458      mov     r8,qword ptr [rsp+58h]
00007ffe`6d4f6f03 e987f3e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x2ef (00007ffe`6d37628f)
00007ffe`6d4f6f08 ff15ea7c2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f6f0e 90              nop
00007ffe`6d4f6f0f e90cfee7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xd80 (00007ffe`6d376d20)
00007ffe`6d4f6f14 4c8b442438      mov     r8,qword ptr [rsp+38h]
00007ffe`6d4f6f19 488b942438050000 mov     rdx,qword ptr [rsp+538h]
00007ffe`6d4f6f21 488d4c2458      lea     rcx,[rsp+58h]
00007ffe`6d4f6f26 ff15cc7c2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f6f2c 90              nop
00007ffe`6d4f6f2d e96ef3e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x300 (00007ffe`6d3762a0)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1763
00007ffe`6d376d90 e96df5ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x362 (00007ffe`6d376302)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1765
00007ffe`6d4f6f32 418bc5          mov     eax,r13d
00007ffe`6d4f6f35 e925fee7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xdbf (00007ffe`6d376d5f)
00007ffe`6d4f6f3a b801000000      mov     eax,1
00007ffe`6d4f6f3f e926fee7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xdca (00007ffe`6d376d6a)
00007ffe`6d4f6f44 488b442460      mov     rax,qword ptr [rsp+60h]
00007ffe`6d4f6f49 488b4811        mov     rcx,qword ptr [rax+11h]
00007ffe`6d4f6f4d 48898c2418030000 mov     qword ptr [rsp+318h],rcx
00007ffe`6d4f6f55 e929fee7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xde3 (00007ffe`6d376d83)
00007ffe`6d4f6f5a 498bcd          mov     rcx,r13
00007ffe`6d4f6f5d e921fee7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xde3 (00007ffe`6d376d83)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1766
00007ffe`6d376d88 c744244001000000 mov     dword ptr [rsp+40h],1
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1774
00007ffe`6d376302 4885c9          test    rcx,rcx
00007ffe`6d376305 0f85fd010000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0x568 (00007ffe`6d376508)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1777
00007ffe`6d37630b 488d05362e4800  lea     rax,[coreclr!ComUtil::detail::DeriveTypeList<mpl::type_list<ComUtil::ItfBase<CLRPrivBinderUtil::CLRPrivResourceBase<ICLRPrivResourcePath> >,mpl::type_list<ICLRPrivResourcePath,mpl::null_type> > >::`vftable' (0000700007ffe`6d376312 4889442468      mov     qword ptr [rsp+68h],rax
00007ffe`6d376317 488d05fae74100  lea     rax,[coreclr!ResolveCacheEntry::`vftable' (00007ffe`6d794b18)]
00007ffe`6d37631e 4889442468      mov     qword ptr [rsp+68h],rax
00007ffe`6d376323 4c896c2470      mov     qword ptr [rsp+70h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1778
00007ffe`6d376328 4c8d442468      lea     r8,[rsp+68h]
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1779
00007ffe`6d377114 488b8c24f0000000 mov     rcx,qword ptr [rsp+0F0h]
00007ffe`6d37711c 4885c9          test    rcx,rcx
00007ffe`6d37711f 0f84a9feffff    je      coreclr!VirtualCallStubManager::ResolveWorker+0x102e (00007ffe`6d376fce)
00007ffe`6d377125 ba08000000      mov     edx,8
00007ffe`6d37712a e8adab1400      call    coreclr!operator delete (00007ffe`6d4c1cdc)
00007ffe`6d37712f e99afeffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x102e (00007ffe`6d376fce)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1781
00007ffe`6d4f6f62 488d4c2468      lea     rcx,[rsp+68h]
00007ffe`6d4f6f67 488bc2          mov     rax,rdx
00007ffe`6d4f6f6a ff15887c2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f6f70 488b4c2468      mov     rcx,qword ptr [rsp+68h]
00007ffe`6d4f6f75 e915f5e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x4ef (00007ffe`6d37648f)
00007ffe`6d4f6f7a ff15787c2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f6f80 90              nop
00007ffe`6d4f6f81 e9b3fce7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xc99 (00007ffe`6d376c39)
00007ffe`6d4f6f86 4c8b442438      mov     r8,qword ptr [rsp+38h]
00007ffe`6d4f6f8b 488b942438050000 mov     rdx,qword ptr [rsp+538h]
00007ffe`6d4f6f93 488d4c2468      lea     rcx,[rsp+68h]
00007ffe`6d4f6f98 ff155a7c2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f6f9e 4c8bbc2438050000 mov     r15,qword ptr [rsp+538h]
00007ffe`6d4f6fa6 e9f0f4e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x4fb (00007ffe`6d37649b)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1782
00007ffe`6d3764fa 4885db          test    rbx,rbx
00007ffe`6d3764fd 0f850a0a0000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0xf6d (00007ffe`6d376f0d)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1784
00007ffe`6d4f6fab 498bcd          mov     rcx,r13
00007ffe`6d4f6fae e96cffe7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xf7f (00007ffe`6d376f1f)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1785
00007ffe`6d376f24 c744244001000000 mov     dword ptr [rsp+40h],1
00007ffe`6d376f2c e9d2f5ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x563 (00007ffe`6d376503)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1790
00007ffe`6d4f6fb3 e82460f1ff      call    coreclr!UnwindFrameChain (00007ffe`6d40cfdc)
00007ffe`6d4f6fb8 90              nop
00007ffe`6d4f6fb9 e974f5e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x592 (00007ffe`6d376532)
00007ffe`6d4f6fbe 488b4310        mov     rax,qword ptr [rbx+10h]
00007ffe`6d4f6fc2 85d2            test    edx,edx
00007ffe`6d4f6fc4 742c            je      coreclr!VirtualCallStubManager::ResolveWorker+0x181052 (00007ffe`6d4f6ff2)
00007ffe`6d4f6fc6 4889842460010000 mov     qword ptr [rsp+160h],rax
00007ffe`6d4f6fce c7400c01000000  mov     dword ptr [rax+0Ch],1
00007ffe`6d4f6fd5 8b055d263e00    mov     eax,dword ptr [coreclr!g_TrapReturningThreads (00007ffe`6d8d9638)]
00007ffe`6d4f6fdb 85c0            test    eax,eax
00007ffe`6d4f6fdd 740e            je      coreclr!VirtualCallStubManager::ResolveWorker+0x18104d (00007ffe`6d4f6fed)
00007ffe`6d4f6fdf 488b8c2460010000 mov     rcx,qword ptr [rsp+160h]
00007ffe`6d4f6fe7 e8603cf7ff      call    coreclr!Thread::RareDisablePreemptiveGC (00007ffe`6d46ac4c)
00007ffe`6d4f6fec 90              nop
00007ffe`6d4f6fed e952f5e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x5a4 (00007ffe`6d376544)
00007ffe`6d4f6ff2 4889842468010000 mov     qword ptr [rsp+168h],rax
00007ffe`6d4f6ffa 4489680c        mov     dword ptr [rax+0Ch],r13d
00007ffe`6d4f6ffe 4883c008        add     rax,8
00007ffe`6d4f7002 4889842488030000 mov     qword ptr [rsp+388h],rax
00007ffe`6d4f700a 8b08            mov     ecx,dword ptr [rax]
00007ffe`6d4f700c 898c24d0010000  mov     dword ptr [rsp+1D0h],ecx
00007ffe`6d4f7013 f6c11f          test    cl,1Fh
00007ffe`6d4f7016 740e            je      coreclr!VirtualCallStubManager::ResolveWorker+0x181086 (00007ffe`6d4f7026)
00007ffe`6d4f7018 488b8c2468010000 mov     rcx,qword ptr [rsp+168h]
00007ffe`6d4f7020 e87745fcff      call    coreclr!Thread::RareEnablePreemptiveGC (00007ffe`6d4bb59c)
00007ffe`6d4f7025 90              nop
00007ffe`6d4f7026 e919f5e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x5a4 (00007ffe`6d376544)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1793
00007ffe`6d4f702b 4533ed          xor     r13d,r13d
00007ffe`6d4f702e 488b7c2448      mov     rdi,qword ptr [rsp+48h]
00007ffe`6d4f7033 8b8c2488000000  mov     ecx,dword ptr [rsp+88h]
00007ffe`6d4f703a 894c2444        mov     dword ptr [rsp+44h],ecx
00007ffe`6d4f703e e906f5e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x5a9 (00007ffe`6d376549)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1799
00007ffe`6d376549 488b442438      mov     rax,qword ptr [rsp+38h]
00007ffe`6d37654e 488d1df32b4800  lea     rbx,[coreclr!ComUtil::detail::DeriveTypeList<mpl::type_list<ComUtil::ItfBase<CLRPrivBinderUtil::CLRPrivResourceBase<ICLRPrivResourcePath> >,mpl::type_list<ICLRPrivResourcePath,mpl::null_type> > >::`vftable' (0000700007ffe`6d376555 4c8ba42438050000 mov     r12,qword ptr [rsp+538h]
00007ffe`6d37655d 48837c245000    cmp     qword ptr [rsp+50h],0
00007ffe`6d376563 751c            jne     coreclr!VirtualCallStubManager::ResolveWorker+0x5e1 (00007ffe`6d376581)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1802
00007ffe`6d376565 4c8d4c2450      lea     r9,[rsp+50h]
00007ffe`6d37656a 4c8b842430050000 mov     r8,qword ptr [rsp+530h]
00007ffe`6d376572 498bd4          mov     rdx,r12
00007ffe`6d376575 488bc8          mov     rcx,rax
00007ffe`6d376578 e8030e0000      call    coreclr!VirtualCallStubManager::Resolver (00007ffe`6d377380)
00007ffe`6d37657d 89442440        mov     dword ptr [rsp+40h],eax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1831
00007ffe`6d4f7043 4c89ac24c8000000 mov     qword ptr [rsp+0C8h],r13
00007ffe`6d4f704b 4489ac24d0000000 mov     dword ptr [rsp+0D0h],r13d
00007ffe`6d4f7053 e984f5e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x63c (00007ffe`6d3765dc)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1843
00007ffe`6d3765ec 44896c2430      mov     dword ptr [rsp+30h],r13d
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1845
00007ffe`6d3765f1 48837c245000    cmp     qword ptr [rsp+50h],0
00007ffe`6d3765f7 0f84e90b1800    je      coreclr!VirtualCallStubManager::ResolveWorker+0x181246 (00007ffe`6d4f71e6)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1847
00007ffe`6d3765fd 837c244000      cmp     dword ptr [rsp+40h],0
00007ffe`6d376602 0f84f7050000    je      coreclr!VirtualCallStubManager::ResolveWorker+0xc5f (00007ffe`6d376bff)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1851
00007ffe`6d376d35 448ba42440050000 mov     r12d,dword ptr [rsp+540h]
00007ffe`6d376d3d e964fdffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xb06 (00007ffe`6d376aa6)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1856
00007ffe`6d376619 488d05b8464800  lea     rax,[coreclr!ResolveEntry::`vftable' (00007ffe`6d7facd8)]
00007ffe`6d376620 48898424f0010000 mov     qword ptr [rsp+1F0h],rax
00007ffe`6d376628 4c89ac24f8010000 mov     qword ptr [rsp+1F8h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1857
00007ffe`6d376630 488d8424f0010000 lea     rax,[rsp+1F0h]
00007ffe`6d376638 48898424e8020000 mov     qword ptr [rsp+2E8h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1876
00007ffe`6d3770df 488b8c2408010000 mov     rcx,qword ptr [rsp+108h]
00007ffe`6d3770e7 4885c9          test    rcx,rcx
00007ffe`6d3770ea 0f84b9f5ffff    je      coreclr!VirtualCallStubManager::ResolveWorker+0x709 (00007ffe`6d3766a9)
00007ffe`6d3770f0 ba08000000      mov     edx,8
00007ffe`6d3770f5 e8e2ab1400      call    coreclr!operator delete (00007ffe`6d4c1cdc)
00007ffe`6d3770fa e9aaf5ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x709 (00007ffe`6d3766a9)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1879
00007ffe`6d376712 488d8c24b0020000 lea     rcx,[rsp+2B0h]
00007ffe`6d37671a e861e91000      call    coreclr!Prober::Find (00007ffe`6d485080)
00007ffe`6d37671f 488bf8          mov     rdi,rax
00007ffe`6d376722 48898424e0030000 mov     qword ptr [rsp+3E0h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1880
00007ffe`6d37672a 4885c0          test    rax,rax
00007ffe`6d37672d 0f841b050000    je      coreclr!VirtualCallStubManager::ResolveWorker+0xcae (00007ffe`6d376c4e)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1882
00007ffe`6d376c4e 4d8bcc          mov     r9,r12
00007ffe`6d376c51 488d1558a41400  lea     rdx,[coreclr!ResolveWorkerChainLookupAsmStub (00007ffe`6d4c10b0)]
00007ffe`6d376c58 498bcf          mov     rcx,r15
00007ffe`6d376c5b e8d06a0c00      call    coreclr!VirtualCallStubManager::GenerateResolveStub (00007ffe`6d43d730)
00007ffe`6d376c60 488bf8          mov     rdi,rax
00007ffe`6d376c63 4889842480010000 mov     qword ptr [rsp+180h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1888
00007ffe`6d376c6b 4c8d8424b0020000 lea     r8,[rsp+2B0h]
00007ffe`6d376c73 488bd0          mov     rdx,rax
00007ffe`6d376c76 498b8f30050000  mov     rcx,qword ptr [r15+530h]
00007ffe`6d376c7d e8927a0c00      call    coreclr!BucketTable::Add (00007ffe`6d43e714)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1890
00007ffe`6d376c82 e9b4faffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x79b (00007ffe`6d37673b)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1892
00007ffe`6d376733 4889842480010000 mov     qword ptr [rsp+180h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1895
00007ffe`6d37673b 48897c2448      mov     qword ptr [rsp+48h],rdi
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1905
00007ffe`6d376740 4885ff          test    rdi,rdi
00007ffe`6d376743 0f8462040000    je      coreclr!VirtualCallStubManager::ResolveWorker+0xc0b (00007ffe`6d376bab)
00007ffe`6d376749 448ba42440050000 mov     r12d,dword ptr [rsp+540h]
00007ffe`6d376751 4183fc01        cmp     r12d,1
00007ffe`6d376755 0f8537030000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0xaf2 (00007ffe`6d376a92)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1907
00007ffe`6d37675b 488d05e6294800  lea     rax,[coreclr!ComUtil::detail::DeriveTypeList<mpl::type_list<ComUtil::ItfBase<CLRPrivBinderUtil::CLRPrivResourceBase<ICLRPrivResourcePath> >,mpl::type_list<ICLRPrivResourcePath,mpl::null_type> > >::`vftable' (0000700007ffe`6d376762 4889842400020000 mov     qword ptr [rsp+200h],rax
00007ffe`6d37676a 488d05f7294800  lea     rax,[coreclr!DispatchEntry::`vftable' (00007ffe`6d7f9168)]
00007ffe`6d376771 4889842400020000 mov     qword ptr [rsp+200h],rax
00007ffe`6d376779 4c89ac2408020000 mov     qword ptr [rsp+208h],r13
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1908
00007ffe`6d376781 488d842400020000 lea     rax,[rsp+200h]
00007ffe`6d376789 48898424a8020000 mov     qword ptr [rsp+2A8h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1909
00007ffe`6d376791 837c244400      cmp     dword ptr [rsp+44h],0
00007ffe`6d376796 0f85f3080000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0x10ef (00007ffe`6d37708f)
00007ffe`6d37679c 4c8d8c2470020000 lea     r9,[rsp+270h]
00007ffe`6d3767a4 488b5c2438      mov     rbx,qword ptr [rsp+38h]
00007ffe`6d3767a9 4c8bc3          mov     r8,rbx
00007ffe`6d3767ac 488b942438050000 mov     rdx,qword ptr [rsp+538h]
00007ffe`6d3767b4 498b8f28050000  mov     rcx,qword ptr [r15+528h]
00007ffe`6d3767bb e8bc740c00      call    coreclr!BucketTable::SetUpProber (00007ffe`6d43dc7c)
00007ffe`6d3767c0 85c0            test    eax,eax
00007ffe`6d3767c2 0f84c7080000    je      coreclr!VirtualCallStubManager::ResolveWorker+0x10ef (00007ffe`6d37708f)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1915
00007ffe`6d3767c8 488d8c2470020000 lea     rcx,[rsp+270h]
00007ffe`6d3767d0 e8abe81000      call    coreclr!Prober::Find (00007ffe`6d485080)
00007ffe`6d3767d5 48898424e8030000 mov     qword ptr [rsp+3E8h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1916
00007ffe`6d3767dd 4885c0          test    rax,rax
00007ffe`6d3767e0 0f85c7080000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0x110d (00007ffe`6d3770ad)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1918
00007ffe`6d3767e6 4c8d7745        lea     r14,[rdi+45h]
00007ffe`6d3767ea 4c89b42410010000 mov     qword ptr [rsp+110h],r14
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1919
00007ffe`6d4f7058 4c8bcb          mov     r9,rbx
00007ffe`6d4f705b 4d8bc6          mov     r8,r14
00007ffe`6d4f705e 498bd5          mov     rdx,r13
00007ffe`6d4f7061 498bcf          mov     rcx,r15
00007ffe`6d4f7064 e85feb1100      call    coreclr!VirtualCallStubManager::GenerateDispatchStubLong (00007ffe`6d615bc8)
00007ffe`6d4f7069 488bf8          mov     rdi,rax
00007ffe`6d4f706c 4889442478      mov     qword ptr [rsp+78h],rax
00007ffe`6d4f7071 4533ed          xor     r13d,r13d
00007ffe`6d4f7074 e9b2f9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xa8b (00007ffe`6d376a2b)
00007ffe`6d4f7079 b801000000      mov     eax,1
00007ffe`6d4f707e e9daf7e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x8bd (00007ffe`6d37685d)
00007ffe`6d4f7083 488b05a6823d00  mov     rax,qword ptr [coreclr!g_CoreClrCallbacks+0x8 (00007ffe`6d8cf330)]
00007ffe`6d4f708a ff15687b2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f7090 4889842490010000 mov     qword ptr [rsp+190h],rax
00007ffe`6d4f7098 f0093c24        lock or dword ptr [rsp],edi
00007ffe`6d4f709c 488b8c2490010000 mov     rcx,qword ptr [rsp+190h]
00007ffe`6d4f70a4 48890d5d823d00  mov     qword ptr [coreclr!g_pExecutionEngine (00007ffe`6d8cf308)],rcx
00007ffe`6d4f70ab 4c8bb42410010000 mov     r14,qword ptr [rsp+110h]
00007ffe`6d4f70b3 4c8bac2418010000 mov     r13,qword ptr [rsp+118h]
00007ffe`6d4f70bb 488bb42498010000 mov     rsi,qword ptr [rsp+198h]
00007ffe`6d4f70c3 488b9c2428010000 mov     rbx,qword ptr [rsp+128h]
00007ffe`6d4f70cb e9aef7e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x8de (00007ffe`6d37687e)
00007ffe`6d4f70d0 ff15227b2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f70d6 90              nop
00007ffe`6d4f70d7 e9c1f7e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x8fd (00007ffe`6d37689d)
00007ffe`6d4f70dc 488b054d823d00  mov     rax,qword ptr [coreclr!g_CoreClrCallbacks+0x8 (00007ffe`6d8cf330)]
00007ffe`6d4f70e3 ff150f7b2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f70e9 48898424a0010000 mov     qword ptr [rsp+1A0h],rax
00007ffe`6d4f70f1 f0830c2400      lock or dword ptr [rsp],0
00007ffe`6d4f70f6 488b8c24a0010000 mov     rcx,qword ptr [rsp+1A0h]
00007ffe`6d4f70fe 48890d03823d00  mov     qword ptr [coreclr!g_pExecutionEngine (00007ffe`6d8cf308)],rcx
00007ffe`6d4f7105 4c8bb42410010000 mov     r14,qword ptr [rsp+110h]
00007ffe`6d4f710d 4c8bac2418010000 mov     r13,qword ptr [rsp+118h]
00007ffe`6d4f7115 4c8bbc2420010000 mov     r15,qword ptr [rsp+120h]
00007ffe`6d4f711d 488b9c2428010000 mov     rbx,qword ptr [rsp+128h]
00007ffe`6d4f7125 e9ccf7e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x956 (00007ffe`6d3768f6)
00007ffe`6d4f712a ff15c87a2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f7130 90              nop
00007ffe`6d4f7131 e9dff7e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x975 (00007ffe`6d376915)
00007ffe`6d4f7136 33c0            xor     eax,eax
00007ffe`6d4f7138 e91af8e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x9b7 (00007ffe`6d376957)
00007ffe`6d4f713d 4c8bbc2420050000 mov     r15,qword ptr [rsp+520h]
00007ffe`6d4f7145 41c7871005000001000000 mov dword ptr [r15+510h],1
00007ffe`6d4f7150 4c8b4c2438      mov     r9,qword ptr [rsp+38h]
00007ffe`6d4f7155 4d8bc6          mov     r8,r14
00007ffe`6d4f7158 498bd5          mov     rdx,r13
00007ffe`6d4f715b 498bcf          mov     rcx,r15
00007ffe`6d4f715e e88de91100      call    coreclr!VirtualCallStubManager::GenerateDispatchStub (00007ffe`6d615af0)
00007ffe`6d4f7163 488bf8          mov     rdi,rax
00007ffe`6d4f7166 4889442478      mov     qword ptr [rsp+78h],rax
00007ffe`6d4f716b 4533ed          xor     r13d,r13d
00007ffe`6d4f716e e9b8f8e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xa8b (00007ffe`6d376a2b)
00007ffe`6d4f7173 803a0f          cmp     byte ptr [rdx],0Fh
00007ffe`6d4f7176 7507            jne     coreclr!VirtualCallStubManager::ResolveWorker+0x1811df (00007ffe`6d4f717f)
00007ffe`6d4f7178 b801000000      mov     eax,1
00007ffe`6d4f717d eb03            jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x1811e2 (00007ffe`6d4f7182)
00007ffe`6d4f717f 418bc5          mov     eax,r13d
00007ffe`6d4f7182 85c0            test    eax,eax
00007ffe`6d4f7184 740c            je      coreclr!VirtualCallStubManager::ResolveWorker+0x1811f2 (00007ffe`6d4f7192)
00007ffe`6d4f7186 418bc5          mov     eax,r13d
00007ffe`6d4f7189 89842490000000  mov     dword ptr [rsp+90h],eax
00007ffe`6d4f7190 eb0c            jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x1811fe (00007ffe`6d4f719e)
00007ffe`6d4f7192 b801000000      mov     eax,1
00007ffe`6d4f7197 89842490000000  mov     dword ptr [rsp+90h],eax
00007ffe`6d4f719e 85c0            test    eax,eax
00007ffe`6d4f71a0 7505            jne     coreclr!VirtualCallStubManager::ResolveWorker+0x181207 (00007ffe`6d4f71a7)
00007ffe`6d4f71a2 8d5020          lea     edx,[rax+20h]
00007ffe`6d4f71a5 eb05            jmp     coreclr!VirtualCallStubManager::ResolveWorker+0x18120c (00007ffe`6d4f71ac)
00007ffe`6d4f71a7 ba28000000      mov     edx,28h
00007ffe`6d4f71ac 498b8f90040000  mov     rcx,qword ptr [r15+490h]
00007ffe`6d4f71b3 4881c1e0030000  add     rcx,3E0h
00007ffe`6d4f71ba 48898c2418040000 mov     qword ptr [rsp+418h],rcx
00007ffe`6d4f71c2 488b01          mov     rax,qword ptr [rcx]
00007ffe`6d4f71c5 488bbc24d8000000 mov     rdi,qword ptr [rsp+0D8h]
00007ffe`6d4f71cd 4c8d043a        lea     r8,[rdx+rdi]
00007ffe`6d4f71d1 4d8bcf          mov     r9,r15
00007ffe`6d4f71d4 488bd7          mov     rdx,rdi
00007ffe`6d4f71d7 488b00          mov     rax,qword ptr [rax]
00007ffe`6d4f71da ff15187a2900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f71e0 90              nop
00007ffe`6d4f71e1 e931f8e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xa77 (00007ffe`6d376a17)
00007ffe`6d4f71e6 4c8bbc2420050000 mov     r15,qword ptr [rsp+520h]
00007ffe`6d4f71ee e91bfae7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xc6e (00007ffe`6d376c0e)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1921
00007ffe`6d377144 e944f9ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xaed (00007ffe`6d376a8d)
00007ffe`6d377149 e93ff9ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xaed (00007ffe`6d376a8d)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1923
00007ffe`6d376e54 e934fcffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xaed (00007ffe`6d376a8d)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1925
00007ffe`6d3770ad 488bf8          mov     rdi,rax
00007ffe`6d3770b0 4889442478      mov     qword ptr [rsp+78h],rax
00007ffe`6d3770b5 e9d3f9ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xaed (00007ffe`6d376a8d)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1930
00007ffe`6d376a8d 48897c2448      mov     qword ptr [rsp+48h],rdi
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1935
00007ffe`6d37708f c744243003000000 mov     dword ptr [rsp+30h],3
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1940
00007ffe`6d377097 e9f6f9ffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xaf2 (00007ffe`6d376a92)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1942
00007ffe`6d376bff 4c8bbc2420050000 mov     r15,qword ptr [rsp+520h]
00007ffe`6d376c07 41ff8780050000  inc     dword ptr [r15+580h]
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1948
00007ffe`6d376c0e 448ba42440050000 mov     r12d,dword ptr [rsp+540h]
00007ffe`6d376c16 e977feffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xaf2 (00007ffe`6d376a92)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1963
00007ffe`6d376a9b 837c243000      cmp     dword ptr [rsp+30h],0
00007ffe`6d376aa0 0f8512020000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0xd18 (00007ffe`6d376cb8)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1965
00007ffe`6d376aa6 4183fc02        cmp     r12d,2
00007ffe`6d376aaa 0f84f8010000    je      coreclr!VirtualCallStubManager::ResolveWorker+0xd08 (00007ffe`6d376ca8)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1967
00007ffe`6d376ca8 c744243001000000 mov     dword ptr [rsp+30h],1
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1969
00007ffe`6d376cfd e9c0fdffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xb22 (00007ffe`6d376ac2)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1971
00007ffe`6d376dbe c744243002000000 mov     dword ptr [rsp+30h],2
00007ffe`6d376dc6 e9e5feffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xd10 (00007ffe`6d376cb0)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1986
00007ffe`6d376cb0 4c8bbc2420050000 mov     r15,qword ptr [rsp+520h]
00007ffe`6d376cb8 498bcf          mov     rcx,r15
00007ffe`6d376cbb 48898c24b8010000 mov     qword ptr [rsp+1B8h],rcx
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1991
00007ffe`6d376cc3 83bc249400000000 cmp     dword ptr [rsp+94h],0
00007ffe`6d376ccb 0f8507030000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0x1038 (00007ffe`6d376fd8)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1994
00007ffe`6d376fd8 488b8c2480000000 mov     rcx,qword ptr [rsp+80h]
00007ffe`6d376fe0 48898c24b8010000 mov     qword ptr [rsp+1B8h],rcx
00007ffe`6d376fe8 e9e4fcffff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xd31 (00007ffe`6d376cd1)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:1998
00007ffe`6d376cd1 4c8b4c2450      mov     r9,qword ptr [rsp+50h]
00007ffe`6d376cd6 4c8b842438050000 mov     r8,qword ptr [rsp+538h]
00007ffe`6d376cde 488b542438      mov     rdx,qword ptr [rsp+38h]
00007ffe`6d376ce3 e87c6d0c00      call    coreclr!VirtualCallStubManager::GetResolveCacheElem (00007ffe`6d43da64)
00007ffe`6d376ce8 4889842428040000 mov     qword ptr [rsp+428h],rax
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2006
00007ffe`6d376cf0 448b442430      mov     r8d,dword ptr [rsp+30h]
00007ffe`6d376cf5 488bd0          mov     rdx,rax
00007ffe`6d376cf8 e89f6b0c00      call    coreclr!DispatchCache::Insert (00007ffe`6d43d89c)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2059
00007ffe`6d376ac2 4183fc01        cmp     r12d,1
00007ffe`6d376ac6 0f85df000000    jne     coreclr!VirtualCallStubManager::ResolveWorker+0xc0b (00007ffe`6d376bab)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2061
00007ffe`6d4f71f3 ff15ff792900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f71f9 90              nop
00007ffe`6d4f71fa e925f9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xb84 (00007ffe`6d376b24)
00007ffe`6d4f71ff ff15f3792900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f7205 90              nop
00007ffe`6d4f7206 e94ef9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xbb9 (00007ffe`6d376b59)
00007ffe`6d4f720b ff15e7792900    call    qword ptr [coreclr!__guard_dispatch_icall_fptr (00007ffe`6d78ebf8)]
00007ffe`6d4f7211 90              nop
00007ffe`6d4f7212 e970f9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xbe7 (00007ffe`6d376b87)
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2065
00007ffe`6d4f7217 e8c05df1ff      call    coreclr!UnwindFrameChain (00007ffe`6d40cfdc)
00007ffe`6d4f721c 488b8c24c0000000 mov     rcx,qword ptr [rsp+0C0h]
00007ffe`6d4f7224 e9acf9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xc35 (00007ffe`6d376bd5)
00007ffe`6d4f7229 488b8424c0000000 mov     rax,qword ptr [rsp+0C0h]
00007ffe`6d4f7231 85d2            test    edx,edx
00007ffe`6d4f7233 742c            je      coreclr!VirtualCallStubManager::ResolveWorker+0x1812c1 (00007ffe`6d4f7261)
00007ffe`6d4f7235 48898424c0010000 mov     qword ptr [rsp+1C0h],rax
00007ffe`6d4f723d c7410c01000000  mov     dword ptr [rcx+0Ch],1
00007ffe`6d4f7244 8b05ee233e00    mov     eax,dword ptr [coreclr!g_TrapReturningThreads (00007ffe`6d8d9638)]
00007ffe`6d4f724a 85c0            test    eax,eax
00007ffe`6d4f724c 740e            je      coreclr!VirtualCallStubManager::ResolveWorker+0x1812bc (00007ffe`6d4f725c)
00007ffe`6d4f724e 488b8c24c0010000 mov     rcx,qword ptr [rsp+1C0h]
00007ffe`6d4f7256 e8f139f7ff      call    coreclr!Thread::RareDisablePreemptiveGC (00007ffe`6d46ac4c)
00007ffe`6d4f725b 90              nop
00007ffe`6d4f725c e986f9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xc47 (00007ffe`6d376be7)
00007ffe`6d4f7261 48898424c8010000 mov     qword ptr [rsp+1C8h],rax
00007ffe`6d4f7269 4489690c        mov     dword ptr [rcx+0Ch],r13d
00007ffe`6d4f726d 8b4808          mov     ecx,dword ptr [rax+8]
00007ffe`6d4f7270 f6c11f          test    cl,1Fh
00007ffe`6d4f7273 740e            je      coreclr!VirtualCallStubManager::ResolveWorker+0x1812e3 (00007ffe`6d4f7283)
00007ffe`6d4f7275 488b8c24c8010000 mov     rcx,qword ptr [rsp+1C8h]
00007ffe`6d4f727d e81a43fcff      call    coreclr!Thread::RareEnablePreemptiveGC (00007ffe`6d4bb59c)
00007ffe`6d4f7282 90              nop
00007ffe`6d4f7283 e95ff9e7ff      jmp     coreclr!VirtualCallStubManager::ResolveWorker+0xc47 (00007ffe`6d376be7)
00007ffe`6d4f7288 cc              int     3
00007ffe`6d4f7289 cc              int     3
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2068
00007ffe`6d4d17eb 85db            test    ebx,ebx
00007ffe`6d4d17ed 7412            je      coreclr!`VirtualCallStubManager::ResolveWorker'::`1'::catch$30+0x9e (00007ffe`6d4d1801)
00007ffe`6d4d17ef 488bcf          mov     rcx,rdi
00007ffe`6d4d17f2 e815adfeff      call    coreclr!Exception::Delete (00007ffe`6d4bc50c)
00007ffe`6d4d17f7 c7852802000000000000 mov dword ptr [rbp+228h],0
00007ffe`6d4d1801 488d8d48020000  lea     rcx,[rbp+248h]
00007ffe`6d4d1808 e8afbce9ff      call    coreclr!CLRException::~CLRException (00007ffe`6d36d4bc)
00007ffe`6d4d180d 90              nop
00007ffe`6d4d180e 488d053e59eaff  lea     rax,[coreclr!VirtualCallStubManager::ResolveWorker+0x11b3 (00007ffe`6d377153)]
00007ffe`6d4d1815 4883c430        add     rsp,30h
00007ffe`6d4d1819 5f              pop     rdi
00007ffe`6d4d181a 5d              pop     rbp
00007ffe`6d4d181b 5b              pop     rbx
00007ffe`6d4d181c c3              ret
00007ffe`6d4d181d cc              int     3
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2073
00007ffe`6d376be7 488b442450      mov     rax,qword ptr [rsp+50h]
>>> e:\a\_work\104\s\src\vm\virtualcallstub.cpp:2074
00007ffe`6d376bec 4881c4e0040000  add     rsp,4E0h
00007ffe`6d376bf3 415f            pop     r15
00007ffe`6d376bf5 415e            pop     r14
00007ffe`6d376bf7 415d            pop     r13
00007ffe`6d376bf9 415c            pop     r12
00007ffe`6d376bfb 5f              pop     rdi
00007ffe`6d376bfc 5e              pop     rsi
00007ffe`6d376bfd 5b              pop     rbx
00007ffe`6d376bfe c3              ret
0:037> !gcinfo 00007ffe6d37600c
00007ffe6d37600c is not a valid MethodDesc
0:037> !ehinfo 00007ffe6d37600c 
0000000000000000 is not a MethodDesc
0:037> !ip2md 00007ffe6d37600c
Failed to request MethodData, not in JIT code range
janvorli commented 5 years ago

@arekpalinski I am sorry, the addresses that I've used in the windbg commands were from the stack trace of the previous failure. I have thought you still have it open.

In all the commands, the 00007ffe6d37600c needs to be replaced by the value from the IP column of the stack trace from the !clrstack command for the frame of [StubDispatchFrame:xxxxxxxxxxxxxxxxx].

Also, to get the stress log too, the env var COMPlus_StressLog=1 needs to be set too. I've realized later that I have not put it into the list of variables, so I've added a comment, but it probably got hidden in the comments.

Also, please share the output of the !u command as it will be very long.

And one more last thing - can you please add -ehinfo option to the !u command too? So it would be !u -gcinfo -ehinfo address_from_the_frame

Since you will have the stress log enabled, I would also need the dump of the stress log. You'll get it using !dumplog file_path_and_filename_where_to_store_the_log.txt. It will likely take quite a long time during which windbg will seem to be hung. The resulting file is going to be huge, but all I care about is the beginning of it until certain point. You'll see something like Starting scan of Thread 13E320F0 ID = 123 close to the log file start. Search for the third occurence of the same string and then cut off everything after that. Again, it is going to be large, so gist is the best way to share it.