microsoft / dotnet

Application crashes in SerialStream.cs #1080

Open EK2017 opened 5 years ago

EK2017 commented 5 years ago

My application crashes in clr.dll. The crashes are totally random, occurring anywhere from days to months apart. Call stacks from the crash dumps show that the crashes occur in different parts of the code, but several times it crashed in SerialStream.cs BeginReadCore(). After reviewing SerialStream.cs, I noticed that Overlapped, NativeOverlapped and IOCallback are not pinned. I may be missing something, but shouldn't these be pinned, since they are passed to native code?

danmoseley commented 5 years ago

cc @krwq

krwq commented 5 years ago

@EK2017 do you perhaps have a couple of full stack traces/dumps from such crashes? They would help determine whether this is a single issue or multiple issues.

Do you also have an example (minimal repro) that shows the problem?

What type of device are you using? (i.e. USB serial, physical serial, UART)

This presumably reproduced on the full framework. Did you try running it on .NET Core to see if it repros there? (.NET Core also runs on Linux, so if that is an option it would help us determine whether this is scoped to Windows or also affects Linux.)

EK2017 commented 5 years ago

Thanks a lot for getting back to me. I can provide the crash dumps. Do you have a drop box where I can upload them? If not, I can set something up. As for the repro, we are unable to reproduce it in a simulated environment. The crash only happens on our systems in the field. The serial device is UART. We are using Modbus RTU over RS485 to communicate between devices. We are using the full framework with Windows 10 Enterprise LTSB. Our application is written for Windows, so we can't easily try running it on Linux. We have not tried using .NET Core, but I can look into that. Thanks

krwq commented 5 years ago

@EK2017 let's start with just the stack traces from those crash dumps (a crash dump can potentially contain user data, and there are some legal consequences if I decide to download them). Not sure which debugger you use, but it would be useful to get both native and managed stacks for all related threads (for managed stacks you can use sos!CLRStack in windbg, or just the UI if using VS).

Testing on .NET Core would be useful for us (especially since it may take some time to check whether it repros). Although I expect the code to be mostly the same, it would make it easier to exclude other, possibly unrelated issues which we might not have fixed on the full framework (e.g. JIT, GC).

krwq commented 5 years ago

@EK2017 also, could you please answer this question:

What type of device are you using? (i.e. USB serial, physical serial, UART)

If you see the problem with all kinds of devices, please mention that, as it may be useful info. If you're not sure what type it is, you can send a link to where you got it or a picture.

EK2017 commented 5 years ago

Here are the stack traces:

0:022> !clrstack
OS Thread Id: 0xbec (22)
Child SP       IP Call Site
1d36ed84 720c9ab5 [HelperMethodFrame: 1d36ed84] 
1d36edf8 711977ba DomainNeutralILStubClass.IL_STUB_PInvoke(SECURITY_ATTRIBUTES, Boolean, Boolean, System.String)
1d36edfc 7114bb53 [InlinedCallFrame: 1d36edfc] Microsoft.Win32.Win32Native.CreateEvent(SECURITY_ATTRIBUTES, Boolean, Boolean, System.String)
1d36ee5c 7114bb53 System.Threading.EventWaitHandle..ctor(Boolean, System.Threading.EventResetMode, System.String) [f:\dd\ndp\clr\src\BCL\system\threading\eventwaithandle.cs @ 69]
1d36ee7c 7114baf4 System.Threading.ManualResetEvent..ctor(Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\manualresetevent.cs @ 27]
1d36ee80 709f9a8b System.IO.Ports.SerialStream.BeginReadCore(Byte[], Int32, Int32, System.AsyncCallback, System.Object) [f:\dd\NDP\fx\src\sys\system\io\ports\SerialStream.cs @ 1437]
1d36eeac 709f940b System.IO.Ports.SerialStream.Read(Byte[], Int32, Int32, Int32) [f:\dd\NDP\fx\src\sys\system\io\ports\SerialStream.cs @ 1102]
1d36eed0 709f9397 System.IO.Ports.SerialStream.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\sys\system\io\ports\SerialStream.cs @ 1078]
1d36eee8 709f6cee System.IO.Ports.SerialPort.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\sys\system\io\ports\SerialPort.cs @ 890]
1d36ef0c 1aaab3e0 WSMBS.CTxRx.TxRxRTU(Byte[], Int32, Byte[], Int32)
1d36ef48 1aaab07e WSMBS.CTxRx.TxRx(Byte[], Int32, Byte[], Int32)
1d36ef68 1aaaaeea WSMBS.CModbus.ReadRegisters(Byte, UInt16, UInt16, UInt16, Int16[], Int32)
1d36efa0 1aaaada8 WSMBS.WSMBSControl.ReadHoldingRegisters(Byte, UInt16, UInt16, Int16[])
1d36efb8 1aaaab8a ModbusLibrary.ModbusSerialCommunicator.ReadHoldingRegistersAsBytes(Byte, UInt16, UInt16)
1d36f048 1aaa9028 ModbusLibrary.ModbusDecoder.ReadSpecifiedTypes(ModbusLibrary.IModbusCommunicator, Byte, System.Collections.Generic.IEnumerable`1, Boolean, System.Exception ByRef)
1d36f2d8 1aaa4577 ModbusLibrary.ModbusGenericDevice.QueryParameterGroup(System.Collections.Generic.IEnumerable`1, Byte, Boolean)
1d36f3bc 1aaa3cd9 Application.AbstractModbusDevice.QueryModbusParameters()
1d36f444 1aaa383e Application.AbstractModbusDevice.timer_tick()
1d36f454 18e90a5e Application.AbstractModbusDevice.threaded_timer_tick(System.Object)
1d36f46c 710d3e51 System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\timer.cs @ 722]
1d36f470 7116bcd5 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
1d36f4dc 7116bbe6 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
1d36f4f0 710d3d57 System.Threading.TimerQueueTimer.CallCallback() [f:\dd\ndp\clr\src\BCL\system\threading\timer.cs @ 705]
1d36f524 710d3bde System.Threading.TimerQueueTimer.Fire() [f:\dd\ndp\clr\src\BCL\system\threading\timer.cs @ 662]
1d36f564 710d3e8a System.Threading.TimerQueue.FireQueuedTimerCompletion(System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\timer.cs @ 436]
1d36f568 71149063 System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem() [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 1252]
1d36f57c 711487f2 System.Threading.ThreadPoolWorkQueue.Dispatch() [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 820]
1d36f5cc 7114865a System.Threading._ThreadPoolWaitCallback.PerformWaitCallback() [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 1161]
1d36f7f0 720beb16 [DebuggerU2MCatchHandlerFrame: 1d36f7f0] 

0:022> !dumpstack
OS Thread Id: 0xbec (22)
Current frame: clr!ObjHeader::PassiveGetSyncBlock+0x1f
ChildEBP RetAddr  Caller, Callee
1d36ecb4 720c9aec clr!ObjHeader::GetSyncBlock+0x33, calling clr!ObjHeader::PassiveGetSyncBlock
1d36ece0 721f1c7a clr!WKS::GCHeap::Alloc+0x94, calling clr!WKS::CFinalize::RegisterForFinalization
1d36ed00 724045c6 clr!ObjHeader::SetAppDomainIndex+0x66, calling clr!ObjHeader::GetSyncBlock
1d36ed14 7210c546 clr!Object::SetAppDomain+0x26, calling clr!ObjHeader::SetAppDomainIndex
1d36ed24 7210c501 clr!AllocateObject+0xea, calling clr!Object::SetAppDomain
1d36ed60 720bf34f clr!HelperMethodFrame::Push+0x10, calling clr!GetThread
1d36ed68 720c7d95 clr!JIT_New+0x6b, calling clr!AllocateObject
1d36edb4 72179ab7 clr!DestroyAsyncPinningHandle+0x1b, calling clr!HndDestroyHandle
1d36edc8 720c7d65 clr!JIT_New+0x25, calling clr!LazyMachStateCaptureState
1d36edd4 72179b8a clr!FreeNativeOverlapped+0xc3, calling clr!_EH_epilog3
1d36edf0 711977ba (MethodDesc 70e0b3bc +0x66 DomainNeutralILStubClass.IL_STUB_PInvoke(SECURITY_ATTRIBUTES, Boolean, Boolean, System.String)), calling clr!JIT_New
1d36ee00 7111515b (MethodDesc 70ddfc54 +0x5b System.Collections.Concurrent.ConcurrentStack`1[[System.__Canon, mscorlib]].Push(System.__Canon)), calling clr!COMInterlocked::CompareExchangeObject
1d36ee18 711045ee (MethodDesc 70eced28 +0x7e System.Threading.PinnableBufferCache.Free(System.Object)), calling (MethodDesc 70ddfc54 +0 System.Collections.Concurrent.ConcurrentStack`1[[System.__Canon, mscorlib]].Push(System.__Canon))
1d36ee4c 7114bb53 (MethodDesc 70df1d34 +0x53 System.Threading.EventWaitHandle..ctor(Boolean, System.Threading.EventResetMode, System.String)), calling 7103dec8
1d36ee6c 7114baf4 (MethodDesc 70df782c +0x14 System.Threading.ManualResetEvent..ctor(Boolean)), calling (MethodDesc 70df1d34 +0 System.Threading.EventWaitHandle..ctor(Boolean, System.Threading.EventResetMode, System.String))
1d36ee78 709f9a8b (MethodDesc 703cc134 +0x4f System.IO.Ports.SerialStream.BeginReadCore(Byte[], Int32, Int32, System.AsyncCallback, System.Object)), calling 71043534
1d36ee94 709f940b (MethodDesc 703cc0ac +0x6b System.IO.Ports.SerialStream.Read(Byte[], Int32, Int32, Int32)), calling (MethodDesc 703cc134 +0 System.IO.Ports.SerialStream.BeginReadCore(Byte[], Int32, Int32, System.AsyncCallback, System.Object))
1d36eebc 709f9397 (MethodDesc 703cc0a4 +0x23 System.IO.Ports.SerialStream.Read(Byte[], Int32, Int32)), calling (MethodDesc 703cc0ac +0 System.IO.Ports.SerialStream.Read(Byte[], Int32, Int32, Int32))
1d36eed8 709f6cee (MethodDesc 703b019c +0xe6 System.IO.Ports.SerialPort.Read(Byte[], Int32, Int32))
1d36eefc 1aaab3e0 (MethodDesc 1797e628 +0x318 WSMBS.CTxRx.TxRxRTU(Byte[], Int32, Byte[], Int32)), calling 704606fc
1d36ef34 1aaab07e (MethodDesc 1797e61c +0xb6 WSMBS.CTxRx.TxRx(Byte[], Int32, Byte[], Int32)), calling (MethodDesc 1797e628 +0 WSMBS.CTxRx.TxRxRTU(Byte[], Int32, Byte[], Int32))
1d36ef54 1aaaaeea (MethodDesc 1797e6c4 +0x12a WSMBS.CModbus.ReadRegisters(Byte, UInt16, UInt16, UInt16, Int16[], Int32)), calling (MethodDesc 1797e61c +0 WSMBS.CTxRx.TxRx(Byte[], Int32, Byte[], Int32))
1d36ef84 1aaaada8 (MethodDesc 1797df14 +0x28 WSMBS.WSMBSControl.ReadHoldingRegisters(Byte, UInt16, UInt16, Int16[])), calling (MethodDesc 1797e6c4 +0 WSMBS.CModbus.ReadRegisters(Byte, UInt16, UInt16, UInt16, Int16[], Int32))
1d36efa4 1aaaab8a (MethodDesc 1797c2a4 +0xaa ModbusLibrary.ModbusSerialCommunicator.ReadHoldingRegistersAsBytes(Byte, UInt16, UInt16)), calling 1aaaa0a4
1d36f038 1aaa9028 (MethodDesc 19786654 +0x440 ModbusLibrary.ModbusDecoder.ReadSpecifiedTypes(ModbusLibrary.IModbusCommunicator, Byte, System.Collections.Generic.IEnumerable`1<ModbusLibrary.ConfigurationMember>, Boolean, System.Exception ByRef)), calling 1979079a
1d36f2b4 1aaa7d9c (MethodDesc 19786eb8 +0x54 System.Linq.GroupedEnumerable`3[[System.__Canon, mscorlib],[ModbusLibrary.ModbusGenericReadType, ModbusLibrary],[System.__Canon, mscorlib]].GetEnumerator()), calling (MethodDesc 19787d6c +0 System.Linq.Lookup`2[[ModbusLibrary.ModbusGenericReadType, ModbusLibrary],[System.__Canon, mscorlib]].GetEnumerator())
1d36f2c4 1aaa4577 (MethodDesc 1797c8bc +0x29f ModbusLibrary.ModbusGenericDevice.QueryParameterGroup(System.Collections.Generic.IEnumerable`1<System.String>, Byte, Boolean)), calling (MethodDesc 19786654 +0 ModbusLibrary.ModbusDecoder.ReadSpecifiedTypes(ModbusLibrary.IModbusCommunicator, Byte, System.Collections.Generic.IEnumerable`1<ModbusLibrary.ConfigurationMember>, Boolean, System.Exception ByRef))
1d36f3ac 1aaa3cd9 (MethodDesc 15cc04f4 +0xf9 Application.AbstractModbusDevice.QueryModbusParameters()), calling 05029c86
1d36f43c 1aaa383e (MethodDesc 15cc04dc +0x86 Application.AbstractModbusDevice.timer_tick())
1d36f44c 18e90a5e (MethodDesc 15cc04c4 +0x8e Application.AbstractModbusDevice.threaded_timer_tick(System.Object)), calling (MethodDesc 15cc04dc +0 Application.AbstractModbusDevice.timer_tick())
1d36f464 710d3e51 (MethodDesc 70ecf840 +0x31 System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object))
1d36f468 7116bcd5 (MethodDesc 70df1f14 +0xe5 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
1d36f4cc 7116bbe6 (MethodDesc 70df1f08 +0x16 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)), calling (MethodDesc 70df1f14 +0 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
1d36f4e0 710d3d57 (MethodDesc 70de03b0 +0xa7 System.Threading.TimerQueueTimer.CallCallback()), calling (MethodDesc 70df1f08 +0 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
1d36f51c 710d3bde (MethodDesc 70ecf828 +0xbe System.Threading.TimerQueueTimer.Fire()), calling (MethodDesc 70de03b0 +0 System.Threading.TimerQueueTimer.CallCallback())
1d36f55c 710d3e8a (MethodDesc 70ecf900 +0x2a System.Threading.TimerQueue.FireQueuedTimerCompletion(System.Object)), calling (MethodDesc 70ecf828 +0 System.Threading.TimerQueueTimer.Fire())
1d36f560 71149063 (MethodDesc 70de0330 +0x33 System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem())
1d36f574 711487f2 (MethodDesc 70ecf618 +0x192 System.Threading.ThreadPoolWorkQueue.Dispatch()), calling 05037562
1d36f5b4 710d3951 (MethodDesc 70de03c8 +0x21 System.Threading.TimerQueue.AppDomainTimerCallback()), calling (MethodDesc 70ecf8e8 +0 System.Threading.TimerQueue.FireNextTimers())
1d36f5c4 7114865a (MethodDesc 70de0350 +0xa System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()), calling (MethodDesc 70ecf618 +0 System.Threading.ThreadPoolWorkQueue.Dispatch())
1d36f5c8 720beb16 clr!CallDescrWorkerInternal+0x34
1d36f5d4 720c6e84 clr!CallDescrWorkerWithHandler+0x6b, calling clr!CallDescrWorkerInternal
1d36f5e8 720c6e3d clr!CallDescrWorkerWithHandler+0x20, calling clr!_alloca_probe
1d36f628 720c82f4 clr!MethodDescCallSite::CallTargetWorker+0x16a, calling clr!CallDescrWorkerWithHandler
1d36f63c 720c8478 clr!ArgIteratorTemplate<ArgIteratorBase>::ComputeReturnFlags+0x1b, calling clr!MetaSig::GetReturnTypeNormalized
1d36f650 720c8284 clr!MethodDescCallSite::CallTargetWorker+0x87, calling clr!_alloca_probe_16
1d36f680 720c86f1 clr!MethodDescCallSite::MethodDescCallSite+0x50, calling clr!ArgIteratorTemplate<ArgIteratorBase>::ForceSigWalk
1d36f69c 72261273 clr!QueueUserWorkItemManagedCallback+0x23, calling clr!MethodDescCallSite::CallTargetWorker
1d36f71c 7225fdca clr!ManagedThreadBase_DispatchInner+0x71
1d36f730 7225fe34 clr!ManagedThreadBase_DispatchMiddle+0x7e, calling clr!ManagedThreadBase_DispatchInner
1d36f758 72260e76 clr!HillClimbing::Update+0x3a7, calling clr!HillClimbing::GetWaveComponent
1d36f78c 721ebc2b clr!HillClimbing::Update+0x783, calling clr!_ftol2_sse
1d36f7b8 720c1610 clr!Frame::Pop+0x8, calling clr!GetThread
1d36f7d4 7225ff01 clr!ManagedThreadBase_DispatchOuter+0x5b, calling clr!ManagedThreadBase_DispatchMiddle
1d36f804 7699ac89 KERNELBASE!WaitForSingleObjectEx+0x99, calling ntdll!NtWaitForSingleObject
1d36f830 7225ff6f clr!ManagedThreadBase_FullTransitionWithAD+0x2f, calling clr!ManagedThreadBase_DispatchOuter
1d36f854 72261201 clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0x102, calling clr!ManagedThreadBase_FullTransitionWithAD
1d36f878 7226034d clr!CLRSemaphore::Wait+0xc0, calling KERNELBASE!WaitForSingleObjectEx
1d36f884 72260388 clr!CLRSemaphore::Wait+0x172, calling clr!_EH_epilog3
1d36f904 722602dd clr!ThreadpoolMgr::ExecuteWorkRequest+0x4f
1d36f91c 722600b9 clr!ThreadpoolMgr::WorkerThreadStart+0x3d3, calling clr!ThreadpoolMgr::ExecuteWorkRequest
1d36f94c 720c963b clr!EEHeapFree+0x3b, calling kernel32!HeapFreeStub
1d36f984 7210b601 clr!Thread::intermediateThreadProc+0x55
1d36fa14 7210b5e7 clr!Thread::intermediateThreadProc+0x3b, calling clr!_alloca_probe_16
1d36fa28 767f62c4 kernel32!BaseThreadInitThunk+0x24
1d36fa3c 77440609 ntdll!__RtlUserThreadStart+0x2f
1d36fa84 774405d4 ntdll!_RtlUserThreadStart+0x1b, calling ntdll!__RtlUserThreadStart

EK2017 commented 5 years ago

We are using physical serial (COM port)

krwq commented 5 years ago

Thanks @EK2017, could you also show the exception (see https://stackoverflow.com/questions/7304376/windbg-finding-the-actual-unmanaged-exception) and the output of !PrintException?

EK2017 commented 5 years ago

0:022> !printexception
Exception object: 08f91108
Exception type:   System.ExecutionEngineException
Message:          <none>
InnerException:   <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80131506
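
For readers unfamiliar with the value above: an HRESULT packs a failure bit, a facility and a code into 32 bits, and 0x80131506 is the value .NET defines as COR_E_EXECUTIONENGINE, which matches the System.ExecutionEngineException type shown. The following is only an illustrative sketch of that decoding; the helper name `decode_hresult` is made up for this example and is not part of any debugger or .NET API.

```python
# Illustrative helper (hypothetical name): split a 32-bit HRESULT into its bit fields.
def decode_hresult(value: int) -> dict:
    return {
        "failure": bool((value >> 31) & 1),  # severity bit: 1 means failure
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility code
        "code": value & 0xFFFF,              # 16-bit facility-specific code
    }

FACILITY_URT = 0x13  # facility number used by the .NET runtime

fields = decode_hresult(0x80131506)  # value reported by !printexception above
print(fields)
print(fields["facility"] == FACILITY_URT)
```

The facility field comes out as 0x13, i.e. the error was raised by the runtime itself rather than by a Win32 API or a COM component.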
EK2017 commented 5 years ago

0:022> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

*** ERROR: Symbol file could not be found.  Defaulted to export symbols for mbusslave.net.dll - 
Failed to request MethodData, not in JIT code range
MethodDesc:   1797e628
Method Name:  WSMBS.CTxRx.TxRxRTU(Byte[], Int32, Byte[], Int32)
Class:        15b61944
MethodTable:  1797e64c
mdToken:      0600002a
Module:       1797d8d0
IsJitted:     yes
CodeAddr:     1aaab0c8
Transparency: Safe critical
MethodDesc:   1797e61c
Method Name:  WSMBS.CTxRx.TxRx(Byte[], Int32, Byte[], Int32)
Class:        15b61944
MethodTable:  1797e64c
mdToken:      06000029
Module:       1797d8d0
IsJitted:     yes
CodeAddr:     1aaaafc8
Transparency: Safe critical
MethodDesc:   1797e6c4
Method Name:  WSMBS.CModbus.ReadRegisters(Byte, UInt16, UInt16, UInt16, Int16[], Int32)
Class:        15b61998
MethodTable:  1797e730
mdToken:      06000017
Module:       1797d8d0
IsJitted:     yes
CodeAddr:     1aaaadc0
Transparency: Safe critical
MethodDesc:   1797df14
Method Name:  WSMBS.WSMBSControl.ReadHoldingRegisters(Byte, UInt16, UInt16, Int16[])
Class:        15b61320
MethodTable:  1797e2d4
mdToken:      06000047
Module:       1797d8d0
IsJitted:     yes
CodeAddr:     1aaaad80
Transparency: Safe critical
MethodDesc:   1797c2a4
Method Name:  ModbusLibrary.ModbusSerialCommunicator.ReadHoldingRegistersAsBytes(Byte, UInt16, UInt16)
Class:        15cbdb4c
MethodTable:  1797c328
mdToken:      0600015d
Module:       1797a4f4
IsJitted:     yes
CodeAddr:     1aaaaae0
Transparency: Critical
MethodDesc:   19786654
Method Name:  ModbusLibrary.ModbusDecoder.ReadSpecifiedTypes(ModbusLibrary.IModbusCommunicator, Byte, System.Collections.Generic.IEnumerable`1<ModbusLibrary.ConfigurationMember>, Boolean, System.Exception ByRef)
Class:        197a5fd4
MethodTable:  197866ec
mdToken:      06000134
Module:       1797a4f4
IsJitted:     yes
CodeAddr:     1aaa8be8
Transparency: Critical
MethodDesc:   1797c8bc
Method Name:  ModbusLibrary.ModbusGenericDevice.QueryParameterGroup(System.Collections.Generic.IEnumerable`1<System.String>, Byte, Boolean)
Class:        15cbdd80
MethodTable:  1797c924
mdToken:      0600014c
Module:       1797a4f4
IsJitted:     yes
CodeAddr:     1aaa42d8
Transparency: Critical
MethodDesc:   15cc04f4
Method Name:  Application.AbstractModbusDevice.QueryModbusParameters()
Class:        15cb2bc4
MethodTable:  15cc054c
mdToken:      06000266
Module:       0501400c
IsJitted:     yes
CodeAddr:     1aaa3be0
Transparency: Critical
MethodDesc:   15cc04dc
Method Name:  Application.AbstractModbusDevice.timer_tick()
Class:        15cb2bc4
MethodTable:  15cc054c
mdToken:      06000264
Module:       0501400c
IsJitted:     yes
CodeAddr:     1aaa37b8
Transparency: Critical
MethodDesc:   15cc04c4
Method Name:  Application.AbstractModbusDevice.threaded_timer_tick(System.Object)
Class:        15cb2bc4
MethodTable:  15cc054c
mdToken:      06000262
Module:       0501400c
IsJitted:     yes
CodeAddr:     18e909d0
Transparency: Critical
GetUrlPageData2 (WinHttp) failed: 12002.

DUMP_CLASS: 2

DUMP_QUALIFIER: 400

CONTEXT:  (.ecxr)
eax=1483c060 ebx=2ab36fc8 ecx=0152a38c edx=18ae0118 esi=2ab36fc8 edi=00000000
eip=720c9ab5 esp=1d36ecb8 ebp=1d36ed00 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
clr!ObjHeader::PassiveGetSyncBlock+0x1f:
720c9ab5 8b04c8          mov     eax,dword ptr [eax+ecx*8] ds:002b:1f18dcc0=????????
Resetting default scope

FAULTING_IP: 
clr!ObjHeader::PassiveGetSyncBlock+1f
720c9ab5 8b04c8          mov     eax,dword ptr [eax+ecx*8]

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 720c9ab5 (clr!ObjHeader::PassiveGetSyncBlock+0x0000001f)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000001
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 1f18dcc0
Attempt to read from address 1f18dcc0

PROCESS_NAME:  application.exe

ORIGINAL_CAB_PATH:  D:\Support\2968.zip

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_CODE_STR:  c0000005

EXCEPTION_PARAMETER1:  00000000

EXCEPTION_PARAMETER2:  1f18dcc0

FOLLOWUP_IP: 
clr!ObjHeader::PassiveGetSyncBlock+1f
720c9ab5 8b04c8          mov     eax,dword ptr [eax+ecx*8]

READ_ADDRESS:  1f18dcc0 

WATSON_BKT_PROCSTAMP:  5c1be45d

WATSON_BKT_PROCVER:  2014.0.0.0

PROCESS_VER_PRODUCT:  Product

WATSON_BKT_MODULE:  clr.dll

WATSON_BKT_MODSTAMP:  59d413ce

WATSON_BKT_MODOFFSET:  19ab5

WATSON_BKT_MODVER:  4.7.2558.0

MODULE_VER_PRODUCT:  Microsoft® .NET Framework

BUILD_VERSION_STRING:  10.0.14393.0 (rs1_release.160715-1616)

MODLIST_WITH_TSCHKSUM_HASH:  96791a1aab0ae50f210eb10516a7ac41bfba777f

MODLIST_SHA1_HASH:  d565a0793e3e4bc666f80b444a96cfa844e21798

NTGLOBALFLAG:  2000000

PROCESS_BAM_CURRENT_THROTTLED: 0

PROCESS_BAM_PREVIOUS_THROTTLED: 0

APPLICATION_VERIFIER_FLAGS:  0

PRODUCT_TYPE:  1

SUITE_MASK:  272

DUMP_FLAGS:  8000c07

DUMP_TYPE:  3

APPLICATION_VERIFIER_LOADED: 1

MISSING_CLR_SYMBOL: 0

ANALYSIS_SESSION_HOST:  1228

ANALYSIS_SESSION_TIME:  07-16-2019 14:49:50.0893

ANALYSIS_VERSION: 10.0.16299.15 x86fre

MANAGED_CODE: 1

MANAGED_ENGINE_MODULE:  clr

MANAGED_ANALYSIS_PROVIDER:  SOS

MANAGED_THREAD_ID: bec

MANAGED_EXCEPTION_ADDRESS: 8f91108

THREAD_ATTRIBUTES: 
OS_LOCALE:  ENU

ADDITIONAL_DEBUG_TEXT:  SOS.DLL is not loaded for managed code. Analysis might be incomplete

PROBLEM_CLASSES: 

    ID:     [0n301]
    Type:   [@ACCESS_VIOLATION]
    Class:  Addendum
    Scope:  BUCKET_ID
    Name:   Omit
    Data:   Omit
    PID:    [Unspecified]
    TID:    [0xbec]
    Frame:  [0] : clr!ObjHeader::PassiveGetSyncBlock

    ID:     [0n273]
    Type:   [INVALID_POINTER_READ]
    Class:  Primary
    Scope:  DEFAULT_BUCKET_ID (Failure Bucket ID prefix)
            BUCKET_ID
    Name:   Add
    Data:   Omit
    PID:    [Unspecified]
    TID:    [0xbec]
    Frame:  [0] : clr!ObjHeader::PassiveGetSyncBlock

    ID:     [0n92]
    Type:   [AVRF]
    Class:  Addendum
    Scope:  DEFAULT_BUCKET_ID (Failure Bucket ID prefix)
            BUCKET_ID
    Name:   Add
    Data:   Omit
    PID:    [0xb98]
    TID:    [0xbec]
    Frame:  [0] : clr!ObjHeader::PassiveGetSyncBlock

    ID:     [0n239]
    Type:   [NOSOS]
    Class:  Addendum
    Scope:  DEFAULT_BUCKET_ID (Failure Bucket ID prefix)
            BUCKET_ID
    Name:   Add
    Data:   Omit
    PID:    [Unspecified]
    TID:    [Unspecified]
    Frame:  [0]

BUGCHECK_STR:  APPLICATION_FAULT_INVALID_POINTER_READ_NOSOS_AVRF

DEFAULT_BUCKET_ID:  INVALID_POINTER_READ_NOSOS_AVRF

PRIMARY_PROBLEM_CLASS:  APPLICATION_FAULT

LAST_CONTROL_TRANSFER:  from 720c9aec to 720c9ab5

STACK_TEXT:  
1d36ecb4 720c9aec 3ce770e0 00000000 2ab36fc8 clr!ObjHeader::PassiveGetSyncBlock+0x1f
1d36ed00 724045c6 2ab36fcc 00000014 712346b4 clr!ObjHeader::GetSyncBlock+0x33
1d36ed14 7210c546 00000001 2ab36fcc 1d36ed68 clr!ObjHeader::SetAppDomainIndex+0x66
1d36ed24 7210c501 088f66b8 712346b4 0e9ddac0 clr!Object::SetAppDomain+0x26
1d36ed68 720c7d95 3ce771d0 00000000 0e9ddac0 clr!AllocateObject+0xea
1d36edf0 711977ba 21d045f0 720bf9f8 1d36f7f0 clr!JIT_New+0x6b
1d36ee4c 7114bb53 00000000 00000000 70df782c mscorlib_ni+0x4777ba
1d36ee6c 7114baf4 00000000 00000001 709f9a8b mscorlib_ni+0x42bb53
1d36ee94 709f940b 00000000 00000000 00000046 mscorlib_ni+0x42baf4
1d36eebc 709f9397 000003e8 00000046 00000019 System_ni+0x6e940b
1d36eed8 709f6cee 00000046 00000019 00000000 System_ni+0x6e9397
1d36eefc 1aaab3e0 00000046 00000019 0952dca4 System_ni+0x6e6cee
WARNING: Frame IP not in any known module. Following frames may be wrong.
1d36ef34 1aaab07e 0000005d 2ab3697c 00000006 0x1aaab3e0
1d36ef54 1aaaaeea 0000005d 2ab3697c 00000006 0x1aaab07e
1d36ef84 1aaaada8 00000000 2ab36900 0000002d 0x1aaaaeea
1d36efa4 1aaaab8a 2ab36900 0000002d 00000bb8 0x1aaaada8
1d36f038 1aaa9028 0000002d 00000bb8 1d36f074 0x1aaaab8a
1d36f2c4 1aaa4577 1d36f38c 00000001 2ab21338 0x1aaa9028
1d36f3ac 1aaa3cd9 00000001 00000005 0952e448 0x1aaa4577
1d36f43c 1aaa383e 090e9134 00000000 1d36f464 0x1aaa3cd9
1d36f44c 18e90a5e 090e9134 00000000 00000000 0x1aaa383e
1d36f464 710d3e51 7116bcd5 09898b08 00000000 0x18e90a5e
1d36f4cc 7116bbe6 00000001 0952ed6c 00000000 mscorlib_ni+0x3b3e51
1d36f4e0 710d3d57 00000001 0952ed6c 3ce76904 mscorlib_ni+0x44bbe6
1d36f51c 710d3bde 00000000 0903608c 0952ed6c mscorlib_ni+0x3b3d57
1d36f55c 710d3e8a 71149063 00000000 00000000 mscorlib_ni+0x3b3bde
1d36f574 711487f2 00000000 00000000 0989894c mscorlib_ni+0x3b3e8a
1d36f5c4 7114865a 720beb16 0e9ddac0 1d36f628 mscorlib_ni+0x4287f2
1d36f5d4 720c6e84 1d36f670 1d36f618 72257020 mscorlib_ni+0x42865a
1d36f628 720c82f4 1d36f6b4 1d36f654 00000004 clr!CallDescrWorkerWithHandler+0x6b
1d36f69c 72261273 00000000 70de0350 71148650 clr!MethodDescCallSite::CallTargetWorker+0x16a
1d36f71c 7225fdca 1d36f937 0e9ddac0 1d36f838 clr!QueueUserWorkItemManagedCallback+0x23
1d36f730 7225fe34 3ce76bf4 1d36f838 00000000 clr!ManagedThreadBase_DispatchInner+0x71
1d36f7d4 7225ff01 3ce76410 00000001 0e9ddac0 clr!ManagedThreadBase_DispatchMiddle+0x7e
1d36f830 7225ff6f 00000001 00000000 00000001 clr!ManagedThreadBase_DispatchOuter+0x5b
1d36f854 72261201 00000001 00000004 3ce76524 clr!ManagedThreadBase_FullTransitionWithAD+0x2f
1d36f904 722602dd 1d36f935 1d36f937 0e9ddac0 clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0x102
1d36f91c 722600b9 3ce76508 7225ff90 00000000 clr!ThreadpoolMgr::ExecuteWorkRequest+0x4f
1d36f984 7210b601 00000000 00000000 00000000 clr!ThreadpoolMgr::WorkerThreadStart+0x3d3
1d36fa28 767f62c4 0fafdff8 767f62a0 6590bf1c clr!Thread::intermediateThreadProc+0x55
1d36fa3c 77440609 0fafdff8 111b5c94 00000000 kernel32!BaseThreadInitThunk+0x24
1d36fa84 774405d4 ffffffff 77462545 00000000 ntdll!__RtlUserThreadStart+0x2f
1d36fa94 00000000 7210b5b0 0fafdff8 00000000 ntdll!_RtlUserThreadStart+0x1b

THREAD_SHA1_HASH_MOD_FUNC:  1dda0f85b1070bfbd77c306e7bca6f4a1afb046a

THREAD_SHA1_HASH_MOD_FUNC_OFFSET:  fd670f3beb2f0a56a0cd3f1b85fc1cdd32ce1ef2

THREAD_SHA1_HASH_MOD:  7c5efdb9c45633ee9ed5814152d695af0918f37b

FAULT_INSTR_CODE:  c3c8048b

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  clr!ObjHeader::PassiveGetSyncBlock+1f

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: clr

IMAGE_NAME:  clr.dll

DEBUG_FLR_IMAGE_TIMESTAMP:  59d413ce

STACK_COMMAND:  ~22s ; .ecxr ; kb

FAILURE_BUCKET_ID:  INVALID_POINTER_READ_NOSOS_AVRF_c0000005_clr.dll!ObjHeader::PassiveGetSyncBlock

BUCKET_ID:  APPLICATION_FAULT_INVALID_POINTER_READ_NOSOS_AVRF_clr!ObjHeader::PassiveGetSyncBlock+1f

FAILURE_EXCEPTION_CODE:  c0000005

FAILURE_IMAGE_NAME:  clr.dll

BUCKET_ID_IMAGE_STR:  clr.dll

FAILURE_MODULE_NAME:  clr

BUCKET_ID_MODULE_STR:  clr

FAILURE_FUNCTION_NAME:  ObjHeader::PassiveGetSyncBlock

BUCKET_ID_FUNCTION_STR:  ObjHeader::PassiveGetSyncBlock

BUCKET_ID_OFFSET:  1f

BUCKET_ID_MODPRIVATE: 1

BUCKET_ID_MODTIMEDATESTAMP:  59d413ce

BUCKET_ID_MODCHECKSUM:  6f500b

BUCKET_ID_MODVER_STR:  4.7.2558.0

BUCKET_ID_PREFIX_STR:  APPLICATION_FAULT_INVALID_POINTER_READ_NOSOS_AVRF_

FAILURE_PROBLEM_CLASS:  APPLICATION_FAULT

FAILURE_SYMBOL_NAME:  clr.dll!ObjHeader::PassiveGetSyncBlock

WATSON_STAGEONE_URL:  http://watson.microsoft.com/StageOne/Application.exe/2014.0.0.0/5c1be45d/clr.dll/4.7.2558.0/59d413ce/c0000005/00019ab5.htm?Retriage=1

TARGET_TIME:  2019-05-29T01:31:32.000Z

OSBUILD:  14393

OSSERVICEPACK:  0

SERVICEPACK_NUMBER: 0

OS_REVISION: 0

OSPLATFORM_TYPE:  x86

OSNAME:  Windows 10

OSEDITION:  Windows 10 WinNt SingleUserTS

USER_LCID:  0

OSBUILD_TIMESTAMP:  2016-07-15 21:33:42

BUILDDATESTAMP_STR:  160715-1616

BUILDLAB_STR:  rs1_release

BUILDOSVER_STR:  10.0.14393.0

ANALYSIS_SESSION_ELAPSED_TIME:  a0a5

ANALYSIS_SOURCE:  UM

FAILURE_ID_HASH_STRING:  um:invalid_pointer_read_nosos_avrf_c0000005_clr.dll!objheader::passivegetsyncblock

FAILURE_ID_HASH:  {06a434b5-8248-29d6-ae64-af6cf7057728}

Followup:     MachineOwner
---------

krwq commented 5 years ago

@EK2017 do you perhaps have a couple of other stacks as well? Could you put similar info from a different dump on gist? (I just want to compare the stacks and see whether they look similar or only part of the stack is similar.) It would be useful to figure out what clr.dll!objheader::passivegetsyncblock was trying to read when it AVed (not sure if you have the symbols needed to get specific info on what exactly was going on just before the crash).

Overall this seems to be happening while freeing native overlapped but interestingly FreeNativeOverlapped frees the pin handle (at least on .NET Core) so presumably it was pinned (unless it was null): https://github.com/dotnet/coreclr/blob/9bd2787a9dd2aa4d2b7d4f72afebc3dbe896e896/src/vm/nativeoverlapped.cpp#L162
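
As a partial answer to what the faulting instruction was reading, the address can be recomputed from the register context already posted in the `!analyze -v` output. This is a small sketch, not a diagnosis: the register values are copied from the CONTEXT (.ecxr) section, and reading `ecx` as an index into a table of 8-byte entries is an assumption based only on the `ecx*8` scaling in the instruction.

```python
# Register values copied from the CONTEXT (.ecxr) section of the posted dump.
eax = 0x1483C060  # base used by the faulting instruction
ecx = 0x0152A38C  # scaled by 8; assumed (not confirmed) to be a table index

# Faulting instruction: mov eax, dword ptr [eax+ecx*8]
# x86 effective addresses wrap at 32 bits.
fault_address = (eax + ecx * 8) & 0xFFFFFFFF

print(hex(fault_address))  # prints 0x1f18dcc0
```

The result equals the READ_ADDRESS (1f18dcc0) reported by the debugger, so the bad read comes directly from the values in `eax` and `ecx`. If `ecx` really is an index, a value this large would be consistent with the object header having been overwritten earlier, but the dump output alone does not prove that.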

EK2017 commented 5 years ago

It may be easier if I send you the dumps so you can look at the objects. They don't have any user information. Please let me know. If not, I'll post more stack traces.

krwq commented 5 years ago

@EK2017 sounds good to me, please share. I'll take a look at them when I have time (might not happen tomorrow).

EK2017 commented 5 years ago

I've sent the instructions to download the dumps to your email. Please let me know if you have trouble downloading. Thanks a lot for looking into this.

krwq commented 5 years ago

Thanks @EK2017! Will take a look later this week. Meantime: @leculver - did you perhaps see similar crashes elsewhere in the framework?

krwq commented 5 years ago

@EK2017 I've looked at the dumps and my initial assessment is that they seem to be two separate JIT issues (3960 seems to be a separate issue; the remaining 3 dumps seem to be the same). I've forwarded the dumps to the JIT team for further investigation. I'll let you know if I have any updates and whether this is something they're planning to address.

Meanwhile it might be worth testing the code with .NET Core to check whether the issues are already fixed there.

krwq commented 5 years ago

@EK2017 we have done some internal analysis of the dumps and here are our findings:

This type of failure looks like something outside of the JIT is corrupting the heap, and the corruption happens to manifest itself as a JIT failure.

Here are a couple of things to try:

Meanwhile, we are also trying to get some more hardware to see if we can repro the issue ourselves.

EK2017 commented 5 years ago

Krzysztof, thanks for looking into it further. I have uploaded 5 more dumps from different machines. I’ve also enabled the server GC and will keep you updated on that. As for the native overlapped structure, I don’t have any more info on this. I just noticed that the objects are not pinned, but I was not sure about it. Have you concluded that the code is correct? We can’t really simplify the software; everything is needed to operate the system. We were unable to repro in a simulated environment. I am struggling to run the software on .NET Core. The 3rd party libraries are written for the full .NET Framework and I only have the dll files. Thanks

krwq commented 5 years ago

@EK2017 I briefly took a look at the native overlapped code and it looked correct to me. I then forwarded it to someone more familiar with it, and they also said it looks correct. Could you point to the specific place in the code where you think we should be pinning? (Just to exclude the possibility that we missed something.)

For the dumps, we won't be able to take a look at them for at least the next week or two.

No worries on the .NET Core repro.

For reference, does your app have any native code or unsafe code sections (or do any of its dependencies, if you are aware of any)? I'm trying to exclude the possibility of anything which is not related to the product itself.

Also, do you have a picture or the model of the specific hardware you are using for serial port communication? (Assuming you can share the info.) There is always a possibility that the hardware driver itself has an issue, and it would be interesting to check whether the issue repros with different hardware.

Out of curiosity, assuming your 3rd party dependencies are public, could you perhaps share which libraries you depend on? (Just wondering if I can help in any way with the effort of moving to .NET Core.)

EK2017 commented 5 years ago

I was thinking IOCallback needs to be pinned in the SerialStream, as it gets passed to native code. One of the libraries that we are using contains unsafe code. I went through the source code and could not find anything wrong with it. Also, the software never crashes in that library. We are using a Minnowboard Turbot, which has a built-in COM port (it's part of the Intel Atom CPU). Some of the libraries are proprietary, so we will have to either buy the source code or ask the vendor to recompile for .NET Core. I'll send you the list shortly if you are curious.

krwq commented 5 years ago

@EK2017 Thanks, I'm specifically curious about the ones which you cannot use on .NET Core