dadhi / DryIoc

DryIoc is fast, small, full-featured IoC Container for .NET
MIT License
1.01k stars 122 forks source link

Improve parallelism / reduce thread blocking #137

Closed Havunen closed 5 years ago

Havunen commented 5 years ago

Hey,

We are using DryIoc in web application which is very multi threaded environment. When doing load testing in our application I noticed DryIoc is locking quite a lot. 19% of threads are blocked.

This report is based on analyzing w3wp process during high load. I used windows task manager to take memory dump of the process and analysed it using DebugDiagx64 (Microsoft tool)

Here are the results:

image

Thread 30 - System ID 10676

Entry point clr!Thread::intermediateThreadProc Create time 6/12/2019 5:57:28 PM Time spent in user mode 0 Days 00:00:03.015 Time spent in kernel mode 0 Days 00:00:00.765

This thread is not fully resolved and may or may not be a problem. Further analysis of these threads may be required.

The largest lock is here, pointing to dryioc .Net Call Stack

[[PrestubMethodFrame] (.)] DynamicClass.() 
DryIoc.Scope.TryGetOrAdd(ImTools.ImMap`1, Int32, DryIoc.CreateScopedValue, Int32)+f8 
DryIoc.Scope.GetOrAdd(Int32, DryIoc.CreateScopedValue, Int32)+67 
DynamicClass.(ArrayClosure)+2525 
DryIoc.Scope.TryGetOrAdd(ImTools.ImMap`1, Int32, DryIoc.CreateScopedValue, Int32)+f8 
DryIoc.Scope.GetOrAdd(Int32, DryIoc.CreateScopedValue, Int32)+67 
DynamicClass.(ArrayClosure)+d3e46 
DryIoc.Scope.TryGetOrAdd(ImTools.ImMap`1, Int32, DryIoc.CreateScopedValue, Int32)+f8 
DryIoc.Scope.GetOrAdd(Int32, DryIoc.CreateScopedValue, Int32)+67 
DynamicClass.(ArrayClosure, DryIoc.IResolverContext)+13abe 
DryIoc.Container.DryIoc.IResolver.Resolve(System.Type, System.Object, DryIoc.IfUnresolved, System.Type, DryIoc.Request, System.Object[])+37a 
DryIoc.Container.ResolveAndCacheFactoryDelegate(System.Type, DryIoc.IfUnresolved)+140 
Severa.Rest.API.DryIocControllerActivator.Create(System.Net.Http.HttpRequestMessage, System.Web.Http.Controllers.HttpControllerDescriptor, System.Type)+52 
System.Web.Http.Dispatcher.HttpControllerDispatcher+d__15.MoveNext()+12d 
System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].Start[[System.Web.Http.Dispatcher.HttpControllerDispatcher+d__15, System.Web.Http]](d__15 ByRef)+7a 
System.Web.Http.Dispatcher.HttpControllerDispatcher.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+83 
System.Net.Http.HttpMessageInvoker.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+64 
System.Web.Http.Dispatcher.HttpRoutingDispatcher.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+257 
System.Net.Http.DelegatingHandler.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+31 
DryIoc.WebApi.RegisterRequestMessageHandler.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+29 
System.Net.Http.DelegatingHandler.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+31 
System.Web.Http.HttpServer+d__24.MoveNext()+1a0 
System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].Start[[System.Web.Http.HttpServer+d__24, System.Web.Http]](d__24 ByRef)+7a 
System.Web.Http.HttpServer.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+83 
System.Net.Http.HttpMessageInvoker.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+64 
System.Web.Http.Owin.HttpMessageHandlerAdapter+d__20.MoveNext()+25a 
System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[System.Web.Http.Owin.HttpMessageHandlerAdapter+d__20, System.Web.Http.Owin]](d__20 ByRef)+80 
System.Web.Http.Owin.HttpMessageHandlerAdapter.InvokeCore(Microsoft.Owin.IOwinContext, Microsoft.Owin.IOwinRequest, Microsoft.Owin.IOwinResponse)+7b 
Microsoft.Owin.Security.Infrastructure.AuthenticationMiddleware`1+d__5[[System.__Canon, mscorlib]].MoveNext()+229 
System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Microsoft.Owin.Security.Infrastructure.AuthenticationMiddleware`1+d__5[[System.__Canon, mscorlib]], Microsoft.Owin.Security]](d__5 ByRef)+a4 
Microsoft.Owin.Security.Infrastructure.AuthenticationMiddleware`1[[System.__Canon, mscorlib]].Invoke(Microsoft.Owin.IOwinContext)+9e 
Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage+d__7.MoveNext()+3d 
System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage+d__7, Microsoft.Owin.Host.SystemWeb]](d__7 ByRef)+80 
Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage.RunApp(System.Func`2,System.Threading.Tasks.Task>, System.Collections.Generic.IDictionary`2, System.Threading.Tasks.TaskCompletionSource`1, Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.StageAsyncResult)+80 
Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage.BeginEvent(System.Object, System.EventArgs, System.AsyncCallback, System.Object)+27a 
System.Web.HttpApplication+AsyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()+182 
System.Web.HttpApplication+<>c__DisplayClass285_0.b__0()+25 
System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep)+9b 
System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)+83 
System.Web.HttpApplication+PipelineStepManager.ResumeSteps(System.Exception)+74d 
System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext, System.AsyncCallback)+83 
System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest, System.Web.HttpContext)+1e2 
System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)+441 
System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)+13 
DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)+52 
DomainNeutralILStubClass.IL_STUB_PInvoke(IntPtr, System.Web.RequestNotificationStatus ByRef)+7e 
[[InlinedCallFrame] (System.Web.Hosting.UnsafeIISMethods.MgdIndicateCompletion)] System.Web.Hosting.UnsafeIISMethods.MgdIndicateCompletion(IntPtr, System.Web.RequestNotificationStatusByRef) 
System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)+504 
System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)+13 
DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)+52 
[[ContextTransitionFrame]] 

Full Call Stack

ntdll!NtWaitForAlertByThreadId+14    
ntdll!RtlpWaitOnAddressWithTimeout+43    
ntdll!RtlpWaitOnAddress+b2    
ntdll!RtlpWaitOnCriticalSection+db    
ntdll!RtlpEnterCriticalSectionContended+cc    
ntdll!RtlEnterCriticalSection+40    
clr!CrstBase::SpinEnter+ac    
clr!CrstBase::Enter+10d    
clr!MethodDesc::MakeJitWorker+297    
clr!MethodDesc::DoPrestub+94c    
clr!PreStubWorker+3cc    
clr!ThePreStub+55    
DryIoc.Scope.TryGetOrAdd(ImTools.ImMap`1<System.Object>, Int32, DryIoc.CreateScopedValue, Int32)+f8    
DryIoc.Scope.GetOrAdd(Int32, DryIoc.CreateScopedValue, Int32)+67    
DynamicClass.(ArrayClosure)+2525    
DryIoc.Scope.TryGetOrAdd(ImTools.ImMap`1<System.Object>, Int32, DryIoc.CreateScopedValue, Int32)+f8    
DryIoc.Scope.GetOrAdd(Int32, DryIoc.CreateScopedValue, Int32)+67    
DynamicClass.(ArrayClosure)+d3e46    
DryIoc.Scope.TryGetOrAdd(ImTools.ImMap`1<System.Object>, Int32, DryIoc.CreateScopedValue, Int32)+f8    
DryIoc.Scope.GetOrAdd(Int32, DryIoc.CreateScopedValue, Int32)+67    
DynamicClass.(ArrayClosure, DryIoc.IResolverContext)+13abe    
DryIoc.Container.DryIoc.IResolver.Resolve(System.Type, System.Object, DryIoc.IfUnresolved, System.Type, DryIoc.Request, System.Object[])+37a    
DryIoc.Container.ResolveAndCacheFactoryDelegate(System.Type, DryIoc.IfUnresolved)+140    
Severa.Rest.API.DryIocControllerActivator.Create(System.Net.Http.HttpRequestMessage, System.Web.Http.Controllers.HttpControllerDescriptor, System.Type)+52    
System.Web.Http.Dispatcher.HttpControllerDispatcher+<SendAsync>d__15.MoveNext()+12d    
System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].Start[[System.Web.Http.Dispatcher.HttpControllerDispatcher+<SendAsync>d__15, System.Web.Http]](<SendAsync>d__15 ByRef)+7a    
System.Web.Http.Dispatcher.HttpControllerDispatcher.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+83    
System.Net.Http.HttpMessageInvoker.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+64    
System.Web.Http.Dispatcher.HttpRoutingDispatcher.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+257    
System.Net.Http.DelegatingHandler.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+31    
DryIoc.WebApi.RegisterRequestMessageHandler.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+29    
System.Net.Http.DelegatingHandler.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+31    
System.Web.Http.HttpServer+<SendAsync>d__24.MoveNext()+1a0    
System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].Start[[System.Web.Http.HttpServer+<SendAsync>d__24, System.Web.Http]](<SendAsync>d__24 ByRef)+7a    
System.Web.Http.HttpServer.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+83    
System.Net.Http.HttpMessageInvoker.SendAsync(System.Net.Http.HttpRequestMessage, System.Threading.CancellationToken)+64    
System.Web.Http.Owin.HttpMessageHandlerAdapter+<InvokeCore>d__20.MoveNext()+25a    
System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[System.Web.Http.Owin.HttpMessageHandlerAdapter+<InvokeCore>d__20, System.Web.Http.Owin]](<InvokeCore>d__20 ByRef)+80    
System.Web.Http.Owin.HttpMessageHandlerAdapter.InvokeCore(Microsoft.Owin.IOwinContext, Microsoft.Owin.IOwinRequest, Microsoft.Owin.IOwinResponse)+7b    
Microsoft.Owin.Security.Infrastructure.AuthenticationMiddleware`1+<Invoke>d__5[[System.__Canon, mscorlib]].MoveNext()+229    
System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Microsoft.Owin.Security.Infrastructure.AuthenticationMiddleware`1+<Invoke>d__5[[System.__Canon, mscorlib]], Microsoft.Owin.Security]](<Invoke>d__5<System.__Canon> ByRef)+a4    
Microsoft.Owin.Security.Infrastructure.AuthenticationMiddleware`1[[System.__Canon, mscorlib]].Invoke(Microsoft.Owin.IOwinContext)+9e    
Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage+<RunApp>d__7.MoveNext()+3d    
System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage+<RunApp>d__7, Microsoft.Owin.Host.SystemWeb]](<RunApp>d__7 ByRef)+80    
Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage.RunApp(System.Func`2<System.Collections.Generic.IDictionary`2<System.String,System.Object>,System.Threading.Tasks.Task>, System.Collections.Generic.IDictionary`2<System.String,System.Object>, System.Threading.Tasks.TaskCompletionSource`1<System.Object>, Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.StageAsyncResult)+80    
Microsoft.Owin.Host.SystemWeb.IntegratedPipeline.IntegratedPipelineContextStage.BeginEvent(System.Object, System.EventArgs, System.AsyncCallback, System.Object)+27a    
System.Web.HttpApplication+AsyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()+182    
System.Web.HttpApplication+<>c__DisplayClass285_0.<ExecuteStepImpl>b__0()+25    
System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep)+9b    
System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)+83    
System.Web.HttpApplication+PipelineStepManager.ResumeSteps(System.Exception)+74d    
System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext, System.AsyncCallback)+83    
System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest, System.Web.HttpContext)+1e2    
System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)+441    
System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)+13    
DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)+52    
clr!UMThunkStub+6e    
webengine4!W3_MGD_HANDLER::ProcessNotification+8e    
webengine4!W3_MGD_HANDLER::DoWork+3a5    
webengine4!RequestDoWork+3fc    
webengine4!CMgdEngHttpModule::OnAcquireRequestState+1f    
iiscore!NOTIFICATION_CONTEXT::RequestDoWork+b9    
iiscore!NOTIFICATION_CONTEXT::CallModulesInternal+19c    
iiscore!NOTIFICATION_CONTEXT::CallModules+36    
iiscore!NOTIFICATION_MAIN::DoWork+1cf    
iiscore!W3_CONTEXT_BASE::IndicateCompletion+d6    
webengine4!W3_MGD_HANDLER::IndicateCompletion+5d    
webengine4!MgdIndicateCompletion+22    
DomainNeutralILStubClass.IL_STUB_PInvoke(IntPtr, System.Web.RequestNotificationStatus ByRef)+7e    
System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)+504    
System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)+13    
DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)+52    
clr!UM2MThunk_WrapperHelper+43    
clr!UM2MThunk_Wrapper+60    
clr!Thread::DoADCallBack+13d    
clr!UM2MDoADCallBack+b3    
clr!UMThunkStub+26d    
webengine4!W3_MGD_HANDLER::ProcessNotification+8e    
webengine4!ProcessNotificationCallback+42    
clr!UnManagedPerAppDomainTPCount::DispatchWorkItem+1bc    
clr!ThreadpoolMgr::ExecuteWorkRequest+64    
clr!ThreadpoolMgr::WorkerThreadStart+f5    
clr!Thread::intermediateThreadProc+86    
kernel32!BaseThreadInitThunk+14    
ntdll!RtlUserThreadStart+21 
dadhi commented 5 years ago

Hi, thanks for context and for analysis.The only lock DryIoc is using inside the Scope to ensure single creation of Scoped / Singleton services.

Currently there is one locker object per Scope. So probably it need to be scaled.

Havunen commented 5 years ago

I don't know much about dryIoc, but I solved similar issue previously by using .Net built-in Concurrent* types. Looks like ImMap is tree structure so I don't know if there is any alternative for that or how to make it concurrent.

dadhi commented 5 years ago

It is different. The lock here is used on a purpose. We just need to make it less a bottleneck.

Havunen commented 5 years ago

I have been thinking about this issue lately.

Is the lock there because IMTools AVLTree implementation is not thread safe? There are many white papers about non blocking binary search trees. fe: https://dl.acm.org/citation.cfm?id=1835736

Or is the lock fixing some dryIoc related issue?

I noticed you have done benchmark here: https://github.com/dadhi/DryIoc/blob/aff9a3531e3ea10eae9f8e4e764225dea4542a71/playground/Playground/ImMapBenchmarks.cs

This benchmark however does not consider DryIoc way of using given data structure. ConcurrentDictionary would not need lock for writing / reading. When doing load testing DryIoc's TryGetOrAdd is the slowest single place in our application.

Hi, thanks for context and for analysis.The only lock DryIoc is using inside the Scope to ensure single creation of Scoped / Singleton services.

Maybe this is the reason why this lock matters so much web application. DryIoc WebAPI extension registers Controllers with WebRequestScope, and then we have a lot singleton services.

I think new benchmark is needed which tests IoC containers in multi threaded environment. Maybe it could be something like this:

public void Demo()
{
    // Configs
    // INTEL® XEON® W-3275M PROCESSOR, 28 Cores, 56 Threads
    int threadCount = 56;
    int iterations = 10;
    int i = 0;
    Thread[] threads = new Thread[threadCount];

    // Create threads
    for (i = 0; i < threadCount; i++)
    {
        threads[i] = new Thread(new ThreadStart(delegate ()
        {
            for (int j = 0; j < iterations; j++)
            {
                // Benchmark loop
                // Call method which: Open Scope, Resolve scoped service type with large number of singleton dependencies
            }
        }));
    }

    // Start all
    for (i = 0; i < threadCount; i++)
    {
        threads[i].Start();
    }

    // Join all
    for (i = 0; i < threadCount; i++)
    {
        threads[i].Join();
    }
}

What do you think?

Havunen commented 5 years ago

The sample code above could be also used to test thread safety. Inside the loop we would add results into concurrentStack/concurrentDictonary and after test is finished verify it resolved services correctly. Then we change thread count to insanely big number to make sure collisions happen.

dadhi commented 5 years ago

Hi @Havunen

Is the lock there because IMTools AVLTree implementation is not thread-safe? There are many white papers about non-blocking binary search trees.

No, ImMap and ImHashMap are the persistent immutable AVL trees implemented without locks.

Or is the lock fixing some dryIoc related issue?

The lock is by design to prevent the situation when two threads create the same singleton twice. As far as I know, there are no technics to prevent this situation without using locks (e.g. without using Monitor in .NET or conditional variable generally). You may insert spin-wait to minimize lock probability or scale the lock from the one to many (I wanna try it out here) - but you can't avoid locking.

This benchmark however does not consider DryIoc way of using given data structure. ConcurrentDictionary would not need lock for writing / reading.

Benchmark is just for comparison of two collections of different nature, the lock here is just to establish artificial common ground.

When doing load testing DryIoc's TryGetOrAdd is the slowest single place in our application.

I hope to improve TryGetOrAdd performance:

I think new benchmark is needed which tests IoC containers in multi threaded environment. Maybe it could be something like this:

I would only appreciate such a benchmark, maybe a PR? This could be added to playground/Playground project. See for example https://github.com/dadhi/DryIoc/blob/master/playground/Playground/RealisticUnitOfWorkBenchmark.cs

dadhi commented 5 years ago

Consider its fixing by splitting Scope to SingletonScope with 16 locks assigned by hash. Scope wil proceed to use a single lock. In addition, the item storage is also split into 16 buckets in both SingletonScope and Scope.

dadhi commented 5 years ago

Working on #139 I found a better way to scale locks, using Ref slots for storing the created value in the ImMap and then locking on the slot itself. So we don't need a separate locker object and instead scaling to lock per service.

Havunen commented 5 years ago

Btw, have you seen these videos from Federico Lois. He shares some cool optimization techniques for C#.

  1. https://youtu.be/7GTpwgsmHgU
  2. https://youtu.be/eB_S9lQ4c4I
dadhi commented 5 years ago

Yeah, I have saw. Some if them appeared on Ayende's blog first. I think this was brought by development of RavenDB. I have also chat with Federico once regarding DryIoc perf :-)

Havunen commented 5 years ago

Cool :)