Closed Havunen closed 5 years ago
Hi, thanks for the context and the analysis. The only lock DryIoc uses is inside the Scope, to ensure single creation of Scoped / Singleton services.
Currently there is one locker object per Scope, so it probably needs to be scaled.
I don't know much about DryIoc, but I solved a similar issue previously by using the .NET built-in Concurrent* types. It looks like ImMap is a tree structure, so I don't know if there is an alternative for it or how to make it concurrent.
It is different. The lock here is used on purpose. We just need to make it less of a bottleneck.
I have been thinking about this issue lately.
Is the lock there because the ImTools AVL tree implementation is not thread-safe? There are many white papers about non-blocking binary search trees, e.g. https://dl.acm.org/citation.cfm?id=1835736
Or is the lock fixing some DryIoc-related issue?
I noticed you have done a benchmark here: https://github.com/dadhi/DryIoc/blob/aff9a3531e3ea10eae9f8e4e764225dea4542a71/playground/Playground/ImMapBenchmarks.cs
However, this benchmark does not consider the way DryIoc uses the given data structure. ConcurrentDictionary would not need a lock for writing / reading. When doing load testing, DryIoc's TryGetOrAdd is the slowest single place in our application.
Hi, thanks for context and for analysis. The only lock DryIoc is using inside the Scope to ensure single creation of Scoped / Singleton services.
Maybe this is the reason why this lock matters so much in a web application. The DryIoc WebAPI extension registers Controllers with WebRequestScope, and then we have a lot of singleton services.
I think a new benchmark is needed which tests IoC containers in a multi-threaded environment. Maybe it could be something like this:
public void Demo()
{
    // Configs
    // INTEL® XEON® W-3275M PROCESSOR, 28 Cores, 56 Threads
    int threadCount = 56;
    int iterations = 10;
    int i = 0;
    Thread[] threads = new Thread[threadCount];

    // Create threads
    for (i = 0; i < threadCount; i++)
    {
        threads[i] = new Thread(new ThreadStart(delegate ()
        {
            for (int j = 0; j < iterations; j++)
            {
                // Benchmark loop
                // Call method which: Open Scope, Resolve scoped service type with large number of singleton dependencies
            }
        }));
    }

    // Start all
    for (i = 0; i < threadCount; i++)
    {
        threads[i].Start();
    }

    // Join all
    for (i = 0; i < threadCount; i++)
    {
        threads[i].Join();
    }
}
What do you think?
The sample code above could also be used to test thread safety. Inside the loop we would add the results to a ConcurrentStack / ConcurrentDictionary and, after the test has finished, verify that the services were resolved correctly. Then we change the thread count to an insanely big number to make sure collisions happen.
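The thread-safety check described above could be sketched roughly as follows. This is only an illustration under assumptions: `CountingFactory` and `ThreadSafetyCheck` are hypothetical names, not DryIoc types, and the `resolve` delegate stands in for "open a scope and resolve the service".

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: counts how many times the "service" was actually constructed.
public sealed class CountingFactory
{
    private int _created;
    public int Created => Volatile.Read(ref _created);

    public object Create()
    {
        Interlocked.Increment(ref _created);
        return new object();
    }
}

public static class ThreadSafetyCheck
{
    // Calls `resolve` from `threadCount` threads, `iterations` times each,
    // and returns every instance that was observed.
    public static ConcurrentBag<T> RunConcurrently<T>(Func<T> resolve, int threadCount, int iterations)
    {
        var results = new ConcurrentBag<T>();
        using var startGate = new ManualResetEventSlim(false);
        var threads = new Thread[threadCount];

        for (var i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                startGate.Wait(); // release all threads together to provoke collisions
                for (var j = 0; j < iterations; j++)
                    results.Add(resolve());
            });
            threads[i].Start();
        }

        startGate.Set();
        foreach (var t in threads)
            t.Join();

        return results;
    }
}
```

For a singleton, the verification step would then be that the returned bag contains `threadCount * iterations` entries, that all of them are the same instance, and that `CountingFactory.Created` is 1.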
Hi @Havunen
Is the lock there because IMTools AVLTree implementation is not thread-safe? There are many white papers about non-blocking binary search trees.
No, ImMap and ImHashMap are the persistent immutable AVL trees implemented without locks.
Or is the lock fixing some dryIoc related issue?
The lock is by design, to prevent the situation when two threads create the same singleton twice.
As far as I know, there are no techniques to prevent this situation without using locks (e.g. without using Monitor in .NET, or a condition variable in general). You may insert a spin-wait to minimize the lock probability, or scale the lock from one to many (I want to try it out here), but you can't avoid locking.
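To illustrate the point about single creation, here is a minimal sketch of a check-lock-recheck cache. It is not DryIoc's Scope: `SingleCreationCache` is a hypothetical name, and a copy-on-write `Dictionary` is swapped in for the persistent ImMap so that reads can stay lock-free.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache, not DryIoc's Scope: each item is created at most once.
public sealed class SingleCreationCache
{
    private readonly object _locker = new object();

    // Treated as an immutable snapshot: it is never mutated after being published.
    private volatile Dictionary<int, object> _items = new Dictionary<int, object>();

    public object GetOrAdd(int id, Func<object> create)
    {
        // Fast path: lock-free read of the current snapshot.
        if (_items.TryGetValue(id, out var existing))
            return existing;

        lock (_locker)
        {
            // Re-check under the lock: another thread may have created the item meanwhile.
            if (_items.TryGetValue(id, out existing))
                return existing;

            var created = create();

            // Copy-on-write, so concurrent readers never see a dictionary being modified.
            _items = new Dictionary<int, object>(_items) { [id] = created };
            return created;
        }
    }
}
```

Without the lock, two threads could both miss on the fast path and both invoke `create`. A plain `ConcurrentDictionary.GetOrAdd` with a value factory has the same property: under contention the factory may run more than once, even though only one result is stored.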
This benchmark however does not consider DryIoc way of using given data structure. ConcurrentDictionary would not need lock for writing / reading.
The benchmark is just for comparing two collections of a different nature; the lock there is only to establish an artificial common ground.
When doing load testing DryIoc's TryGetOrAdd is the slowest single place in our application.
I hope to improve TryGetOrAdd performance:
Reducing the TryGetOrAdd calls for the dependencies. If a dependency is never accessed as a root (via the Resolve method), then we don't need to wrap its creation in a delegate passed to TryGetOrAdd. You need the factory delegate only for the root Singleton.
Combining the disposables and items collections inside the Scope into one.
I think new benchmark is needed which tests IoC containers in multi threaded environment. Maybe it could be something like this:
I would very much appreciate such a benchmark, maybe as a PR?
It could be added to the playground/Playground project. See for example https://github.com/dadhi/DryIoc/blob/master/playground/Playground/RealisticUnitOfWorkBenchmark.cs
Consider fixing it by splitting Scope into a SingletonScope with 16 locks assigned by hash. Scope will continue to use a single lock. In addition, the item storage is also split into 16 buckets in both SingletonScope and Scope.
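The idea of 16 locks assigned by hash, each with its own bucket, can be sketched like this. It is a simplified illustration, not the actual SingletonScope code: `StripedCache` is a hypothetical name, a plain `Dictionary` per bucket stands in for ImMap, and reads take the stripe lock too in order to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical striped cache: 16 independent locks, each guarding its own bucket.
public sealed class StripedCache
{
    private const int StripeCount = 16; // must be a power of two for the mask below

    private readonly object[] _lockers = new object[StripeCount];
    private readonly Dictionary<int, object>[] _buckets = new Dictionary<int, object>[StripeCount];

    public StripedCache()
    {
        for (var i = 0; i < StripeCount; i++)
        {
            _lockers[i] = new object();
            _buckets[i] = new Dictionary<int, object>();
        }
    }

    public object GetOrAdd(int id, Func<object> create)
    {
        // Pick the stripe from the low bits of the id (used here as the hash).
        var stripe = id & (StripeCount - 1);

        lock (_lockers[stripe])
        {
            var bucket = _buckets[stripe];
            if (!bucket.TryGetValue(id, out var item))
                bucket[id] = item = create();
            return item;
        }
    }
}
```

Two threads now contend only when their ids fall into the same stripe, so the creation of unrelated services can proceed in parallel.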
Working on #139 I found a better way to scale the locks: using Ref slots for storing the created value in the ImMap and then locking on the slot itself. So we don't need a separate locker object, and instead scale to a lock per service.
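A rough sketch of the lock-per-slot idea, under assumptions: `SlotCache` and `Slot` are hypothetical names, `ConcurrentDictionary` is used only as a convenient thread-safe map in place of ImMap, and the factory is assumed never to return null.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache: each service id gets its own slot, and the slot doubles as the lock.
public sealed class SlotCache
{
    // Stands in for the "Ref slot" holding the created value.
    private sealed class Slot
    {
        public volatile object Value;
    }

    private readonly ConcurrentDictionary<int, Slot> _slots = new ConcurrentDictionary<int, Slot>();

    public object GetOrAdd(int id, Func<object> create)
    {
        // Creating an empty slot is cheap, so a race here is harmless:
        // every caller gets back the one slot that ended up stored in the map.
        var slot = _slots.GetOrAdd(id, _ => new Slot());

        // Fast path: the value has already been created.
        var value = slot.Value;
        if (value != null)
            return value;

        // Slow path: only threads asking for this same id wait on each other.
        lock (slot)
        {
            if (slot.Value == null)
                slot.Value = create();
            return slot.Value;
        }
    }
}
```

Compared with a fixed number of stripes, contention here is limited to threads that resolve the very same service at the same time, and no separate locker object is needed.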
Btw, have you seen these videos from Federico Lois? He shares some cool optimization techniques for C#.
Yeah, I have seen them. Some of them appeared on Ayende's blog first. I think this came out of the development of RavenDB. I have also chatted with Federico once regarding DryIoc perf :-)
Cool :)
Hey,
We are using DryIoc in a web application, which is a very multi-threaded environment. When doing load testing in our application I noticed DryIoc is locking quite a lot: 19% of threads are blocked.
This report is based on analyzing the w3wp process during high load. I used Windows Task Manager to take a memory dump of the process and analysed it using DebugDiag x64 (a Microsoft tool).
Here are the results:
Thread 30 - System ID 10676
Entry point: clr!Thread::intermediateThreadProc
Create time: 6/12/2019 5:57:28 PM
Time spent in user mode: 0 Days 00:00:03.015
Time spent in kernel mode: 0 Days 00:00:00.765
This thread is not fully resolved and may or may not be a problem. Further analysis of these threads may be required.
The largest lock is here, pointing to DryIoc:
.NET Call Stack
Full Call Stack