Open rchoffardet opened 9 months ago
Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.
Author: | rchoffardet |
---|---|
Assignees: | - |
Labels: | `area-System.Threading`, `tenet-performance`, `untriaged` |
Milestone: | - |
@rchoffardet thank you for the test and reported issue! There is a global lock in the exception stack trace building, maybe that's the source of the contention identified by the attached test. It is something I am planning to get rid of. I'll run the test locally to get full understanding of where the issue is.
Alongside my tests, I noticed that when I retrieve the stack trace, it takes more time and it allocates more, so I naively thought that the stack trace wouldn't be build if not requested. Maybe that's related to the string building and not the stack trace itself.
The repro has a bug. It doesn't zero the totalTimeNs after each run for a specific thread count, so the result just keeps accumulating the time. After fixing it, the results were more or less the same (growing very slightly) for the number of threads upto the # of cores on Windows. On Linux, the results are growing with thread count though, that's worth looking into.
Sorry about that, I "extracted" this gist from a bigger script and made mistakes along the way. It should be fixed now. I might be missing something as I still have growing numbers on Windows / .NET 9.0.0 preview 6. Could you share your fix?
My fix was to set totalTimeNS=0 after the Console.WriteLine, which should effectively have the same effect as your fix. As for the performance, it is interesting though. This is from my AMD Ryzen 5 5600X 6-Core Processor:
2924,37
3271,7
3037,3133333333335
3272,7475
3481,018
3965,5066666666667
And this is from my Intel i9-9900X 10 physical cores CPU :
3764.84
4649.17
5072.29
5374.31
7517.652
6482.285
10037.368571428571
10261.98375
8214.833333333334
9332.818
Clearly, the intel is scaling worse than AMD here, I was testing it on the AMD before
Moving further investigations to 10 since we don't have time left for it in the 9 timeframe and it is not critical.
Description
I don't know if that's expected. Feel free to close the issue if it is.
We identified some thread contention while throwing and catching exceptions on multiple threads. The methodology was to simply throw 10_000 exceptions in a multithreaded context. I was expecting that the CPU time spent per exception would stay relatively stable but it's increasing instead. If you do the same thing with method calls, the CPU time per calls stays flat.
Configuration
Windows 11 (10.0.22631.3007/23H2/2023Update/SunValley3) 12th Gen Intel Core i7-12700H, 1 CPU, 20 logical and 14 physical cores .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
Regression?
I don't know
Data
Code
https://gist.github.com/rchoffardet/228f4bf1892e403c65487fcfe46afe35
Result