Open kouvel opened 2 years ago
Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.
| | |
|---|---|
| Author: | kouvel |
| Assignees: | - |
| Labels: | `area-System.Threading` |
| Milestone: | 7.0.0 |
What are the hot stacktraces? Would it be better to get rid of this lock in the offending places instead?
From the history I have, it was apparently contention in the type loader, under JIT_GenericHandleClass / JIT_GenericHandle_Framed, probably during startup. I don't have any history on whether an RW lock would benefit over a simpler lock in this case. It may be possible to create a small test case to show the issue and measure.
> JIT_GenericHandleClass / JIT_GenericHandle_Framed

Hmm, the `g_pJitGenericHandleCache` that backs the implementation of these APIs is lock-free for reads, and it does not use SimpleRWLock.
Here's a longer caller stack that I have from before:
- SimpleRWLock::EnterRead
- LoaderAllocator::GetDispatchToken
- VirtualCallStubManager::GetCallStub
- Dictionary::PopulateEntry
- JIT_GenericHandle_Framed
- JIT_GenericHandleClass
SimpleRWLock::EnterRead in that case was using about half as much CPU time as CLRLifoSemaphore::Wait, which seemed high.
If the dispatch token hashtable is hot in some scenarios, we should be able to switch it to a hashtable that is lock-free on the read path.
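To illustrate what "lock-free on the read path" could look like, here is a minimal copy-on-write sketch (all names are illustrative, not CoreCLR's actual code): readers do a single atomic pointer load of an immutable snapshot, while writers copy the current table, modify the copy, and publish it. Old snapshots are intentionally leaked here; real code would need a safe-reclamation scheme (hazard pointers, epochs, or GC).

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>

// Immutable snapshot type; once published, a Table is never mutated.
using Table = std::map<int, int>;

std::atomic<const Table*> g_table{new Table{}};
std::mutex g_writerLock;

// Lock-free read path: no lock, just an acquire load and a lookup.
bool Lookup(int key, int& value) {
    const Table* t = g_table.load(std::memory_order_acquire);
    auto it = t->find(key);
    if (it == t->end()) return false;
    value = it->second;
    return true;
}

// Writers serialize among themselves but never block readers.
void Insert(int key, int value) {
    std::lock_guard<std::mutex> hold(g_writerLock);
    const Table* old = g_table.load(std::memory_order_relaxed);
    Table* copy = new Table(*old);                   // copy-on-write
    (*copy)[key] = value;
    g_table.store(copy, std::memory_order_release);  // publish new snapshot
    // 'old' is leaked here; see the reclamation note above.
}
```

This trades extra work on the (rare) write path for a read path that never touches a lock, which is the shape that helps when reads vastly outnumber writes.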
I believe we only use a SimpleRWLock in those cases when creating or working with a FatDispatchToken, and I removed the need for fat dispatch tokens on 64-bit platforms with PR #65045. My guess is that we probably no longer have this particular problem with this callstack on x64 or Arm64 platforms.
SimpleRWLock seems to be a purely spin-wait-based data structure, with long spin-waits between attempts. In some contended cases the spin-waiting CPU time shows up near the top of methods by exclusive CPU time. It should be possible to make the spin-waits shorter, and perhaps also back off to a full wait in some cases.