dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[Profiler] Ability to avoid `ICorProfilerCallback::ObjectsAllocatedByClass` callback when `COR_PRF_MONITOR_GC` is set. #108230

Open ww898 opened 5 days ago

ww898 commented 5 days ago

Hi, this is a proposal to speed up profiling when the profiler does not use the `ICorProfilerCallback::ObjectsAllocatedByClass` callback but still wants the other GC callbacks enabled by `COR_PRF_MONITOR_GC`. In that case the `DiagWalkHeap(&AllocByClassHelper, (void *)&context, 0, false)` call can be skipped entirely. To allow this, I propose adding a separate flag to `COR_PRF_HIGH_MONITOR`, for example `COR_PRF_HIGH_MONITOR_GC_SKIP_ALLOCATED_BY_CLASS_STATISTIC`.
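To make the proposal concrete, here is a minimal sketch of the gating logic it implies. This is not the runtime's code: the namespace, constant names, and function name are hypothetical, and the numeric values (in particular the value of the proposed flag) are placeholders chosen only for illustration.

```cpp
#include <cstdint>

namespace sketch {

// Stand-in for COR_PRF_MONITOR_GC in the low event mask (illustrative value).
constexpr std::uint32_t kMonitorGc = 0x80u;

// Stand-in for the proposed high-mask flag
// COR_PRF_HIGH_MONITOR_GC_SKIP_ALLOCATED_BY_CLASS_STATISTIC.
// The flag does not exist today; this value is a placeholder.
constexpr std::uint32_t kHighSkipAllocatedByClass = 0x80000000u;

// Returns true when the heap walk that feeds ObjectsAllocatedByClass
// should still run: GC monitoring is on and the profiler has not opted out.
constexpr bool ShouldWalkHeapForAllocatedByClass(std::uint32_t lowMask,
                                                 std::uint32_t highMask) {
    return (lowMask & kMonitorGc) != 0 &&
           (highMask & kHighSkipAllocatedByClass) == 0;
}

}  // namespace sketch
```

With this shape, existing profilers that only set `COR_PRF_MONITOR_GC` keep the current behavior, and only profilers that explicitly set the new high flag would skip the walk.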

The reason: according to my measurements on .NET 8.0.8 x64, `DiagWalkHeap` sometimes takes 18 to 35 seconds in my scenario with server GC. I measured the time between the `ICorProfilerCallback2::GarbageCollectionStarted` and `ICorProfilerCallback::ObjectsAllocatedByClass` callbacks.

The original code: https://github.com/dotnet/runtime/blob/080fcae7eaa8367abf7900e08ff2e52e3efea5bf/src/coreclr/vm/gcenv.ee.cpp#L813-L822

P.S. Discussion is welcome...

dotnet-policy-service[bot] commented 5 days ago

Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.

ww898 commented 4 days ago

Here are two charts I measured with server GC enabled. I used a completely empty profiler: the callbacks do nothing except minimal logging. The horizontal axis is the GC number. The first chart shows the time in ms from the `GarbageCollectionStarted` callback to the `ObjectsAllocatedByClass` and `GarbageCollectionFinished` callbacks. The second chart shows the time from `ObjectsAllocatedByClass` to `GarbageCollectionFinished`. [image]
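For anyone who wants to reproduce the charts, the three plotted values can be derived from per-GC callback timestamps roughly as follows. This is only a sketch: the `GcCallbackTimes` struct and the helper names are hypothetical, and it assumes the profiler has already recorded when each callback fired, as offsets from a common origin.

```cpp
#include <chrono>

// Hypothetical record of when each callback fired during one GC,
// stored as offsets from an arbitrary common origin.
struct GcCallbackTimes {
    std::chrono::milliseconds started;             // GarbageCollectionStarted
    std::chrono::milliseconds allocated_by_class;  // ObjectsAllocatedByClass
    std::chrono::milliseconds finished;            // GarbageCollectionFinished
};

// First chart, series 1: GarbageCollectionStarted -> ObjectsAllocatedByClass.
constexpr std::chrono::milliseconds StartedToAllocatedByClass(const GcCallbackTimes& t) {
    return t.allocated_by_class - t.started;
}

// First chart, series 2: GarbageCollectionStarted -> GarbageCollectionFinished.
constexpr std::chrono::milliseconds StartedToFinished(const GcCallbackTimes& t) {
    return t.finished - t.started;
}

// Second chart: ObjectsAllocatedByClass -> GarbageCollectionFinished.
constexpr std::chrono::milliseconds AllocatedByClassToFinished(const GcCallbackTimes& t) {
    return t.finished - t.allocated_by_class;
}
```

In a real profiler the timestamps would be taken inside the callbacks themselves, for example with `std::chrono::steady_clock::now()`, and stored per GC number.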