show callstacks for induced GCs and large object allocations

Maoni0 commented 2 years ago

these are things that are very practical for perf diagnostics - you'd want to know if there are GCs induced who induced it and often folks would like to know who's allocating large objects.

this is the high level idea (the reason I didn't write down details is I only know the top level idea at this point :)) -

for ETW the IPs in the callstacks are decoded based on rundown events so the idea we run down at the beginning instead of at the end, and then when there are new JIT method events we will add those. and TraceLog can already be used with a real time session.

with EventPipe this may not be straightforward as right now it may not be able to run down at the beginning.

MokoSan commented 2 years ago

Did a SPIKE on this and found the following:

We can make use of the TraceLog.CreateFromTraceEventSession(traceEventSession) method on the current traceEventSession to allow us to use the TraceEvent.CallStack() method.
For the case of Linux / OSX, there isn't functionality to get the TraceLog from the session but it has to be done from an eventpipe data file using: CreateFromEventPipeDataFile(string filePath, string etlxFilePath = null, TraceLogOptions options = null). This might require some more thought in terms of implementation.

Once we get the callstack, we'll need to walk the stack such as:

List<string> cs = new List<string>();
var callstack = data.CallStack();
while (callstack != null)
{
  var module = callstack.CodeAddress.ModuleFile;
  var method = callstack.CodeAddress.FullMethodName;
  if (module != null) 
  {
      string moduleAndMethod = $"{module.Name}!{method}";
      if (!string.IsNullOrWhiteSpace(method))
      {
          cs.Add(moduleAndMethod);
      }
  }

  callstack = callstack.Caller;
}

Symbols must be resolved before walking the stack; this involves initializing the SymbolReader and then calling the LookupSymbolsForModule using the symbol reader for a particular module:

            SymbolReader symbolReader = new SymbolReader(log: TextWriter.Null, SymbolPath.MicrosoftSymbolServerPath); 
            symbolReader.Options = SymbolReaderOptions.CacheOnly;
            symbolReader.SecurityCheck = (pdbPath) => true;

            ....
           For a module:
            traceLog.CodeAddresses.LookupSymbolsForModule(symbolReader, module);

Rundown: Enabling CLR Events to 'catch up' on JIT compiled code in running processes:

session.EnableProvider(ClrRundownTraceEventParser.ProviderGuid, TraceEventLevel.Informational,
            (ulong)(ClrTraceEventParser.Keywords.Jit |          
            ClrTraceEventParser.Keywords.JittedMethodILToNativeMap | 
            ClrTraceEventParser.Keywords.Loader |               
            ClrTraceEventParser.Keywords.StartEnumeration));

Which event will be need to subscribe to for the GC End call stacks?: It wasn't clear to me from the description which event we'll need the callstacks for, I presumed it was the .Clr.MethodJittingStarted event. In the past, I have made use of the Kernel.PerfInfoSample event for the CPU stacks associated with an event for associating events in addition to the Context Switch events for the ThreadTime stacks; seems like the TraceGC event doesn't contain the CallStack() method.

Correct me if I get this wrong but is the idea for each GCEnd event that is associated with Induced GCs / Large Object Allocs, we find the closest callstack and output them using the logic described above?

Maoni0 commented 2 years ago

thanks for doing all this detailed research! :)

@BrianRob would be better at answering those questions than me.

It wasn't clear to me from the description which event we'll need the callstacks for

for induced GCs, it would be the GC/Triggered event which is available with the current events, ie, GC events on informational level.

for large object allocations, it would be the AllocationTick event which means we'd need to collect the verbose level GC events.

brianrob commented 2 years ago

From a quick skim of the details here, I think you have it right. For the stacks involved here, you will want to enable ClrTraceEventParser.Keywords.Stack. This will cause the runtime to emit ClrStack/Walk events which TraceEvent will match up to the events they belong to. You don't need to do anything special to consume them - just call TraceEvent.CallStack() as you would for any other event.

You're right that for native code or precompiled code, you will need to resolve symbols before displaying the stacks. Otherwise, CodeAddress.FullMethodName will only be filled in for jitted methods that we know about.

For non-Windows, you'll need to use EventPipe. I think you can likely use a similiar plan that you've described here, but you'll need to use https://www.nuget.org/packages/Microsoft.Diagnostics.NETCore.Client/ to create the session. The session can then be opened with EventPipeTraceEventSource. I don't think that we have a good way to address call stacks from a live EventPipe session yet - this would be a welcome change to TraceEvent.

Maoni0 / realmon

show callstacks for induced GCs and large object allocations #8