dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.92k stars 4.64k forks source link

Initializing RuntimeMetrics #105845

Closed noahfalk closed 1 month ago

noahfalk commented 1 month ago

Description

We recently added RuntimeMetrics, part of our long-running plan to offer all runtime metrics using OpenTelemetry standardized semantic conventions + the new Meter API added in .NET 6. For most metrics it is expected the Meter gets created either because the developer directly invoked new Meter(), or they invoked some higher level API that indirectly creates the Meter on their behalf. However we expect RuntimeMetrics to be available in all .NET apps by default without the developer invoking any explicit API to match the existing behavior available with runtime EventCounters. In particular we expect this repro scenario to work:

Repro

  1. Create a new console app using dotnet new console and modify it so that it doesn't exit immediately like this:
    Console.WriteLine("Hello, World!");
    Console.ReadLine();
  2. dotnet run
  3. In a separate console run dotnet-counters monitor -n <name_of_app>

Expected behavior dotnet-counters should display the new metrics:

Press p to pause, r to resume, q to quit.
    Status: Running

[System.Runtime]
    dotnet.assembly.count ({assembly})                                    14
    dotnet.gc.collections ({collection} / 1 sec)
        gc.heap.generation=gen0                                            0
        gc.heap.generation=gen1                                            0
        gc.heap.generation=gen2                                            0
  ...

Actual behavior dotnet-counters shows the pre-existing EventCounters

Press p to pause, r to resume, q to quit.
    Status: Running

[System.Runtime]
    % Time in GC since last GC (%)                                         0
    Allocation Rate (B / 1 sec)                                            0
    CPU Usage (%)                                                          0
    Exception Count (Count / 1 sec)                                        0
    GC Committed Bytes (MB)                                                0
    GC Fragmentation (%)                                                   0
    GC Heap Size (MB)                                                      0.938
    ...

Cause

In order to create the RuntimeMetrics automatically the current implementation initializes the Meter in response to any MeterListener being created. The premise was that it only matters if the Meter exists when something is capable of listening to it. For dotnet-counters we expect MetricsEventSource to create a MeterListener whenever dotnet-counters starts a new monitoring session. OnEventCommand triggers the lazy creation of the CommandHandler, which creates AggregationManager, which creates MeterListener. Unforetunately we overlooked that MetricsEventSource itself isn't created unless a Meter exists. This made a chicken-egg situation where RuntimeMetrics Meter depends on MetricsEventSource being created first, but MetricsEventSource is depending on some Meter, such as RuntimeMetrics being created first. Many apps would have code that create some other Meter which breaks the cycle and causes everything to get bootstrapped but a simple console app doesn't do that.

Proposed solution

We need some code in the initialization path of .NET Core apps to ensure that MetricsEventSource is created, similar to what we already do for RuntimeEventSource. Since loading MetricsEventSource will cause another assembly to load (System.Diagnostics.DiagnosticSource) this has some startup performance overhead. We planned to mitigate this overhead by checking both EventSource.IsSupported and the AppContext switch System.Diagnostics.Metrics.Meter.IsSupported before doing the load. This would let form factors that are particularly size or startup cost sensitive to opt out of this as they already do for other EventSource/Meter related overhead. For apps that didn't opt-out we would then use reflection to get a delegate for some well-known non-public method in DiagnosticSource.dll, similar to the approach we use for StackTraceSymbols. I see that currently CoreCLR, NativeAOT, and Mono all use slightly different approaches for getting RuntimeEventSource initialized (Mono, NativeAOT, CoreCLR). I assume we'd also do a little bit of refactoring to create some common EventSource initialization method that handles both RuntimeEventSource and MetricsEventSource.

Questions/Feedback

@brianrob @jkotas @MichalStrehovsky @LakshanF @tarekgh

jkotas commented 1 month ago

Does UnsafeAccessor work to load a type in a not yet loaded assembly?

It does not. You have to use regular reflection (Type.GetType).

jkotas commented 1 month ago

known non-public method in DiagnosticSource.dll, similar to the approach we use for StackTraceSymbols.

StackTraceSymbols helper only ever ships with the runtime. It is not OOB. It makes the private method more acceptable - there are no concerns about version mixing and matching.

DiagnosticSource is OOB. You need to worry about versions mixing and matching (e.g. .NET 9 runtime using .NET 10 version of DiagnosticSource). We tend to avoid hidden contracts for OOB. You can certainly do it, but our infrastructure won't detect when you make a mistake and remove the method by accident.

tarekgh commented 1 month ago

Can we use any public type in DiagnosticSource (like Activity) for binding from corelib? and then having DiagnsoticSource uses ModuleInitializer to force initializing the MetricEventSource when loading?

noahfalk commented 1 month ago

I have a PR ready for review to resolve this.

Can we use any public type in DiagnosticSource (like Activity) for binding from corelib? and then having DiagnsoticSource uses ModuleInitializer to force initializing the MetricEventSource when loading?

I didn't go that route because the deferred loading mechanism I created requires that I can get a reference to the EventSource being created. Using a DiagnosticSource ModuleInitializer would get MetricsEventSource created, but wouldn't make the reference to it available. If you think there is a better option I'm still glad to hear it - we should probably move discussion about implementation over to the PR though.