dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Add missing GC and Threads performance counters in Net-6 #97229

Open Tejasri12 opened 9 months ago

Tejasri12 commented 9 months ago

We are migrating our service from .NET Framework to .NET 6 and found that the following performance counters, which we monitor in .NET Framework, are not present in .NET 6:

  1. Gen 0 Promoted Bytes/Sec
  2. Gen 1 Promoted Bytes/Sec
  3. Promoted Finalization-Memory from Gen 0
  4. # of current logical Threads
  5. # of Pinned Objects

We run a tightly secured service, so performance counters are the easiest way to be sure nothing is broken (taking memory dumps is not allowed, and RDP access to the boxes is limited). Historically, we have had cases where counters were critical for resolving incidents (some counters were added only after we hit an issue and missed it because no counters were in place).

Could you please add the above performance counters to .NET 6? Thanks in advance!

cc @tommcdon

ghost commented 9 months ago

Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.

Author: Tejasri12
Assignees: -
Labels: `area-Diagnostics-coreclr`, `untriaged`
Milestone: -

HighPerfDotNet commented 9 months ago

I had the same issue (those counters are not available in good old perfmon). It looks like the current viable solution is to use dotnet-counters:

https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-counters

You'll also have to add support for it in your application, using the Meter class from System.Diagnostics.Metrics.

Details: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/metrics-instrumentation

This does not seem to work in Native AOT builds.

jkotas commented 9 months ago

> This does not seem to work in Native AOT builds.

The runtime diagnostic instrumentation is off by default for native AOT console apps. You can enable it by setting the <EventSourceSupport>true</EventSourceSupport> property in your project file: https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/diagnostics#observability-and-telemetry. (This is not necessary for ASP.NET Core. The runtime diagnostic instrumentation is on by default for ASP.NET Core native AOT apps.)

tommcdon commented 9 months ago

@noahfalk

noahfalk commented 9 months ago

Hi @Tejasri12, thanks for the feedback! We don't generally add features retroactively to versions of .NET we already shipped, but we can preserve this request to add counters to new versions of .NET that are still in development.

If you'd like to write a little bit of code you should be able to add some of these metrics to your own application without waiting on a future .NET version. A quick and dirty example looks like this:

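using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;

class Program
{
    static void Main(string[] args)
    {
        // Instruments are published under this meter name.
        // Note: CreateObservableUpDownCounter was added in .NET 7; on .NET 6 it needs
        // the System.Diagnostics.DiagnosticSource NuGet package, version 7.0 or later.
        Meter m = new Meter("MyCustomMetrics");

        // OS threads in the process (managed and native).
        var threadCounter = m.CreateObservableUpDownCounter<int>("Thread Count", () =>
            Process.GetCurrentProcess().Threads.Count);

        // Pinned objects observed by the most recent GC.
        var pinnedObjectsCounter = m.CreateObservableUpDownCounter<long>("Pinned objects", () =>
            GC.GetGCMemoryInfo().PinnedObjectsCount);

        // go do whatever other work your app will do.
    }
}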

Then in dotnet-counters (or any alternate metric collection tool):

> dotnet-counters monitor -n ConsoleApp5 --counters MyCustomMetrics

Press p to pause, r to resume, q to quit.
    Status: Running

[MyCustomMetrics]
    Pinned objects                                                         2
    Thread Count                                                          12

Promoted bytes aren't as straightforward, unfortunately, but it should be possible if you want to do a little more work. There is an API, PromotedBytes (on GCMemoryInfo), that shows how many bytes were promoted in the last GC of different generations. Calling that API periodically would give you a statistical sample. Alternatively, there is the EventListener API, where you could subscribe to events from each GC and keep track of how many total bytes are being promoted yourself. I think the event that has that data is the GCHeapStats_V2 event. Be aware that using EventListener is going to allocate an in-memory buffer where the events are queued up. That might matter if your app has a tight virtual memory constraint.
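As a rough sketch of the sampling approach, the snippet below reads GCMemoryInfo.PromotedBytes for the most recent GC of each kind and exposes the values through an observable gauge. The class name PromotedBytesSampler, the instrument name, and the "kind" tag are made up for illustration; only GC.GetGCMemoryInfo(GCKind) and PromotedBytes come from the runtime (.NET 5 and later). Because each poll only sees the latest GC of a kind, the result is a sample and not a running total.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

static class PromotedBytesSampler
{
    // GC kinds to sample; Ephemeral covers gen 0 and gen 1 collections.
    private static readonly GCKind[] Kinds =
        { GCKind.Ephemeral, GCKind.FullBlocking, GCKind.Background };

    // Bytes promoted by the most recent GC of the given kind.
    public static long LastPromotedBytes(GCKind kind) =>
        GC.GetGCMemoryInfo(kind).PromotedBytes;

    // One measurement per GC kind, tagged so a collector can tell them apart.
    public static IEnumerable<Measurement<long>> Observe()
    {
        foreach (GCKind kind in Kinds)
        {
            yield return new Measurement<long>(
                LastPromotedBytes(kind),
                new KeyValuePair<string, object?>("kind", kind.ToString()));
        }
    }

    // Hypothetical helper: registers the gauge on a meter owned by the caller.
    public static ObservableGauge<long> Register(Meter meter) =>
        meter.CreateObservableGauge<long>("Last GC promoted bytes", Observe);
}
```

Calling PromotedBytesSampler.Register(new Meter("MyCustomMetrics")) once at startup should make the gauge visible to dotnet-counters under that meter name.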

Popping up a level, I'll also throw out that you may not need a counter monitoring promoted bytes if you have a counter monitoring how many GCs are occurring or GC pauses. Often I find the only reason people care about promoted bytes is because promotion implies higher generation GCs are going to run, and what they really care about is the number of those GCs or how the pause times of those GCs are affecting their application latency. If that is your situation then you might skip the middleman and measure the GC counts and pauses directly.

GC counts: https://learn.microsoft.com/en-us/dotnet/api/system.gc.collectioncount?view=net-5.0
GC pauses: https://learn.microsoft.com/en-us/dotnet/api/system.gc.gettotalpauseduration?view=net-8.0
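A minimal sketch of measuring those two values directly could look like the following. The class name GcMetrics, the instrument names, and the "generation" tag are illustrative assumptions. Note that GC.GetTotalPauseDuration was added in .NET 7, so on .NET 6 only the collection-count part of this sketch would compile.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

static class GcMetrics
{
    // One cumulative collection count per generation (0..GC.MaxGeneration).
    public static IEnumerable<Measurement<int>> ObserveCollectionCounts()
    {
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            yield return new Measurement<int>(
                GC.CollectionCount(gen),
                new KeyValuePair<string, object?>("generation", gen));
        }
    }

    // Total time the process has been paused for GC, in milliseconds (.NET 7+).
    public static double TotalPauseMilliseconds() =>
        GC.GetTotalPauseDuration().TotalMilliseconds;

    // Hypothetical helper: registers both instruments on a caller-owned meter.
    public static void Register(Meter meter)
    {
        meter.CreateObservableCounter<int>("GC count", ObserveCollectionCounts);
        meter.CreateObservableCounter<double>("GC pause time", TotalPauseMilliseconds, unit: "ms");
    }
}
```

Both values only ever grow, which is why the sketch uses observable counters; a collection tool can then derive per-interval rates from the deltas.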

Hope that helps a bit!

tyedulapuram commented 9 months ago

@noahfalk Thank you for the update. Currently, to collect the .NET 6 runtime counters, we consume the event counter values via the EventListener API and publish them to the Geneva monitoring platform and Kusto logs. We are not using the dotnet-counters tool. An EventListener subscribed to the GCHeapStats_V2 event would help with publishing Gen Promoted Bytes/Sec. Thank you.

Regarding using Process.GetCurrentProcess().Threads.Count for the metric # of current logical Threads: Process.GetCurrentProcess().Threads.Count returns the number of operating system threads associated with the current process, including native (unmanaged) OS threads, not just the managed threads owned by the .NET runtime. But the .NET Framework performance counter # of current logical Threads only gives the number of current managed thread objects in the application. Also, I believe the existing performance counter \\Process(myProcess)\Thread Count would match Process.GetCurrentProcess().Threads.Count.
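For illustration, an in-process listener along these lines might accumulate promoted bytes from the runtime's GC events. This is a sketch under assumptions: the class name GcPromotedBytesListener is invented, and the event name prefix "GCHeapStats" and the payload field prefix "TotalPromotedSize" are assumed from the documented GCHeapStats_V2 event layout, so both should be verified against the events your runtime actually emits.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading;

sealed class GcPromotedBytesListener : EventListener
{
    // GC keyword (0x1) of the Microsoft-Windows-DotNETRuntime provider.
    private const EventKeywords GcKeyword = (EventKeywords)0x1;

    private long _totalPromotedBytes;

    // Sum of all TotalPromotedSize* fields over the GC heap stats events seen so far.
    public long TotalPromotedBytes => Interlocked.Read(ref _totalPromotedBytes);

    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name == "Microsoft-Windows-DotNETRuntime")
        {
            EnableEvents(source, EventLevel.Informational, GcKeyword);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        // Assumption: the event surfaces as "GCHeapStats_V2"; a prefix match tolerates other versions.
        if (e.EventName is null || !e.EventName.StartsWith("GCHeapStats", StringComparison.Ordinal)) return;
        if (e.PayloadNames is null || e.Payload is null) return;

        long promoted = 0;
        for (int i = 0; i < e.PayloadNames.Count; i++)
        {
            // Assumed field names: TotalPromotedSize0, TotalPromotedSize1, and so on.
            if (e.PayloadNames[i].StartsWith("TotalPromotedSize", StringComparison.Ordinal))
            {
                promoted += Convert.ToInt64(e.Payload[i]);
            }
        }
        Interlocked.Add(ref _totalPromotedBytes, promoted);
    }
}
```

Reading TotalPromotedBytes on a timer and dividing the delta by the interval would approximate a bytes/sec rate. Events are delivered asynchronously, so the total may lag slightly behind the most recent GC.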

noahfalk commented 9 months ago

> But the .NET Framework performance counter # of current logical Threads only gives the number of current managed thread objects in the application.

Yep, you are correct. If your application runs threads that execute solely in native code then they will be included in Process.GetCurrentProcess().Threads.Count but wouldn't be in that .NET Framework counter. For many apps there is little difference between the two numbers but for some specific workloads the difference could be more significant. Unfortunately for .NET Core there is no perfect equivalent at the moment. Your other alternative would be to look at ThreadPool.ThreadCount (System.Runtime threadpool-thread-count counter) which is a subset of all managed threads.
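To compare the two numbers side by side, a small sketch like this one could publish both as gauges. The class name ThreadMetrics and the instrument names are illustrative assumptions; Process.Threads and ThreadPool.ThreadCount are the APIs discussed above.

```csharp
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Threading;

static class ThreadMetrics
{
    // All OS threads in the process, including ones that only run native code.
    public static int OsThreadCount()
    {
        using Process process = Process.GetCurrentProcess();
        return process.Threads.Count;
    }

    // Thread pool threads only, a subset of all managed threads.
    public static int ThreadPoolThreadCount() => ThreadPool.ThreadCount;

    // Hypothetical helper: registers both gauges on a caller-owned meter.
    public static void Register(Meter meter)
    {
        meter.CreateObservableGauge<int>("OS thread count", OsThreadCount);
        meter.CreateObservableGauge<int>("ThreadPool thread count", ThreadPoolThreadCount);
    }
}
```

Neither value matches the .NET Framework counter exactly: the first overcounts by including native threads, and the second undercounts by excluding managed threads created outside the thread pool.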

> Currently, to collect the .NET 6 runtime counters, we consume the event counter values via the EventListener API and publish them to the Geneva monitoring platform and Kusto logs.

That approach of using EventListener to proxy counters over to Geneva specific APIs (IFx?) still works, but if you are interested I'd no longer consider it the simplest option. It was a stop-gap technique while OpenTelemetry support wasn't yet available. OpenTelemetry now has direct support for exporting data to Geneva if you would like to retire your EventListener adapter. https://eng.ms/docs/products/geneva/collect/instrument/opentelemetrydotnet/otel-metrics has more info about how that works.

Tejasri12 commented 8 months ago

> OpenTelemetry now has direct support for exporting data to Geneva if you would like to retire your EventListener adapter.

We are using OpenTelemetry to collect .NET 6 metrics, but it doesn't support writing to Kusto (Azure Data Explorer). Earlier, with .NET Framework, we were collecting metrics in both Geneva and Kusto. Having metrics in Kusto helped us run queries for performance metric reviews and operations. This is why we use EventListener to collect metrics in Kusto. Also, OpenTelemetry doesn't support collecting metrics from the EventSource "Microsoft-Windows-DotNETRuntime".