dotnet / diagnostics

This repository contains the source code for various .NET Core runtime diagnostic tools and documents.

Advancing Metrics in .NET #4083

Open noahfalk opened 1 year ago

noahfalk commented 1 year ago

In .NET 6 we added the Meter API, a metrics instrumentation API designed in coordination with OpenTelemetry. The goal was to make it easy to have a great metrics monitoring experience for .NET apps using common tools. Although this is now possible, it isn't yet as easy, widespread, or powerful as we would like. These are my thoughts on next steps. Feedback is welcome if you think the goals should be adjusted, there are issues I've missed, or anything else. Thanks!

Goals

  1. It is easy to set up an ASP.NET Core web app and monitor both platform metrics and custom metrics using (OpenTelemetry or dotnet-monitor) + Grafana. A default Grafana dashboard should be available as a simple starting point.
  2. Using the Meter API in .NET libraries and apps is easy with idiomatic API patterns and documented best practice.
  3. Basic platform level metrics desired by most apps are already instrumented out-of-the-box.
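
For readers who have not used the Meter API mentioned in goal 2, here is a minimal sketch of what instrumenting with it looks like today. The meter name `Contoso.Orders`, the instrument name `orders-placed`, and the `payment` tag are hypothetical and chosen only for illustration. The `MeterListener` stands in for a real collector such as OpenTelemetry or dotnet-monitor so that the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter and instrument names, used only for illustration.
using var meter = new Meter("Contoso.Orders", "1.0.0");
Counter<long> ordersPlaced = meter.CreateCounter<long>(
    "orders-placed", unit: "{orders}", description: "Number of orders placed");

// In-process listener standing in for a real collector
// (OpenTelemetry, dotnet-monitor, dotnet-counters, ...).
long total = 0;
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Subscribe only to instruments from the hypothetical meter above.
    if (instrument.Meter.Name == "Contoso.Orders")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>(
    (instrument, measurement, tags, state) => total += measurement);
listener.Start();

// Record measurements; each call may carry tags (dimensions).
ordersPlaced.Add(1, new KeyValuePair<string, object?>("payment", "card"));
ordersPlaced.Add(2, new KeyValuePair<string, object?>("payment", "cash"));

Console.WriteLine($"orders-placed total: {total}"); // prints "orders-placed total: 3"
```

The listener callback runs synchronously inside `Add`, which is why the total is already available on the last line. In an ASP.NET Core app the open question from this issue is where the `Meter` should live (for example a static field versus something resolved from dependency injection); this sketch does not take a position on that.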

Issues to Address

  1. We have general usage guidance but we need to create usage guidance specific to ASP.NET Core. ASP.NET Core has distinct differences in what code patterns are considered idiomatic. We may find that it is hard to make the existing API appear idiomatic in ASP.NET Core, in which case some judicious use of new APIs may be necessary. (Related: https://github.com/dotnet/runtime/issues/77514)
  2. We need to identify any critical missing metrics and add them, for example a request latency histogram.
  3. Existing instrumentation in .NET has been implemented as EventCounters. We need to define the path forward both for new instrumentation and pre-existing instrumentation. (https://github.com/dotnet/aspnetcore/issues/33387 and https://github.com/dotnet/runtime/issues/79371)
  4. We either need to identify an existing Grafana dashboard we can re-use at https://grafana.com/grafana/dashboards, or create a new one. We should also update our tutorial docs to show how to set it up.
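
To make item 2 concrete, a request latency histogram could be recorded with the existing API roughly as sketched below. This is only an illustration under stated assumptions: the meter name `Contoso.Web`, the instrument name `request-duration`, the `route` tag, and the `HandleRequest` helper are all hypothetical, and they are not the names or shape that ASP.NET Core itself uses.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Threading;

// Hypothetical names; not the instrument ASP.NET Core ships.
using var meter = new Meter("Contoso.Web", "1.0.0");
Histogram<double> requestDuration = meter.CreateHistogram<double>(
    "request-duration", unit: "ms", description: "Time spent handling a request");

// Collect raw measurements in-process. A real collector would
// aggregate them into histogram buckets or percentiles instead.
var recorded = new List<double>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (ReferenceEquals(instrument, requestDuration))
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<double>(
    (instrument, measurement, tags, state) => recorded.Add(measurement));
listener.Start();

// Hypothetical request handler: time the work and record the duration.
void HandleRequest(string route)
{
    var stopwatch = Stopwatch.StartNew();
    Thread.Sleep(10); // placeholder for real request processing
    stopwatch.Stop();
    requestDuration.Record(
        stopwatch.Elapsed.TotalMilliseconds,
        new KeyValuePair<string, object?>("route", route));
}

HandleRequest("/");
HandleRequest("/orders");

Console.WriteLine($"recorded {recorded.Count} measurements"); // prints "recorded 2 measurements"
```

The instrument only emits individual measurements; bucket boundaries and aggregation are decided by whichever tool is listening, which is part of why the default dashboard in item 4 depends on the final metric names and units.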

cc @samsp-msft @davidfowl @tarekgh @reyang

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-meta See info in area-owners.md if you want to be subscribed.

Issue Details
Author: noahfalk
Assignees: -
Labels: `area-Meta`, `untriaged`
Milestone: -

xakep139 commented 1 year ago

Regarding the request latency histogram (item 2 under Issues to Address), see https://github.com/dotnet/aspnetcore/issues/47536 (final naming is in https://github.com/dotnet/aspnetcore/issues/48536).

tommcdon commented 1 year ago

All remaining .NET 8 work is tracked in issues. The one remaining item below can be done out of band from the .NET release, so I am moving this to the dotnet/diagnostics repo:

  1. We either need to identify an existing Grafana dashboard we can re-use at https://grafana.com/grafana/dashboards, or create a new one. We should also update our tutorial docs to show how to set it up.