Particular / NServiceBus

Build, version, and monitor better microservices with the most powerful service platform for .NET
https://particular.net/nservicebus/
Other
2.07k stars 650 forks source link

Metrics Bag simplifies the recording of pipelilne steps' execution metrics. #7064

Closed saratry closed 3 weeks ago

saratry commented 1 month ago

This PR introduces the MetricsBag to collect and record metrics for a single pipeline execution.

The MetricsBag is instantiated by the factory when the pipeline execution starts. It is inserted in the context bag so that it can be retrieved later during the execution of the specific steps.

The usage of a MetricsBag is beneficial as it allows for the collection of tags even in advanced stages, enhancing the quality of all the metrics in the bag. This approach also eliminates the need for reconstructing data from headers, providing direct access to more structured data.

The usage of a MetricsBag allow to collect tags also in advanced stages to the benefit of all the metrics in the bag. This also prevents the need of reconstructing data from headers, having the opportunity to directly access to more structured data.

danielmarbach commented 4 weeks ago

I have been asked to review this code and that's why I'm chiming in sharing my opinion.

Generally, I have some problems with what is being proposed here, as well as in the base branch. From a more general perspective, I think I can summarize my concerns around the two main points:

I'm aware that some of those "problems" have already been around before the changes have been made.

Let me try to elaborate a bit more.

Terminology that somewhat overlaps with existing framework concepts that make it difficult to understand what's going on.

When we look at the guidance of creating meters and also how ASP.NET Core or the .NET runtime uses and declares those, we can see the usage of IMeterFactory. This PR introduces MetricsFactory which clashes with that terminology somewhat and it is unclear we can't use the built-in types directly and what the tradeoffs are we are considering by adding our own custom abstraction

Potentially unnecessary abstractions while we are still evolving our usage of meters

The existing code base has already tried to somehow "bring together" meters into somehow a cross cutting type of thing and we are expanding on that and adding more abstractions that hint at some generic infrastructure required to be able to deal with meters. I fail to see why this is necessary. When you for example, look at the ASP.NET Core and runtime usage of meters, they are usually grouped together by the feature/function area and every TagList required in a functional area is derived from the data available in that feature/function area. Why are we not making things way more concrete? We already have specific parts of the pipeline that concern specific pluggable parts of what will be executed there. It seems we have embarked on a very technical slicing aspect instead of putting things that belong together where they need to be. For example, the histogram counter for the message handler time can only ever be used inside the invoke handler terminator. Because that's where we already have an already instantiated handler that we are going to execute the handler method of. Any metric that is related to executing the handler method timing should be as closely related to that part of the framework as possible. If we do that, we don't really need a MetricsFactory at all. We can directly inject a "handler metric" class that internally uses the IMeterFactory to then call specific methods on that class. All the related meters counters, histograms are then properly wrapped there and there are specific high-level methods that make it clearly visible of what is actually going on instead of wrapping it down into some meter abstraction.

Furthermore, by relying on the meter factory and that very thin and purposeful abstraction, we can also build in Enabled check that makes some of those method calls internally no-ops when there is no active listener.

Plenty of allocations for something that should be designed to only marginally interfere with the code on the hot path

By wrapping things and stuffing into the context bag we are creating lots of Gen0 garbage for something that was purposefully designed to be low allocation. Furthermore because of the generic nature of the seam we have lots of params overloads that create additional array allocations that we mostly are not using now. I see little reason to carry on TagList when we think about purposeful metrics for a specific context. I would rather want us to focus on a design that puts the necessary things into a tag list when something is needed so that it is visible where the metric is involved, and never have some arbitrary "ambient" context be magically added as cross-cutting concern to a counter. This gets rid of magic that might even backfire and also gets rid of unnecessary boxing.

mauroservienti commented 3 weeks ago

Closing for now. This type of generalization needs more thinking and is not required for the metrics we'll ship in 9.1