beam-telemetry / telemetry_metrics_prometheus_core

Core Prometheus Telemetry.Metrics Reporter package for telemetry_metrics_prometheus
Apache License 2.0

Summary metric #11

Open wojtekmach opened 4 years ago

wojtekmach commented 4 years ago

I noticed that summary is not yet supported.

bryannaegele commented 4 years ago

Indeed. I started working on it in October but events at work prevented me from finishing. This is at the top of my list.

bodgix commented 4 years ago

Hello @bryannaegele, how are you getting on with this feature? I would like to help if you need a hand.

I started studying the code and think that the Aggregator module would need to be updated to support summaries or perhaps protocols could be used to add polymorphism to the aggregator.

If you could use a hand, do you have any notes or a working branch with the ideas or plan you have for this feature? That could help someone get up to speed and avoid going in a direction you wouldn't like.

bryannaegele commented 4 years ago

Hey @bodgix. I have an idea I was working on locally but haven't pushed it up yet. The biggest holdup is that it will require a large bump in the minimum supported version of Elixir to do it efficiently, because the implementation relies on atomics, which are OTP 21.3+. There was simply no way I could find to efficiently support pValues at scale without relying on atomics or resorting to approximations such as the technique outlined in https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/quantiles.pdf.

I think it's safe at this point to increase the required version, so I'll get it in a reviewable state and pushed up for comment and testing.

cohawk commented 4 years ago

Thanks Bryan - I'd rather have a working version on a recent release for usability, FYI, and then backpedal on backwards compatibility from there. Just IMHO.

davydog187 commented 4 years ago

@bryannaegele I'm curious what minimum supported version this project is targeting. OTP 21.3 is over a year old at this point: http://erlang.org/download/otp_src_21.3.readme

I see that the currently supported version is Elixir 1.6, which just misses this release of OTP 21.3. When does the telemetry project decide to deprecate support for older versions of Elixir?

bryannaegele commented 4 years ago

The main pieces support back to OTP 19. With the release of 23, that will go to 20 (4 versions). The goal of all the major core libraries in the Elixir ecosystem is to support 4 major versions (OTP and Elixir). Supporting more Elixir versions is less difficult in some respects than OTP versions.

With the reporters, especially this one, I'm less concerned with still supporting OTP 20 if we want to support this feature.

A related question for the crowd: to support this as efficiently as possible, I need to use atomics, and that comes with the additional restriction that only integer measurements would be supported. That would fit the vast majority of use cases, but it would be a conscious limitation. Otherwise, we're stuck with ETS and its lower performance and higher memory demands. Is anybody opposed to only supporting integer values?
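
For context on that restriction, here is a minimal sketch of what an atomics-based sum/count pair might look like (illustrative only, not this library's implementation). Because :atomics cells can only hold integers, this is where the integer-only limitation comes from.

```elixir
# Illustrative sketch: a lock-free sum/count pair kept in an atomics array.
# :atomics cells only hold integers, so only integer measurements fit.
ref = :atomics.new(2, signed: true)   # index 1 = count, index 2 = sum

record = fn measurement when is_integer(measurement) ->
  :atomics.add(ref, 1, 1)
  :atomics.add(ref, 2, measurement)
end

record.(250)
record.(380)

%{count: :atomics.get(ref, 1), sum: :atomics.get(ref, 2)}
#=> %{count: 2, sum: 630}
```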

And a progress update - work on this has started but it's taking a while since handling of measurements, aggregating, and storage of the aggregations is quite different than any of the other metric types.

davydog187 commented 4 years ago

All of the summary metrics I can think of wanting would be based on native units. Admittedly, I haven't thought beyond that too deeply.

ghost commented 4 years ago

What outstanding work would need to be done to have this implemented? My understanding is that the currently proposed limitation to integer values is not a problem, because summaries are in most (if not all) cases around numeric values.

haljin commented 3 years ago

@bryannaegele How is this progressing? Do you need any help?

bryannaegele commented 3 years ago

Hey all. I have been hesitant to implement this directly in this library without finding an efficient way to accomplish it and it will likely involve a great deal of work. The limited time I have available is focused on OpenTelemetry and other projects, so I open this up to the community. There are a number of challenges that must be addressed. I'll lay out the problem-space as I understand it and potential solutions. I'll aim to be exhaustive so that y'all can correct me on my misunderstandings 😄

Prometheus Histograms vs Summaries

The manner in which the aggregator handles Histograms works because of their commutative properties. We can aggregate the newly observed values and add them to the previous summary, removing any need to keep the samples around. We know the buckets and tags ahead of time, so this whole process is relatively efficient.
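
As a small illustration of that commutative property (the bucket boundaries below are arbitrary), new observations can be folded into the running bucket counts and the raw samples discarded:

```elixir
# Illustrative only: fold each observation into the matching bucket count.
buckets = [100, 250, 500, :infinity]

observe = fn value, counts ->
  bucket = Enum.find(buckets, fn b -> b == :infinity or value <= b end)
  Map.update(counts, bucket, 1, &(&1 + 1))
end

Enum.reduce([120, 480, 90, 700], %{}, observe)
#=> %{100 => 1, 250 => 1, 500 => 1, :infinity => 1}
```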

The base requirement for a Summary in Prometheus is an aggregation of sum and count for each time series. This is relatively straightforward to implement with the existing code, since you could reuse the implementations for counts and sums under the hood. The trick is that you'd then need to store the aggregations in time buckets and use the prior aggregations based on the time windows provided by the user, e.g. store aggregations per set and, given a 1-minute window, add up the prior aggregations which fall into that window.

On top of that, we'd need a janitor process to clean out aggregates which have expired beyond all specified time periods. It is common practice for users to specify more than one time window for observations, e.g. 1-minute and 5-minute p99s.

All of this is what I would aim for in an initial implementation to satisfy Prometheus' requirements, but it likely does not satisfy what users actually want - pValues.
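
A rough sketch of that time-bucketed sum/count bookkeeping, including the pruning a janitor process would do, could look like this (the module name and bucket size are hypothetical, not part of this library):

```elixir
defmodule WindowedSummary do
  # Hypothetical sketch: keep a sum/count aggregate per 10-second bucket,
  # report a window by summing the buckets inside it, and prune old buckets.
  @bucket_ms 10_000

  def record(buckets, value, now_ms) do
    key = div(now_ms, @bucket_ms)

    Map.update(buckets, key, %{sum: value, count: 1}, fn agg ->
      %{agg | sum: agg.sum + value, count: agg.count + 1}
    end)
  end

  def report(buckets, window_ms, now_ms) do
    oldest = div(now_ms - window_ms, @bucket_ms)

    buckets
    |> Enum.filter(fn {key, _agg} -> key >= oldest end)
    |> Enum.reduce(%{sum: 0, count: 0}, fn {_key, agg}, acc ->
      %{sum: acc.sum + agg.sum, count: acc.count + agg.count}
    end)
  end

  # The "janitor" step: drop buckets older than the longest configured window.
  def prune(buckets, max_window_ms, now_ms) do
    oldest = div(now_ms - max_window_ms, @bucket_ms)

    buckets
    |> Enum.reject(fn {key, _agg} -> key < oldest end)
    |> Map.new()
  end
end
```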

Data and Duplication Across Tags and Time

Each metric + label + label_value set is a distinct time series requiring a separate calculation. I won't go into depth on how to accurately calculate a pValue (there's plenty of literature on that), but, in general, we have to sort all of the values and determine the value within the specified bucket, i.e. p99 requires 100 buckets, p99.9 requires 1000, etc.

Since these are sliding windows and both the number of samples and their values can skew heavily within any given part of the window, you must keep all samples around and merge sort them each time you want to calculate it.

There are definitely some tricks around this to make it somewhat more efficient, but you can likely see how this becomes a very intensive set of operations at scale in the BEAM with a strict and naive implementation and a tag or two involved. Assuming a service with 100,000 RPS, a 5-minute window comes to 30,000,000 samples for a single metric with no tags.
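
To make that cost concrete, a strictly naive pValue calculation over the retained samples looks something like the following (illustrative only; nothing here is from this library). Every report means re-sorting the whole window, per time series.

```elixir
defmodule NaiveQuantile do
  # Illustrative, strictly naive: sort every retained sample in the window
  # and index into it. At 30,000,000 samples per series this gets expensive.
  def value(samples, quantile) when quantile > 0 and quantile <= 1 do
    sorted = Enum.sort(samples)
    index = max(ceil(quantile * length(sorted)) - 1, 0)
    Enum.at(sorted, index)
  end
end

NaiveQuantile.value([120, 480, 90, 700, 310], 0.99)
#=> 700
```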

Potential Solution

The basic design I originally landed on to do this efficiently with a strictly naive implementation involves heavy use of atomics, but it could probably be done with ETS set tables. Additionally, each time window should probably require a separate metric definition to ease implementation.
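
If each window did require its own definition, the user-facing configuration might end up looking something like the following (the :window reporter option is purely hypothetical and does not exist in this library):

```elixir
# Hypothetical: one summary definition per time window, distinguished by a
# made-up :window reporter option.
[
  Telemetry.Metrics.summary("http.request.duration",
    unit: {:native, :millisecond},
    reporter_options: [window: :timer.minutes(1)]
  ),
  Telemetry.Metrics.summary("http.request.duration",
    unit: {:native, :millisecond},
    reporter_options: [window: :timer.minutes(5)]
  )
]
```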

I do not have any solution for how to make any of this work with the current aggregate mechanisms and exporter. I took a stab at this a few times and there's no clear path that I could find. Hopefully somebody else can.

Overall, I'm not convinced that the amount of work and change required is worth it. My inclination is that this would be better handled through OpenTelemetry. :/

dantswain commented 3 years ago

Thanks for the update @bryannaegele !

I have a couple ideas/questions:

bryannaegele commented 3 years ago

I think a generic approach is fine with caveats as to what users should look for around performance. If you have a design in mind, I'm happy to give input before undertaking it.

For the Otel bit, this library isn't going away. There are different considerations for what someone's needs are, and there's no clear migration path for Otel metrics anytime soon, so don't worry about this library going away. :)