beam-telemetry / telemetry_metrics

Collect and aggregate Telemetry events over time
https://hexdocs.pm/telemetry_metrics
Apache License 2.0

Aggregate metadata values #21

Closed arkgil closed 5 years ago

arkgil commented 5 years ago

Right now, according to the metric spec, only the event value is "aggregatable" (i.e. the sum metric computes the sum of event values). However, in some cases, there might be many important values present in the event metadata.

One example @tsloughter gave me was a database query event, where the event value is the total time, but in the event metadata you could also include the time it took to wait for a connection, e.g.:

:telemetry.execute([:db, :query], query_time, %{pool_checkout_time: ...}) 

I imagine that we could add another parameter to the metric spec, i.e. :value. By default the value is the event value, but it could also be set to one of the metadata keys:

distribution("db.query", name: "db.pool.checkout_time", value: :pool_checkout_time)

By default the :metadata option would include the value key (in the same way it includes all the :tags).

Thoughts?

josevalim commented 5 years ago

@arkgil i would make the value a function and pass both the measurement and the metadata and let the user return whatever they want. :)

tsloughter commented 5 years ago

@josevalim the value in execute?

josevalim commented 5 years ago

Yes.

arkgil commented 5 years ago

@josevalim oh, that makes total sense. Let's tackle it for the next release.

tsloughter commented 5 years ago

I don't understand. Value must be a number, right?

arkgil commented 5 years ago

@tsloughter I think @josevalim meant a :value option given to the metric spec. So in the example you gave me, to build a metric of pool_checkout_time you would do:

distribution([:db, :query],
  name: [:db, :query, :checkout_time],
  value: fn _event_value, metadata -> Map.fetch!(metadata, :pool_checkout_time) end
)

josevalim commented 5 years ago

@arkgil my only concern is how to implement this without forcing each reporter to duplicate the value handling logic (the same way with metadata and tags).

arkgil commented 5 years ago

@josevalim maybe for a start we could provide some helper function which, given a metric spec, event value, and metadata, would return a value to report and a set of tags?

arkgil commented 5 years ago

Similarly to how Logger exposes a formatting function.
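
For illustration, a minimal sketch of what such a helper could look like (hypothetical module and function names; it assumes the metric spec carries the :value function and a :tags list):

defmodule Telemetry.Metrics.Extractor do
  # Hypothetical helper, not an actual telemetry_metrics API: given a
  # metric spec, the event value, and the event metadata, return the
  # value to report and the set of tags to attach to it.
  def extract(metric, event_value, metadata) do
    value = metric.value.(event_value, metadata)
    tags = Map.take(metadata, metric.tags)
    {value, tags}
  end
end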

tsloughter commented 5 years ago

@arkgil ah, @josevalim said 'yes' when I asked if he meant the value in execute, which is what made no sense :). Now I see.

But I'd suggest instead allowing a list for value:

execute([db, query], [{query_time, 3}, {decode_time, 1}, {queue_time, 1}], #{query => Query})

tsloughter commented 5 years ago

Forgot to mention, I suggest this in part because I don't think metadata should have metrics.

binaryseed commented 5 years ago

But I'd suggest instead allowing a list for value:

I'll chime in here to say that this idea ^^ makes a lot of sense to me. It's often the case that I wind up tracking multiple metrics based on a single point-in-time event.

josevalim commented 5 years ago

I think the correct approach in this case would be to emit multiple events. In fact, now that I think about it, I am -1 on this feature, as the best option would be to re-emit the event.

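For example, instead of one event carrying extra measurements in its metadata, the library would emit one event per measurement (a sketch reusing the names from above):

:telemetry.execute([:db, :query], query_time, metadata)
:telemetry.execute([:db, :query, :queue], queue_time, metadata)
:telemetry.execute([:db, :query, :decode], decode_time, metadata)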

tsloughter commented 5 years ago

I'm not a fan of multiple events. This is something I planned to bring up on poller as well since I think it emits too many events.

An event, like the completion of a database query having multiple measurements associated with it makes sense and I think allowing the handling of them together does as well.

Obviously the implementation could emit the decode_time event right after decoding, and the pool_time right after checking out the connection, but I'd prefer, and think it better for efficiency purposes, to combine them into a single call.

josevalim commented 5 years ago

I am afraid it brings complexity to everyone along the way, as they now need to worry about a single measurement or multiple measurements in the same event.

Unless we change the API to always be a map?

What happens when you say you want to publish “db.query”? Do we always push all 3 sub measures?

I am really not worried about performance here... it is very fast and we may even make it faster with persistent_term.

arkgil commented 5 years ago

I think that an event means that something has happened, like a database query having been completed. This event has a bunch of data associated with it, like query time, pool checkout time, etc. So maybe there is no single event value, but only a data map? For example, in systems like Honeycomb, you push an event with a huge payload containing all data about the current transaction (like an HTTP request) and the state of your application, memory, CPU, etc.

But as @josevalim said, this would require telemetry API change again.

I'm not really a fan of a list of measurements, because then it is not clear what is a measurement and what is metadata.

binaryseed commented 5 years ago

From my experience, I agree that a single event ideally represents a real thing that happened (database query, http request, external http call). Along with that event, multiple measurements were taken.

I'll also say that it's very important to bring some clarity to the metadata blob, since there is meaningful structure to the data we are putting in it, i.e. a clear distinction between measurement values, dimensions to aggregate upon, and extra info about the event.

This is what I'd suggest are the core pieces of the telemetry puzzle:

  1. Measurements - values that can be aggregated
  2. Dimensions - attributes to group (aggregate) by
  3. Info - extra context about the event

josevalim commented 5 years ago

The thing is that dimensions and info can be totally user specific. Some users have a single host and do not care about the host. Other users may be using multiple db_adapters, and therefore they want to segregate this information. Imagine, for example, with Ecto's MyXQL adapter coming out: some users may use MyXQL on some machines and Mariaex on others, so they can directly compare the results.

This is why the metadata is a key-value and we let telemetry_metrics choose what is a dimension and what is an info.

However, I wholeheartedly agree that measurements should not be part of metadata.

I also really think we should have one event per measurement. The fact Ecto includes measurements in metadata is really Ecto's fault that we can hopefully address by 3.1. We could totally emit 3 separate events but, perhaps most importantly, there is a chance those measurements are not that useful anymore: we changed the queue algorithm and we spend less time doing decoding inside the query.

Are there other examples of events that would include multiple measurements?

tsloughter commented 5 years ago

Another example would be an HTTP request. When it is finished, a few metrics can be calculated, and only at that time: the server latency, the total bytes sent, the total bytes received.

It can be done with multiple execute calls, but I'd still argue it is better to combine into a single event and single function call/ets lookup.

In some adapters we'll be sending a message to some collector for each execute, and we won't have a way to batch them if telemetry can't support this in some way. The ets locking is likely negligible because the table is almost nothing but reads, but it is nice to be able to "batch" (not really a batch, since it is still a single event, just with multiple measurements) when grabbing the lock to write to the message queue.
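
A sketch of what that single combined call could look like, following the list-of-measurements shape proposed above (the event and measurement names here are hypothetical):

# All measurements taken when the request finishes, emitted in one call
:telemetry.execute(
  [:http, :request],
  [latency: latency, bytes_sent: bytes_sent, bytes_received: bytes_received],
  %{method: method, path: path}
)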

binaryseed commented 5 years ago

So when I go to report data based on an event, I'm not exclusively reporting pre-aggregated metrics. I'm also going to send up the entire event - all the measurements, all the dimensions and all the info in a single data-structure (given some sampling rate).

Most monitoring SaaS solutions deal with both raw "events" and pre-aggregated "metrics". Raw events enable us to do the aggregation on the server side, which provides far more flexibility in querying - you don't have to differentiate between dimensions and info up front, you just send the whole thing.

We also still send lots of metrics because they are far smaller, and they can be held onto for a much longer time (months, years) while events have very short retention periods (days, weeks).

Now in telemetry, if each individual measurement is sent separately, each reporter that wants the raw events has to invent a way to tie them back together, which is complex, stateful, and prone to errors (and memory leaks).

josevalim commented 5 years ago

you don't have to differentiate between dimensions and info up front, you just send the whole thing.

This will definitely not work in projects like Phoenix, which may include the whole connection in the metadata and you definitely do not want to send the whole connection. The same happens in Ecto, which includes the whole resultset in the metadata. So I think any assumption that telemetry events can be sent as is to the reporter is ultimately flawed.

If there is some complicated processing between the telemetry event => the reporter event, then I would first try to explore how to express this processing on the telemetry_metrics side of things.

This is a very important discussion to have because it is part of how we will tell library authors to hook up telemetry.

josevalim commented 5 years ago

Sorry, premature send.

In any case, I understand the point that breaking Ecto apart will push complexity to the reporters in grouping those measurements back together.

When it is finished, a few metrics can be calculated, and only at that time: the server latency, the total bytes sent, the total bytes received. It can be done with multiple execute calls, but I'd still argue it is better to combine into a single event and single function call/ets lookup.

I agree. I would reply, though, that in this case the measurement is the latency, and I would put the rest in the metadata. However, that directly goes against not having measurements in the metadata. So we are back to the initial point, which is: we do have an issue and we have to address it.

Right now we have two options:

  1. Make measurement a map in telemetry "core"
  2. Pass any extra measurement in metadata and solve it at telemetry_metrics

I think the best way to answer this is to consider how everything will wire up together instead of looking at 1 or 2 individually. For example, if the http response is a map of measurements, each with a different unit, how would the respective telemetry_metrics be defined for those three measurements? Are those separate metrics? Is that a single metric that always sends all of it?

arkgil commented 5 years ago

that directly goes against not having measurements in the metadata.

I'm still not sure why having measurements in "metadata" is a bad thing. If that were the case, then metadata + measurements would become just event data, carrying the whole context of the event.

if the http response is a map of measurements, each with a different unit, how would the respective telemetry_metrics be defined for those three measurements?

I'd say that with telemetry_metrics you need to pick one measurement because we want to run aggregations on those measurements. For other use cases, one can simply not use telemetry_metrics (e.g. in InfluxDB you can have multiple measurements per datapoint so it makes sense to push all of them at once). I think it's fine for telemetry_metrics to be slightly opinionated about how things should be done, but telemetry "core" should be possible to use for different use cases.

josevalim commented 5 years ago

I'd say that with telemetry_metrics you need to pick one measurement because we want to run aggregations on those measurements.

The problem is that telemetry_metrics is designed based on how events are emitted. If developers are emitting events with important measurements in metadata that can't be properly processed by telemetry_metrics, then we will have to catch up.

Therefore, if our answer is "there is nothing wrong with measurements in metadata", then we most likely need a way to extract values from metadata and publish those too.

tsloughter commented 5 years ago

The issue is the mapping of events to aggregates.

I think the aggregates should be on measurements found in the event, except maybe in the case of count.

arkgil commented 5 years ago

Therefore, if our answer is "there is nothing wrong with measurements in metadata", then we most likely need a way to extract values from metadata and publish those too.

Yeah, when I said "pick one measurement" I meant some value from event metadata or event value, if the telemetry API stays as it is now. So basically what has been proposed at the beginning of this discussion:

distribution([:db, :query],
  name: [:db, :query, :checkout_time],
  value: fn _event_value, metadata -> Map.fetch!(metadata, :pool_checkout_time) end
)

josevalim commented 5 years ago

@tsloughter exactly. The annoying thing about the current API, if we end up having measurements both as an argument and in the metadata, is that we effectively need two ways to extract them. If we always make it a map, then it is a little bit more structured, but I have other concerns.

Another option is to get rid of the measurement and make the event always be the event name + metadata, so you always have zero, one, or multiple measurements in the metadata.

@arkgil I don't like that compared to emitting a map. If Ecto did this:

:telemetry.execute([:db, :query], %{total_time: ..., decode_time: ..., queue_time: ...}, metadata)

Then we could do this:

distribution("db.query.total_time"),
distribution("db.query.decode_time"),
distribution("db.query.queue_time"),

And telemetry_metrics knows we always need to register to "db.query" and access the total_time key and so on. So for each event, you may have multiple measurements to publish, but we can attach and publish only once, which is more efficient too.
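
For illustration, a minimal sketch of the name-splitting this implies (a hypothetical helper, not the actual implementation):

defmodule MetricName do
  # Split a dotted metric name into the event to register to and the
  # measurement key to read from the measurements map, e.g.
  # parse("db.query.total_time") #=> {[:db, :query], :total_time}
  def parse(metric_name) do
    parts = metric_name |> String.split(".") |> Enum.map(&String.to_atom/1)
    {event, [measurement]} = Enum.split(parts, -1)
    {event, measurement}
  end
end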

What do you think?

arkgil commented 5 years ago

@josevalim oh, I like it! My only concern is why differentiate between metadata and measurements? What makes some piece of data metadata and what makes it a measurement?

binaryseed commented 5 years ago

So I think any assumption that telemetry events can be sent as is to the reporter is ultimately flawed.

Sorry I wasn't clear enough, I'd never send the entire raw telemetry event, I'd always extract known values from it to generate a raw un-aggregated New Relic event...

For example, NR has standardized attribute names for parts of an HTTP "Transaction", like path, method, duration, queue time, status code, etc. That's the "raw event" I'd generate and report up.

josevalim commented 5 years ago

@josevalim oh, I like it! My only concern is why differentiate between metadata and measurements? What makes some piece of data metadata and what makes it a measurement?

I don't know. Can anyone justify having execute(event_name, map_of_measurements, data) or execute(event_name, data)?

tsloughter commented 5 years ago

@josevalim oh, I like it! My only concern is why differentiate between metadata and measurements? What makes some piece of data metadata and what makes it a measurement?

Measurements can be aggregated; metadata should be stuff like tags and labels (I still wish there was desire for a context library that those could be extracted from automatically, possibly from the pdict, but that is another topic :)

binaryseed commented 5 years ago

@josevalim That last example looks really great.

@arkgil I think the distinction has to do with how something is involved in the process of aggregation. And I think there are 3 things, not 2: measurements, dimensions, info.

Having these split out as part of the API means the reporter could figure out exactly what to do w/o having to have explicit domain knowledge of each measurement type - it'd be the responsibility of the library that knows the domain to provide semantics to the information (not the reporter or the end user).

arkgil commented 5 years ago

@binaryseed I agree that this distinction makes sense for the reporter. However, I'm not sure it's suitable for the event itself. An event doesn't "know" what happens with it later; it just carries some data. And it's up to the reporter (and the user configuring it) to choose which parts of the data are measurements, dimensions, and info.

tsloughter commented 5 years ago

I don't know that the telemetry api needs to know about the split between dimension and info.

A metric should define which dimensions it wants, and then the metadata is filtered down to just those.
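
Something along these lines (a sketch; :host and :db_adapter stand in for user-chosen tags):

# The metric declares tags: [:host, :db_adapter]; the reporter keeps
# only those keys and drops the rest of the metadata.
metadata = %{host: "web-1", db_adapter: MyXQL, query: "SELECT * FROM users"}
Map.take(metadata, [:host, :db_adapter])
#=> %{host: "web-1", db_adapter: MyXQL}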

Also, it is very useful to have clear terms when discussing these things; telemetry and telemetry_metrics should likely have a dictionary of terms. For example, "measurement" in the OpenCensus world is not the aggregate value, it is a measure + value: https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/Record.md

arkgil commented 5 years ago

@tsloughter right, but I think we're figuring out what these terms are right now 😄

tsloughter commented 5 years ago

@arkgil hehe, right :), just suggesting it ultimately be documented in the projects and prior art.

josevalim commented 5 years ago

I don't know that the telemetry api needs to know about the split between dimension and info.

To be clear, you are saying the API should be telemetry:execute(EventName, Data). Am I correct? That seems to be the consensus because what is a "measurement/dimension" vs "info" is really not up for the dispatcher to decide.

binaryseed commented 5 years ago

I think the least we should do is provide a mechanism in the system as a whole to tell the reporter which attributes to aggregate (sum, etc), and which attributes to group by (dimensions / tags in opencensus). As long as we can communicate those things clearly, we'll be ok.

Perhaps all of that lives inside the telemetry_metric, that'd make sense.
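
For instance, the metric spec could carry both pieces of information (a sketch building on the :tags option mentioned earlier; the names are illustrative):

# Aggregate bytes_sent as a sum, grouped by the :method and :status dimensions
sum("http.request.bytes_sent", tags: [:method, :status])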

And a big ➕ on clear definitions & leaning on existing industry terminology

tsloughter commented 5 years ago

@josevalim na, I meant it would still be execute/3. I just mean that the pairs in metadata don't have to be split between dimension and info when thinking about the telemetry API, while telemetry_metrics will want to define dimensions (tags/labels).

josevalim commented 5 years ago

@tsloughter so to be clear, which API do you propose (in typespecs)?

A. execute([atom()], non_neg_integer(), map())
B. execute([atom()], map(), map())
C. execute([atom()], map())

Or other?

tsloughter commented 5 years ago

execute([atom()], number() | map(), map())

josevalim commented 5 years ago

@tsloughter to me that's the worst of both worlds, because now everyone consuming the event needs to know whether it is an integer or a map, especially tools that work on events generically. So if we want to go with a map, we either go all in or we don't. :)

arkgil commented 5 years ago

I have the same concerns as @josevalim here. FWIW, option C appeals to me the most. But I'm concerned about backwards compatibility, since some people already rely on the existing API. Unless we provide a backward-compatible way of emitting a single map.

josevalim commented 5 years ago

@arkgil I prefer C the most too.

For backwards compatibility, we can probably have this:

execute/3 will put the given value as :value in the metadata and call execute/2. When attaching, if the callback has 4 arguments (i.e. the old API), we can rewrite it to 3 arguments like this:

fn event_name, metadata, config -> user_callback.(event_name, Map.get(metadata, :value, 0), metadata, config) end
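
And execute/3 itself becomes a thin shim over the new execute/2 (a sketch):

# Old-style execute/3 wraps the value into the metadata and delegates
def execute(event_name, value, metadata) do
  execute(event_name, Map.put(metadata, :value, value))
end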

This should keep everything working. We can start by making those changes and then verifying that the whole suite passes. Then we deprecate the old APIs and add the new ones.

Thoughts?

arkgil commented 5 years ago

@josevalim this looks great!

tsloughter commented 5 years ago

Hehe, C was my least favorite. I suppose it isn't a big deal to have to merge measurements and tags to pass to execute, but it definitely doesn't "feel right".

josevalim commented 5 years ago

Yeah, my biggest concern with segregating them is that what is a dimension/measurement for some is info for others. :( So better to group everything and let the user decide.

tsloughter commented 5 years ago

Yea, it just hurts creating generic adapter functions. If need be, a user can always drop or move around values from the measurements and tags; this solution forces the user to do just that, but in the default case instead of as an exception.

arkgil commented 5 years ago

@tsloughter I'm not sure I understand your comment, could you elaborate?

bryannaegele commented 5 years ago

We haven't spiked out any adoption of telemetry_metrics just yet (following along closely and trying to see how it may fit in for us when we transition to Prometheus for reporting), but since this turned into a question of changing core, I would vote for C.

The pattern of emitting an event with all the data and measures the emitter feels are relevant, and allowing the handler to pick what is important to them - versus picking a single metric as the value and treating everything else as "meta" - seems to provide the most power. A listener that cares about a particular measurement can pick it out, while a library doing APM may wish to take advantage of having all the measures in one context (the event) to construct spans.

To your point @josevalim - the time diff of an operation may be important to one person, while the exact start and stop times may be important to another. Providing a sensible default measure as the value in the map still leaves room for being opinionated and provides a generic API for libraries, like @binaryseed's use case.

This may be a bad pattern to adopt, but I've found having all of the measurements and metadata in the one map emitted by Ecto to be helpful. I'm able to batch report each of the main measurements that we care about in the handler, but also separately log important metadata for slow queries as an event, all at once. I'm sure we'll evolve this, but it was very nice and clean to do this with Telemetry versus additional dependencies.
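
For reference, a sketch of that handler shape (the module, reporter, and threshold are hypothetical; it uses the current 4-argument callback):

defmodule MyApp.QueryHandler do
  require Logger

  # Hypothetical slow-query threshold, in native time units
  @slow_query_threshold 500_000

  def handle_event([:my_app, :repo, :query], total_time, metadata, _config) do
    # Report the measurement we care about (MyApp.Reporter is a stand-in)
    MyApp.Reporter.report(:query_time, total_time)

    # Separately log important metadata for slow queries
    if total_time > @slow_query_threshold do
      Logger.warn("slow query (#{total_time}): #{inspect(metadata[:query])}")
    end
  end
end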

Love the work done thus far and I'm hoping we can find ways to contribute!