input/otlp: Group Scope Metrics By Attributes Only (Instead of timestamp + attrs)

Hi,

APM Server seems to split Scope Metrics into separate documents (as seen in Kibana) when metrics are instrumented via the SDK, instead of putting all metrics from the same scope in one document, because the metrics have different timestamps. I'm not sure if this is a feature or an issue.

Issue Statement

We're instrumenting via the OpenTelemetry Go SDK and have different types of metrics organized around scopes (i.e., different Meters). All metrics are then pipelined through the OTel Collector and on to APM Server.

Below is a snippet of the metrics from one scope, as output by the OTel Collector: 5 metrics under scope1 with the same attributes but different timestamps.

ResourceMetrics #1
Resource SchemaURL:
Resource attributes:
     -> collector: Str(xxxx)
     -> env: Str(TEST)
     -> host: Str(xxxx)
     -> service.name: Str(xxx)
     -> telemetry.sdk.language: Str(go)
     -> host.name: Str(otelcol-deployment-collector-6475d6969b-kt28r)
     -> os.type: Str(linux)
     -> k8s.cluster.name: Str(xxxxx)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope scope1
Metric #0
Descriptor:
     -> Name: metric0
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> attribute1: Str(attr1)
     -> attribute2: Str(attr1)
StartTimestamp: 2024-09-23 06:01:13.822721145 +0000 UTC
Timestamp: 2024-09-23 06:25:08.022659417 +0000 UTC
Value: 101320
Metric #1
Descriptor:
     -> Name: metric1
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> attribute1: Str(attr1)
     -> attribute2: Str(attr1)
StartTimestamp: 2024-09-23 06:01:13.822745402 +0000 UTC
Timestamp: 2024-09-23 06:25:08.022674989 +0000 UTC
Value: 101556
Metric #2
Descriptor:
     -> Name: metric2
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> attribute1: Str(attr1)
     -> attribute2: Str(attr1)
StartTimestamp: 2024-09-23 06:01:13.822749919 +0000 UTC
Timestamp: 2024-09-23 06:25:08.022675672 +0000 UTC
Value: 3584
Metric #3
Descriptor:
     -> Name: metric3
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> attribute1: Str(attr1)
     -> attribute2: Str(attr1)
StartTimestamp: 2024-09-23 06:01:13.822753575 +0000 UTC
Timestamp: 2024-09-23 06:25:08.022676329 +0000 UTC
Value: 3586
Metric #4
Descriptor:
     -> Name: metric4
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> attribute1: Str(attr1)
     -> attribute2: Str(attr1)
StartTimestamp: 2024-09-23 06:01:13.82275741 +0000 UTC
Timestamp: 2024-09-23 06:25:08.022676819 +0000 UTC
Value: 1
Metric #5
Descriptor:
     -> Name: metric5
     -> Description:
     -> Unit:
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> attribute1: Str(attr1)
     -> attribute2: Str(attr1)
StartTimestamp: 2024-09-23 06:01:13.822796019 +0000 UTC
Timestamp: 2024-09-23 06:25:08.022677264 +0000 UTC
Value: 2

Current result seen in Kibana: 5 documents, each containing one metric

collector: Str(xxxx)
env: Str(TEST)
host: Str(xxxx)
service.name: Str(xxx)
telemetry.sdk.language: Str(go)
host.name: Str(otelcol-deployment-collector-6475d6969b-kt28r)
os.type: Str(linux)
k8s.cluster.name: Str(xxxxx)
metric1: value1

Expected result seen in Kibana: one document with 5 metrics

collector: Str(xxxx)
env: Str(TEST)
host: Str(xxxx)
service.name: Str(xxx)
telemetry.sdk.language: Str(go)
host.name: Str(otelcol-deployment-collector-6475d6969b-kt28r)
os.type: Str(linux)
k8s.cluster.name: Str(xxxxx)
metric0: value0
metric1: value1
metric2: value2
metric3: value3
metric4: value4

Possible cause

The group key is the combination of the timestamp and the attributes on the data point, as seen here: key := metricsetKey{timestamp: timestamp, signature: signatureBuilder.String()}

So if the metrics come in with different timestamps, they will end up in different metricsets.
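To illustrate the effect, here is a minimal sketch, not the actual APM Server code. The metricsetKey struct follows the one quoted in this issue; the dataPoint type, the groupByKey and samplePoints helpers, and the signature string format are hypothetical and only there to make the example runnable. The timestamps are taken from the collector output above.

```go
package main

import (
	"fmt"
	"time"
)

// metricsetKey follows the grouping key quoted in this issue.
type metricsetKey struct {
	timestamp time.Time
	signature string // combination of all attributes
}

// dataPoint is a hypothetical, simplified stand-in for an OTLP number data point.
type dataPoint struct {
	name      string
	timestamp time.Time
	signature string
	value     float64
}

// groupByKey groups data points into metricsets keyed by (timestamp, signature).
func groupByKey(points []dataPoint) map[metricsetKey]map[string]float64 {
	sets := make(map[metricsetKey]map[string]float64)
	for _, p := range points {
		key := metricsetKey{timestamp: p.timestamp, signature: p.signature}
		if sets[key] == nil {
			sets[key] = make(map[string]float64)
		}
		sets[key][p.name] = p.value
	}
	return sets
}

// samplePoints returns three of the data points from the collector output:
// same attributes, timestamps a few microseconds apart.
func samplePoints() []dataPoint {
	sig := "attribute1=attr1,attribute2=attr1" // assumed signature format
	return []dataPoint{
		{"metric0", time.Date(2024, 9, 23, 6, 25, 8, 22659417, time.UTC), sig, 101320},
		{"metric1", time.Date(2024, 9, 23, 6, 25, 8, 22674989, time.UTC), sig, 101556},
		{"metric2", time.Date(2024, 9, 23, 6, 25, 8, 22675672, time.UTC), sig, 3584},
	}
}

func main() {
	// Each point has a distinct timestamp, so each gets its own metricset.
	fmt.Println("metricsets:", len(groupByKey(samplePoints())))
}
```

Because the three timestamps differ, the sketch ends up with three metricsets of one metric each, which matches the one-metric-per-document result seen in Kibana.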

Version

8.15

Proposal

It makes sense for all metrics from one scope to be seen in one document. This makes it easier to use them later for alerting and dashboards. Without knowing how data is stored in Elastic, I feel like this may save some storage as well.

So it seems reasonable to use only the attributes on the data point as the group key, since metrics will certainly have different timestamps if generated by the SDK. To my knowledge, there is no way to set the timestamp while instrumenting.

From:

type metricsetKey struct {
    timestamp time.Time
    signature string // combination of all attributes
}

To:

type metricsetKey struct {
    signature string // combination of all attributes
}

@anakineo your expected result document does not include a timestamp. These are time series, so naturally they need to have a timestamp. What timestamp would you expect to be recorded on the resulting document if we did not include that in the grouping key? (Bear in mind that data may arrive late, so the time at which it is received may be much different to the time it was exported.)

It makes sense that all metrics from one scope be seen in one document. This makes it easier to later use them for alerting and dashboard.

Would you be able to elaborate on this a little bit? How does having them in the same document make them easier to use for alerting and dashboarding?

I'm asking because there are long-running discussions about the issues related to storing a single metric per document (e.g. storage overhead), and grouping (which can introduce other issues): https://github.com/elastic/elasticsearch/issues/91775

@axw Thanks for getting back

These are time series, so naturally they need to have a timestamp. What timestamp would you expect to be recorded on the resulting document if we did not include that in the grouping key

You're right. I was wrong to suggest removing the timestamp. What I wanted to convey was setting the timestamps of the metrics from one scope to a single timestamp, so that all metrics will be grouped and sent as one document.

Our case is that we have different scopes, each containing from a few to a few dozen metrics. Within each scope, all the metrics are recorded within the same ms, as indicated by my sample OTel output. So it makes sense to have them grouped into one doc: as long as they are recorded within some reasonably short time, say 1s, we can consider their timestamps to be the same. It's like straightening a vertical wave, given that the difference between peaks is small enough.

I feel like this might be best done on the OTel Collector side, or made configurable on the APM side.

Would you be able to elaborate on this a little bit? How does having them in the same document make them easier to use for alerting and dashboarding?

For dashboarding, we often need to consolidate different documents into one view and do calculations based on metrics spanning a few documents.

As for alerting: say we want to give whoever receives an alert complete context around what happened (it can be in the form of labels, but not everything is a label). It's much easier if the related metrics are in one doc. Right now we need to join several docs to get the complete info.

@anakineo thanks for the additional info!

Our case is that we have different scopes, each containing from a few to a few dozen metrics. Within each scope, all the metrics are recorded within the same ms, as indicated by my sample OTel output. So it makes sense to have them grouped into one doc: as long as they are recorded within some reasonably short time, say 1s, we can consider their timestamps to be the same. It's like straightening a vertical wave, given that the difference between peaks is small enough.

Got it, thanks. I wonder if the OTel metrics API should provide a means of specifying the timestamp when recording a data point. I think that would make sense for example when you take a snapshot of various aspects of your system, and then record each one as a separate metric. The timestamp would be the time at which you took the snapshot.

I feel like this might be best done on the OTel Collector side, or made configurable on the APM side.

Without some kind of change to the OTel API, I think this will need to be handled with an OTel Collector processor before the data gets to APM Server. I can think of a couple of hypothetical options:

use a processor (transform processor?) to truncate/round metric timestamps to seconds granularity
use the interval processor to accumulate and emit metrics aligned with some time window
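As a rough sketch of why the first option would help, the following Go snippet truncates data point timestamps to one-second granularity before grouping. It only illustrates the idea and is neither collector nor APM Server code; the point type and the truncateAndGroup, samplePoints, and groupedBySecond helpers are hypothetical. The timestamps come from the collector output earlier in this issue.

```go
package main

import (
	"fmt"
	"time"
)

// point is a hypothetical, simplified data point: a metric name and its timestamp.
type point struct {
	name string
	ts   time.Time
}

// truncateAndGroup truncates each timestamp to the given granularity and
// groups metric names by the truncated timestamp.
func truncateAndGroup(points []point, granularity time.Duration) map[time.Time][]string {
	groups := make(map[time.Time][]string)
	for _, p := range points {
		bucket := p.ts.Truncate(granularity)
		groups[bucket] = append(groups[bucket], p.name)
	}
	return groups
}

// samplePoints returns three data points recorded a few microseconds apart.
func samplePoints() []point {
	return []point{
		{"metric0", time.Date(2024, 9, 23, 6, 25, 8, 22659417, time.UTC)},
		{"metric1", time.Date(2024, 9, 23, 6, 25, 8, 22674989, time.UTC)},
		{"metric2", time.Date(2024, 9, 23, 6, 25, 8, 22675672, time.UTC)},
	}
}

// groupedBySecond groups the sample points at one-second granularity.
func groupedBySecond() map[time.Time][]string {
	return truncateAndGroup(samplePoints(), time.Second)
}

func main() {
	for ts, names := range groupedBySecond() {
		fmt.Println(ts.Format(time.RFC3339Nano), names)
	}
}
```

With the sample timestamps, all three points fall into the 06:25:08 bucket and would share one grouping key. Points recorded on either side of a second boundary would still be split, so truncation narrows the problem rather than removing it.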

For dashboarding, we often need to consolidate different documents into one view and do calculations based on metrics spanning a few documents.

So you're calculating derivative metrics at query/aggregation time? Do you have a concrete example of that which you're able to share?

As for alerting: say we want to give whoever receives an alert complete context around what happened (it can be in the form of labels, but not everything is a label). It's much easier if the related metrics are in one doc. Right now we need to join several docs to get the complete info.

I haven't thought about this deeply, but it feels like a shortcoming of alerts if you're forced to group things in the same doc to get context in the alert. I mean, that context gathering/correlation could probably also be deferred to when an alert occurs. But anyway, understood.

use a processor (transform processor?) to truncate/round metric timestamps to seconds granularity

use the interval processor to accumulate and emit metrics aligned with some time window

Thanks for the pointers! I did look at the transform processor, and it can do many things, including setting the timestamp of each data point to the time of processing. I was particularly interested in using the timestamp in the data point instead of the time of processing, which can be delayed by a lot if, for example, the collector is down. Rounding to the second sounds promising!

I knew about the interval processor but thought it wasn't the processor I was looking for.

So you're calculating derivative metrics at query/aggregation time? Do you have a concrete example of that which you're able to share?

It's about correlating. Suppose we have documents containing 20 metrics plus some labels. We will use some of them to build a line graph that shows the result of metric#1/metric#2 (like a rate). Meanwhile, we want to add some references to this line using other metrics from the same documents, such as resource usage, disk I/O, and other related system metrics (ones that we know from experience are, or used to be, the cause of the issue). The goal is that if we receive some alerts, we can also tell from the line graph whether other things are off.

This might be an XY problem that has other solutions, but it's difficult to do if the metrics are scattered across different documents.

The same goes for alerting. IMO, it's best to have everything (that we know is relevant to this particular case) pre-populated in alerts for quicker troubleshooting (say the rate reaching the threshold fires an alert, and there are resource usage metrics in the alert as well). I agree we can gather the context when an alert fires, but that goes back to the question of how easily we can get the data if it is in different documents and we don't want to do lots of joins. Having related metrics in one document makes it a bit easier.

@anakineo this is great, thanks. I will bring this information to the wider team here and see if anyone has some additional suggestions. If nothing else, it'll be useful for us to consider while building the solution. In the meantime I'd still recommend pre-processing with the collector.

This might be an XY problem that has other solutions, but it's difficult to do if the metrics are scattered across different documents.

If it turns out to be an XY problem, we certainly need to document how to do these sorts of things. It doesn't sound like a very unusual requirement.

elastic / apm-data