Is your feature request related to a problem? Please describe.
In OpenTelemetry, every metric of cumulative type is required to have a StartTimestamp, which records when the metric was first recorded. However, the Prometheus exporter does not provide a StartTimestamp by default. Its absence leads to inaccuracies that distort the telemetry data.
Example 01: When the new StartTimestamp is earlier than the actual StartTimestamp, it creates a "fake tail" in the metric values.
Example 02: When the new StartTimestamp is later than the actual StartTimestamp, it creates a "fake peak" in the metric values.
As the examples above show, an incorrect choice of StartTimestamp on the OpenTelemetry side always distorts the collected metrics data. And without a StartTimestamp on the Skipper side, the telemetry collector has no way to guess it correctly.
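To make the two examples concrete, here is a small arithmetic sketch of one way to read them. It assumes a backend that derives an average rate from a single cumulative point as value / (timestamp - StartTimestamp). The function name and all numbers are made up for illustration and do not come from Skipper or OpenTelemetry.

```python
NS = 1_000_000_000  # nanoseconds per second


def implied_rate(value, start_ns, end_ns):
    """Average per-second rate implied by one cumulative point."""
    seconds = (end_ns - start_ns) / NS
    if seconds <= 0:
        raise ValueError("end_ns must be later than start_ns")
    return value / seconds


# Hypothetical counter: 600 requests counted between t=100s and t=160s.
actual_start = 100 * NS
observed_at = 160 * NS
value = 600

true_rate = implied_rate(value, actual_start, observed_at)  # 600 / 60 s
early_rate = implied_rate(value, 40 * NS, observed_at)      # 600 / 120 s
late_rate = implied_rate(value, 150 * NS, observed_at)      # 600 / 10 s

print(true_rate, early_rate, late_rate)
```

With a StartTimestamp that is too early, the same 600 requests are spread over a longer window, so the rate is too low and extends into a period where nothing was counted. With a StartTimestamp that is too late, they are squeezed into a shorter window and show up as a spike.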
Describe the solution you would like
To remove these side effects from the telemetry data, Skipper needs to set a StartTimestamp for each cumulative metric. The StartTimestamp of a particular metric must equal the UNIX timestamp (in nanoseconds) at which Skipper first saw this metric with the exact same combination of attributes.
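The requested behaviour could be sketched roughly as follows. This is a Python illustration of the idea only, not Skipper code: the class name `StartTimestampRegistry`, its methods, and the injectable `clock` parameter are all hypothetical. It records the wall-clock time in nanoseconds the first time a given metric name and attribute combination is seen, and returns that same value on every later call.

```python
import threading
import time


class StartTimestampRegistry:
    """Remembers when each (metric name, attributes) stream was first seen."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock      # returns UNIX time in nanoseconds
        self._first_seen = {}    # stream key -> first-seen timestamp
        self._lock = threading.Lock()

    def start_timestamp(self, name, attributes):
        # Sort the attributes so that their order does not create a new stream.
        key = (name, tuple(sorted(attributes.items())))
        with self._lock:
            if key not in self._first_seen:
                self._first_seen[key] = self._clock()
            return self._first_seen[key]


registry = StartTimestampRegistry()
first = registry.start_timestamp("requests_total", {"route": "a", "code": "200"})
again = registry.start_timestamp("requests_total", {"code": "200", "route": "a"})
print(first == again)
```

Each distinct attribute combination gets its own StartTimestamp, so a series that first appears hours after process start is not back-dated to the process start time.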
Describe alternatives you've considered (optional)
In our setup we also tested the start timestamp normalization algorithm introduced by OpenTelemetry in this article. It describes a set of transformations on cumulative sum metrics that help handle an unknown start timestamp. This approach can serve as a workaround when obtaining the start timestamp of a stream is not possible at all. But like any other hack, it has its own disadvantages. During collector restarts, metric data will be completely lost.
This is a critical problem: usually a single OpenTelemetry collector is used per cluster / installation, so a restart will cause loss of data. This can definitely be avoided if Skipper provides a StartTimestamp for each cumulative metric.
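The restart problem can be illustrated with a deliberately simplified sketch. It is not the algorithm from the referenced article: it assumes a normalizer that uses the first observed point of each stream as a zero reference, keeps that reference only in memory, and ignores counter resets. The class name `FirstPointNormalizer` and the sample values are hypothetical.

```python
class FirstPointNormalizer:
    """Uses the first observed point of each stream as its zero reference."""

    def __init__(self):
        self._reference = {}  # stream key -> (start_ns, first_value)

    def normalize(self, key, timestamp_ns, value):
        # The first point of a stream defines both its start time and its baseline.
        if key not in self._reference:
            self._reference[key] = (timestamp_ns, value)
        start_ns, first_value = self._reference[key]
        return start_ns, value - first_value


collector = FirstPointNormalizer()
p1 = collector.normalize("requests_total", 10, 500)
p2 = collector.normalize("requests_total", 20, 650)

# Simulated collector restart: the in-memory references are gone.
collector = FirstPointNormalizer()
p3 = collector.normalize("requests_total", 30, 900)

print(p1, p2, p3)
```

After the simulated restart the stream starts again from zero with a new start time, so the 250 requests counted between the second and third scrape are never reported. If the exporter supplied the StartTimestamp itself, the collector would not need to keep such a reference.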
Additional context (optional)
Prometheus Data Model: https://prometheus.io/docs/concepts/metric_types/
OpenTelemetry Data Model: https://opentelemetry.io/docs/specs/otel/metrics/data-model/
Would you like to work on it?
Yes