Open jonatan-ivanov opened 1 year ago
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
The specification requires that the unit be handled as follows:
The Unit of an OTLP metric point SHOULD be converted to the equivalent unit in Prometheus when possible. This includes:
- Converting from abbreviations to full words (e.g. "ms" to "milliseconds").
- Dropping the portions of the Unit within brackets (e.g. {packets}). Brackets MUST NOT be included in the resulting unit. A "count of foo" is considered unitless in Prometheus.
- Special case: Converting "1" to "ratio".
- Converting "foo/bar" to "foo_per_bar".
The resulting unit SHOULD be added to the metric as OpenMetrics UNIT metadata and as a suffix to the metric name unless the metric name already contains the unit, or the unit MUST be omitted. The unit suffix comes before any type-specific suffixes.
That does not include changing the unit to a different unit or modifying the value in any way.
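The renaming rules quoted above can be sketched as follows. This is only an illustration, not the exporter's actual code: the function names `prometheus_unit` and `prometheus_name` and the small `UNIT_WORDS` table are assumptions made up for this example. Note that nothing here touches the metric value.

```python
import re

# Hypothetical, deliberately incomplete abbreviation table (illustration only).
UNIT_WORDS = {
    "ms": "milliseconds",
    "s": "seconds",
    "min": "minutes",
    "h": "hours",
    "By": "bytes",
}


def prometheus_unit(otlp_unit: str) -> str:
    """Map an OTLP unit string to a Prometheus-style unit (may be empty)."""
    # Drop bracketed annotations such as "{packets}".
    unit = re.sub(r"\{[^}]*\}", "", otlp_unit).strip()
    # Special case: "1" becomes "ratio".
    if unit == "1":
        return "ratio"
    # Expand abbreviations on each side of "/" and join them with "_per_".
    words = [UNIT_WORDS.get(part.strip(), part.strip()) for part in unit.split("/")]
    return "_per_".join(words).strip("_")


def prometheus_name(name: str, otlp_unit: str) -> str:
    """Append the unit as a suffix unless it is empty or already in the name."""
    unit = prometheus_unit(otlp_unit)
    if not unit or unit in name:
        return name
    return f"{name}_{unit}"
```

Under these assumptions a metric named test_timer with unit ms would be exposed as test_timer_milliseconds, with type-specific suffixes such as _sum appended after the unit.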
Since the Prometheus exporter is the concern of the collector, I think the client should never know that the data that it published in OTLP format will be converted to Prometheus format. Because of this, I think any unit that is supported by OTLP should work and the client should not care. Maybe the Prometheus exporter is not configured right now but it will be starting from tomorrow. I think making a change on the exporters should not involve changing all the clients.
Can this behavior lead to impossible scenarios?
The resulting unit SHOULD be added to the metric as OpenMetrics UNIT metadata and as a suffix to the metric name unless the metric name already contains the unit
The bold part is not happening. The included actual result above shows the metric name from the Prometheus exporter is test_timer_sum - no unit in the name. A consumer of the exporter in Prometheus format has no way to know what the unit is. I can't speak to whether the former part is happening or not because I was never able to get the Prometheus exporter to return OpenMetrics format, even when setting enable_open_metrics: true. Regardless, the UNIT metadata is not part of Prometheus format, so it wouldn't help consumers of the Prometheus exporter that are scraping Prometheus format rather than OpenMetrics format.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
@Aneurysm9 Could you please check the last two comments and mark this issue so that it won't be auto-closed?
The bold part is not happening. The included actual result above shows the metric name from the Prometheus exporter is test_timer_sum - no unit in the name.
This is now happening in the latest releases (since https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/20519).
Regarding converting milliseconds to seconds, while this is possible in fixed-bucket histograms, it is not possible to do in exponential histograms. This was one of the main motivations to adopt seconds as the default unit for HTTP (and hopefully other) duration measurements in OTel Semantic Conventions. Ideally the producer will send seconds (as defined in the semantic conventions).
I don't think converting milliseconds to seconds is appropriate in fixed-bucket histograms while not converting in exponential histograms.
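To make the fixed-bucket case concrete, here is a minimal sketch of what such a conversion would involve: the bucket bounds and the sum are divided by 1000 while the per-bucket counts stay the same. The `ExplicitHistogram` type and the `ms_to_seconds` helper are hypothetical names for this example and do not correspond to any collector API.

```python
from dataclasses import dataclass
from typing import List

MS_PER_SECOND = 1000.0


@dataclass
class ExplicitHistogram:
    """Simplified stand-in for a fixed-bucket (explicit bounds) histogram point."""
    bounds: List[float]  # finite upper bounds; the +Inf bucket is implicit
    counts: List[int]    # one count per bucket, len(bounds) + 1 entries
    total: float         # sum of all observed values
    unit: str


def ms_to_seconds(h: ExplicitHistogram) -> ExplicitHistogram:
    """Rescale a millisecond histogram to seconds without touching the counts."""
    if h.unit != "ms":
        raise ValueError(f"expected unit 'ms', got {h.unit!r}")
    return ExplicitHistogram(
        bounds=[b / MS_PER_SECOND for b in h.bounds],
        counts=list(h.counts),  # observations stay in the same buckets
        total=h.total / MS_PER_SECOND,
        unit="s",
    )
```

Because every boundary is scaled by the same factor, each observation stays in its original bucket, which is why this rescaling is lossless for explicit bounds.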
This is now happening in the latest releases (since https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/20519).
Yes, we noticed in Micrometer when it broke our integration tests: https://github.com/micrometer-metrics/micrometer/issues/3796.
This was one of the main motivations to adopt seconds as the default unit for HTTP (and hopefully other) duration measurements in OTel Semantic Conventions. Ideally the producer will send seconds (as defined in the semantic conventions).
I think this is tying things together that shouldn't be tied together. OTLP is a format for telemetry data; it defines the data model but not the semantic naming. Someone should not have to use the OTel semantic convention to successfully use OTLP or the OTel Collector. I understand all of these things are branded OpenTelemetry, but it would behoove adoption and usefulness to users if they could be used separately. And it was my understanding they were intended to be usable without using everything.
It hurts the Collector's general usefulness if the Prometheus exporter expects the input is already in seconds so it matches data produced specifically for Prometheus/OpenMetrics. If the producer is a Prometheus client, it's clear what conventions it should follow as far as unit, but not all producers know where data they are producing will be stored, especially if it is in OTLP format (and sent to the Collector) that is supported by different backends.
Regarding converting milliseconds to seconds, while this is possible in fixed-bucket histograms, it is not possible to do in exponential histograms.
That's unfortunate and I don't have any solution. It feels like it leaves us in this bad state where the Collector can't deliver its full potential of being a universal adapter. Users are going to have to make more breaking changes to align with its limitations.
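One way to see the limitation quoted above, assuming the OTLP exponential histogram layout in which bucket boundaries are powers of base = 2^(2^-scale): dividing every value by 1000 would shift bucket indexes by log_base(1000), and that shift would have to be a whole number for the existing buckets to be reused as-is. The helper names below are made up for this sketch.

```python
import math


def index_shift_for_factor(factor: float, scale: int) -> float:
    """Bucket-index shift caused by multiplying or dividing all values by `factor`.

    With base = 2 ** (2 ** -scale), log_base(factor) = log2(factor) * 2 ** scale.
    """
    return math.log2(factor) * (2.0 ** scale)


def rescale_is_exact(factor: float, scale: int, tol: float = 1e-9) -> bool:
    """True when the shift is a whole number, i.e. buckets can simply be re-indexed."""
    shift = index_shift_for_factor(factor, scale)
    return abs(shift - round(shift)) < tol
```

A factor such as 1024 gives a whole-number shift at scale 0, but log2(1000) is irrational, so the factor 1000 never does at any scale. Under this model a milliseconds-to-seconds rescale would need observations to be redistributed across buckets, which the aggregated data cannot support exactly.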
FYI: it seems that starting from 0.80.0 the unit was removed (which breaks our integration tests again): https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/23229
@Aneurysm9 Could you please add the never stale label on the issue so that I don't need to play ping-pong with the bot?
@Aneurysm9 or someone else: Could you please add the never stale label on the issue so that I don't need to play ping-pong with the bot?
Component(s)
exporter/prometheus
What happened?
Description
Prometheus uses seconds as time unit by default. If I send an OTLP histogram with a different time unit, the value will not be converted to seconds (as it should be) but will be used as-is.
Steps to Reproduce
Send a histogram with unit: "milliseconds" to the OTel collector where the receiver is otlp/http/protobuf (but I think any otlp receiver should produce the same result) and the exporter is prometheus. Then check the Prometheus /metrics endpoint. E.g.:
Expected Result
Actual Result
Collector version
otel/opentelemetry-collector-contrib:cdf47846a7ff
Environment information
Environment
OS: MacOS 13.2.1
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response