Open ofek opened 2 years ago
The issue had no activity for 30 days, mark with Stale label.
Bump.
It's the doc's issue
No, this is broken
bump
Hi @ofek, you are absolutely correct. I've spent the last several months documenting the current state of metrics, and I released that document to the community just 2-3 weeks ago. As you can see there, this is a known issue.
This document is part of a large effort to refactor how the metrics are defined, used, and exported in Pulsar.
@codelipenghui @merlimat - we potentially don't have to wait for the full refactor; we could provide a fix just for exporting histograms. It's not a small fix, but it's not a complicated one. The biggest issue is that once we do that, we break compatibility, so it must be rolled out gradually with flags (oldHistogram=true, newHistogram=false). WDYT?
@ofek I forgot to explain that there is another issue you haven't mentioned: histogram bucket values today are deltas, meaning most of them are reset at every configurable interval (30 sec/1 min). The Prometheus quantile function assumes the values are counters that only ever increase. This also needs to be fixed, and of course it breaks backward compatibility as well.
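To make the difference concrete, here is a minimal Python sketch. It is not Pulsar code, and the bucket bounds and sample counts are made up for illustration. It turns per-interval delta counts into the kind of ever-increasing, cumulative bucket counters that the quantile function expects.

```python
from itertools import accumulate

# Hypothetical upper bounds (ms); "+Inf" catches everything above the last bound.
BOUNDS = ["0.5", "1", "5", "+Inf"]


def to_cumulative(deltas_per_interval):
    """Convert per-interval, per-bucket delta counts into counter-style values.

    Each input row holds the number of observations that fell into each bucket
    during one reporting interval (the counts reset every interval). Each
    output row is cumulative over time and across buckets.
    """
    totals = [0] * len(BOUNDS)
    result = []
    for deltas in deltas_per_interval:
        # Accumulate over time so values never decrease between scrapes.
        totals = [t + d for t, d in zip(totals, deltas)]
        # Accumulate across buckets: bucket le=X counts all observations <= X.
        result.append(list(accumulate(totals)))
    return result


# Two reporting intervals of made-up delta counts.
intervals = [[3, 2, 1, 0], [1, 0, 4, 2]]
cumulative = to_cumulative(intervals)
```

With delta values, a scrape taken right after a reset sees the counts drop, which a rate calculation would most likely interpret as a counter restart rather than as real data.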
Describe the bug
As documented in the official spec and mentioned in Pulsar's docs, histogram buckets are suffixed by `_bucket` with an upper bound `le` label. Instead, the label and value are embedded in the metric name as a suffix:
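As a rough illustration of the two shapes, the following Python sketch rewrites a name-embedded bucket into the spec-style form. The metric name `example_latency` and the `_le_<bound>` naming pattern (with `_` standing in for the decimal point) are assumptions made for this example, not output copied from a broker.

```python
import re

# Hypothetical pattern: "<name>_le_<bound>", with "_" used as the decimal point.
EMBEDDED = re.compile(r"^(?P<name>.+)_le_(?P<bound>\d+(?:_\d+)?)$")


def to_spec_sample(metric_name, value):
    """Rewrite 'foo_le_0_5' style names to 'foo_bucket{le="0.5"} <value>'."""
    match = EMBEDDED.match(metric_name)
    if match is None:
        raise ValueError(f"not a name-embedded bucket: {metric_name}")
    # Restore the decimal point that was flattened into the metric name.
    bound = match.group("bound").replace("_", ".")
    return f'{match.group("name")}_bucket{{le="{bound}"}} {value}'


print(to_spec_sample("example_latency_le_0_5", 12))
print(to_spec_sample("example_latency_le_10", 7))
```

This only changes the naming. A spec-conformant histogram also needs cumulative bucket values, a `+Inf` bucket, and `_sum` and `_count` series, none of which this sketch attempts to produce.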
To Reproduce
Steps to reproduce the behavior:
Desktop (please complete the following information):