apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

Add pulsar_tenant and pulsar_namespace labels to prometheus metrics #19554

Open pgier opened 1 year ago

pgier commented 1 year ago

Motivation

It can be useful to combine metrics by Pulsar tenant or namespace. Currently this is difficult to do by tenant because there is no distinct tenant label; there is only the namespace label, which has the format <pulsar-tenant>/<pulsar-namespace>.

Solution

Add two new labels to the Prometheus metrics: pulsar_tenant and pulsar_namespace. This would allow metrics to be grouped easily by tenant, and the pulsar_namespace label would avoid confusion with the Kubernetes namespace. We should also keep the current namespace label for compatibility, but consider deprecating it and eventually removing it in a future release.
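
As a rough illustration of the proposal (not Pulsar code; Pulsar itself is written in Java), the Python sketch below derives the two new labels from the existing namespace label. The helper name `add_tenant_labels` is hypothetical, and it assumes that pulsar_namespace would carry only the namespace part of <pulsar-tenant>/<pulsar-namespace>, which this issue does not spell out.

```python
from typing import Dict


def add_tenant_labels(labels: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `labels` with pulsar_tenant and pulsar_namespace added.

    Hypothetical helper: assumes the existing `namespace` label has the
    form <pulsar-tenant>/<pulsar-namespace>.
    """
    result = dict(labels)  # keep the original `namespace` label for compatibility
    combined = labels.get("namespace")
    if combined is None:
        return result
    tenant, sep, namespace = combined.partition("/")
    if not sep or not tenant or not namespace:
        # Not in <tenant>/<namespace> form; leave the labels untouched.
        return result
    result["pulsar_tenant"] = tenant
    # Assumption: the new label holds only the namespace name.
    result["pulsar_namespace"] = namespace
    return result


if __name__ == "__main__":
    sample = {"cluster": "standalone", "namespace": "public/default"}
    print(add_tenant_labels(sample))
```

With labels like these, grouping by tenant would be a plain aggregation on pulsar_tenant instead of string manipulation on the combined namespace value.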

Alternatives

No response

Anything else?

No response

github-actions[bot] commented 1 year ago

The issue had no activity for 30 days, mark with Stale label.

tisonkun commented 1 year ago

cc @michaeljmarshall @asafm is this suggestion valid?

asafm commented 1 year ago

I also considered adding a Tenant attribute to the OpenTelemetry metrics (once the proposal is voted on and approved and work on it has started) and changing the namespace attribute to hold only the namespace name. I'm OK with pulsar_namespace to avoid confusion with k8s. In OTel, all attributes carry their domain prefix. I still need to figure out the exact naming; I'm researching it, since the Messaging Semantics standards are being revised as we speak.

Of course, we need to retain the namespace attribute to avoid breaking anyone. Also, we need to change the existing Grafana dashboards accordingly.

Please note that there is not yet a single place in Pulsar where you can ask: given a topic, give me its attributes (labels). I planned to write one for OTel. Once you have that, you also need to make sure it's used everywhere. In OTel, I plan for this to be part of the metrics infrastructure so that plugins use it as well. Otherwise, plugins would end up reporting these attributes inconsistently.
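
To make the "single place" idea concrete, here is a minimal Python sketch of such a lookup (Pulsar itself is Java, and no such function exists today). The function name `topic_attributes` and the returned label names are assumptions for illustration; the only thing it relies on is the standard topic name form {persistent|non-persistent}://tenant/namespace/topic.

```python
from typing import Dict


def topic_attributes(topic: str) -> Dict[str, str]:
    """Hypothetical single lookup point: given a topic, return its labels."""
    domain, sep, rest = topic.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported topic name: {topic!r}")
    # Split into tenant, namespace, and the local topic name (at most 3 parts).
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in: {topic!r}")
    tenant, namespace, _local_name = parts
    return {
        # Existing combined label, retained so nothing breaks.
        "namespace": f"{tenant}/{namespace}",
        # Assumed new labels, as discussed in this issue.
        "pulsar_tenant": tenant,
        "pulsar_namespace": namespace,
        "topic": topic,
    }


if __name__ == "__main__":
    print(topic_attributes("persistent://public/default/my-topic"))
```

If every metrics call site, including plugins, obtained its labels through one function like this, the tenant and namespace labels would stay consistent across the whole metrics output.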

@codelipenghui @hangc0276, any opinions?

github-actions[bot] commented 1 year ago

The issue had no activity for 30 days, mark with Stale label.