Open rajushrajan opened 1 week ago
Hi,
I believe the span metrics from the Metrics Generator can help you achieve what you want:
https://grafana.com/docs/tempo/latest/metrics-generator/span_metrics/
These metrics include additional labels, based on the trace data, for instance, the name of the service that generated the span. You can even define custom labels.
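As a rough sketch of how span metrics could drive a per-service check: the query below assumes the metrics generator's span-metrics processor is enabled, and that it exposes a call counter named `traces_spanmetrics_calls_total` with a `service` label. Those names are assumptions, so verify them against your own deployment. The 1h lookback is an arbitrary choice.

```promql
# Services that produced spans an hour ago...
(
  sum by (service) (
    rate(traces_spanmetrics_calls_total[5m] offset 1h)
  ) > 0
)
# ...but have produced none in the last 5 minutes.
unless
(
  sum by (service) (
    rate(traces_spanmetrics_calls_total[5m])
  ) > 0
)
```

The result contains one series per service that has gone quiet, so an alert built on it fires per service rather than for the whole system.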
I will also point out we've recently added "usage trackers" which will be in Tempo 2.7:
https://github.com/grafana/tempo/pull/4162
These will allow you to break down received bytes/second by any span or resource labels (namespace, cluster, etc.) and publish those metrics directly from the distributor (no metrics generator or Prometheus required).
Hi @joe-elliott, thank you for your response. I will explore the usage trackers and get back to you.
Hi everyone,
I’m working on a Prometheus alert to trigger when traces are missing for any component in Tempo. Currently, I have the following query, which triggers an alert when there are no traces available for a specific time window (e.g., 5 minutes):
sum by (cluster, namespace) (avg_over_time(tempo_ingester_live_traces[5m])) == 0
This works well for triggering an alert when no traces are ingested for the entire system (across any components) within the specified time window. However, I need to modify the query so that the alert is triggered when traces are missing for any component within a specific namespace or cluster.
How can I modify the query so that it triggers an alert when traces are missing for any individual component within a cluster or namespace? I want the query to check for missing traces per component, rather than only globally or for one hard-coded component.
I am using Tempo for trace ingestion and Prometheus for monitoring. The metric I’m working with is tempo_ingester_live_traces, which is labeled by component, namespace, and cluster.
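One possible direction, as a sketch only: it assumes the metric really carries a `component` label as described above, and that a component counts as "missing" if it had live traces an hour ago but has none now. A plain `== 0` comparison only matches series that still exist with a value of zero. If a component stops reporting entirely, its series disappears and the comparison returns nothing. Comparing the current window against an earlier one with `unless` covers both cases. The 1h offset is an arbitrary lookback.

```promql
# Components that had live traces an hour ago...
(
  sum by (cluster, namespace, component) (
    avg_over_time(tempo_ingester_live_traces[5m] offset 1h)
  ) > 0
)
# ...and have no live traces now, either because the series
# is at zero or because it is no longer reported at all.
unless
(
  sum by (cluster, namespace, component) (
    avg_over_time(tempo_ingester_live_traces[5m])
  ) > 0
)
```

Because both sides are grouped by the same three labels, `unless` matches them pairwise, and each result series identifies one cluster/namespace/component combination that has gone quiet.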