grafana / tempo

Grafana Tempo is a high volume, minimal dependency distributed tracing backend.
https://grafana.com/oss/tempo/
GNU Affero General Public License v3.0

Metrics-generator should take sampling rate into account #1515

Open yvrhdn opened 2 years ago

yvrhdn commented 2 years ago

Is your feature request related to a problem? Please describe.
Traces are often downsampled before they are ingested by Tempo. Since the metrics-generator emits metrics based on ingest volume, these metrics will also be downsampled.

For example, traces_spanmetrics_calls_total counts how often a specific operation happened. If traces are downsampled at a constant sampling rate of 50%, the metric will also be 50% lower than reality. This is likely misleading, as you have to be highly aware of the downsampling happening in the trace pipeline.

With more involved downsampling setups (tail sampling, per-operation sampling) it becomes impossible to calculate the actual value.

Describe the solution you'd like
The metrics-generator should pick up the sampling rate used and adjust the emitted metrics. Ideally, it extracts the sampling rate from metadata attached to the ingested spans.

Describe alternatives you've considered

Additional context
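As additional context, here is a minimal sketch of the weighting idea in Go. The `SamplingProbability` field is hypothetical, not something Tempo defines today; it assumes the trace pipeline attaches the head-sampling rate to each span as metadata.

```go
package main

import "fmt"

// Span is a simplified stand-in for an ingested span. The
// SamplingProbability field is hypothetical and assumes the trace
// pipeline attaches the head-sampling rate as span metadata.
type Span struct {
	Service             string
	Operation           string
	SamplingProbability float64 // 0 < p <= 1; 0.5 means 50% of traces kept
}

// callsTotal mimics traces_spanmetrics_calls_total, keyed by service/operation.
var callsTotal = map[string]float64{}

// countSpan adds an inverse-probability weight instead of 1, so a span
// sampled at 50% counts as 2 calls and the metric approximates the
// pre-sampling call rate.
func countSpan(s Span) {
	p := s.SamplingProbability
	if p <= 0 || p > 1 {
		p = 1 // unknown or invalid rate: fall back to unweighted counting
	}
	key := s.Service + "/" + s.Operation
	callsTotal[key] += 1 / p
}

func main() {
	countSpan(Span{Service: "checkout", Operation: "GET /cart", SamplingProbability: 0.5})
	countSpan(Span{Service: "checkout", Operation: "GET /cart", SamplingProbability: 0.5})
	fmt.Println(callsTotal) // 2 sampled spans stand in for ~4 real calls
}
```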

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity. Please apply keepalive label to exempt this Issue.

faridtmammadov commented 4 days ago

This solution may not be effective in cases of tail sampling, where only specific spans are retained after sampling. An alternative approach could be to send all spans to the metrics generator before applying sampling. Do you think Tempo should support this capability? I’d appreciate your thoughts and suggestions.

joe-elliott commented 3 days ago

An alternative approach could be to send all spans to the metrics generator before applying sampling. Do you think Tempo should support this capability?

It would be significantly cheaper to do the metrics generation in the pipeline before sending to Tempo, but perhaps this makes sense for smaller installs that want a simpler pipeline?

@yvrhdn is working on something related to this now, but I think the OSS-ness is up for debate. Yuna, do we know what, if anything, will be open source from your current work?

If that doesn't pan out, what are you thinking @faridtmammadov? Maybe a parameter to downsample in the distributor? Send N% of received spans to the ingesters and everything received to the generators?
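For illustration only, a rough sketch of that distributor idea (the function names and sinks are made up, not Tempo's actual distributor API): forward every span to the generators, and make a deterministic per-trace decision for the ingesters so roughly N% of traces are stored.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// pushSpan models the fan-out idea: every span goes to the metrics
// generators, while only roughly ingestSamplePercent of traces go on
// to the ingesters. Both sink functions are placeholders.
func pushSpan(traceID []byte, span any,
	ingestSamplePercent float64,
	toIngester, toGenerator func(any)) {

	toGenerator(span) // metrics stay unsampled

	// Deterministic per-trace decision so a trace is either fully
	// kept or fully dropped, never split across the threshold.
	h := fnv.New64a()
	h.Write(traceID)
	bucket := float64(h.Sum64()%10000) / 100.0 // 0.00–99.99
	if bucket < ingestSamplePercent {
		toIngester(span)
	}
}

func main() {
	kept := 0
	for i := 0; i < 10000; i++ {
		traceID := make([]byte, 16)
		binary.BigEndian.PutUint64(traceID, uint64(i))
		pushSpan(traceID, struct{}{}, 25, // keep ~25% of traces in storage
			func(any) { kept++ },
			func(any) {})
	}
	fmt.Printf("stored ~%d of 10000 traces\n", kept)
}
```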

faridtmammadov commented 3 days ago

We currently generate span metrics in the OpenTelemetry Collector before tail sampling. However, for the new "Explore Traces" feature, local blocks are required. Sampling in the distributor, as you suggest, seems like a good solution. It would not only support setting spans as exemplars but also open the door for developing adaptive sampling techniques.

joe-elliott commented 2 days ago

However, for the new "Explore Traces" feature, local blocks are required.

It is unlikely that normal production tracing volumes can be stored entirely in Tempo. Sampling has been a consideration in distributed tracing since Google's Dapper paper. Explore Traces has value even if the tracing data is downsampled.

What I would like to see is for OTel to add a way to track downsample rates so they can be reinflated in TraceQL metrics if desired by the user.

I'm not opposed to a simple change here, but there is a lot going on with the distributor/ingester/generator relationship due to an ongoing rearchitecture, which prevents extensive work.

faridtmammadov commented 2 days ago

Your plan leans towards head-based sampling since it provides consistent metrics without the delays and skew caused by tail sampling. Even if OpenTelemetry adds downsampling rates, the wait time inherent in tail sampling can still distort metric accuracy.

yvrhdn commented 2 days ago

This solution may not be effective in cases of tail sampling, where only specific spans are retained after sampling.

Yes, agreed: with most tail sampling policies it won't be possible to reconstruct the metrics. If you value the accuracy of the metrics, I would recommend generating them before any tail sampling happens. They could be generated by the OTel Collector or Grafana Alloy instance that is also doing the tail sampling.

OTEP 235 should make it possible to reconstruct sample rates from probabilistic sampling: https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/trace/0235-sampling-threshold-in-trace-state.md But again, this unfortunately doesn't work with tail sampling.
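For what it's worth, here is a rough Go sketch of how an OTEP 235 threshold could be turned back into a weight. The function name is mine, not Tempo code, and it assumes the `ot` tracestate value has already been extracted from the span.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// adjustedCountFromTraceState derives the weight a sampled span should
// carry, following the "th" encoding described in OTEP 235: the threshold
// is 1-14 hex digits with trailing zeros omitted, a span is kept when its
// randomness value is >= the threshold, so the effective sampling
// probability is (2^56 - threshold) / 2^56 and the weight is its inverse.
// The input is assumed to be the value of the "ot" tracestate entry.
func adjustedCountFromTraceState(otTraceState string) (float64, error) {
	for _, field := range strings.Split(otTraceState, ";") {
		if !strings.HasPrefix(field, "th:") {
			continue
		}
		th := strings.TrimPrefix(field, "th:")
		if len(th) == 0 || len(th) > 14 {
			return 0, fmt.Errorf("invalid th value %q", th)
		}
		// Pad to the full 14 hex digits (56 bits) before parsing.
		th += strings.Repeat("0", 14-len(th))
		threshold, err := strconv.ParseUint(th, 16, 64)
		if err != nil {
			return 0, err
		}
		const scale = uint64(1) << 56
		probability := float64(scale-threshold) / float64(scale)
		return 1 / probability, nil // e.g. th:8 -> p=0.5 -> weight 2
	}
	// No threshold present: treat the span as not probabilistically sampled.
	return 1, nil
}

func main() {
	w, _ := adjustedCountFromTraceState("th:8")
	fmt.Println(w) // 2
}
```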

In the current Tempo architecture it's impossible to generate span metrics and service graphs in Tempo without downsampling. We can't really solve this short-term, especially since we are redesigning the write path around Kafka.

the wait time inherent in tail sampling can still distort metric accuracy.

This is another issue we would have to fix in the metrics-generator: the ingestion delay introduced by the tracing pipeline decreases accuracy. The metrics-generator would have to be designed around the span timestamps to prevent this. That opens another can of worms, since spans can arrive out of order and most metrics backends don't accept out-of-order samples.
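Purely as an illustration of that trade-off (not Tempo's metrics-generator), a sketch that buckets counts by the span's own start time: late-arriving spans either get dropped or would require rewriting a bucket that was already sent downstream, which most backends reject as out-of-order.

```go
package main

import (
	"fmt"
	"time"
)

// bucketCounts aggregates span counts into fixed time buckets keyed by the
// span's own start time rather than its arrival time. A span that arrives
// after its bucket has already been flushed to the metrics backend would
// require rewriting an old sample, which most backends treat as out-of-order.
type bucketCounts struct {
	width   time.Duration
	counts  map[int64]float64 // bucket start (unix seconds) -> calls
	flushed int64             // newest bucket already sent downstream
}

func (b *bucketCounts) observe(spanStart time.Time) {
	bucket := spanStart.Truncate(b.width).Unix()
	if bucket <= b.flushed {
		// Late span: its bucket was already emitted. Dropping it loses
		// accuracy; re-emitting it is rejected as out-of-order.
		fmt.Println("late span, bucket already flushed:", bucket)
		return
	}
	b.counts[bucket]++
}

func main() {
	b := &bucketCounts{width: 15 * time.Second, counts: map[int64]float64{}}
	now := time.Now()
	b.observe(now)
	b.flushed = now.Truncate(b.width).Unix() // pretend this bucket was flushed
	b.observe(now.Add(-5 * time.Second))     // arrives too late
	fmt.Println(b.counts)
}
```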