honeycombio / refinery

Refinery is a trace-aware tail-based sampling proxy. It examines whole traces and intelligently applies sampling decisions (whether to keep or discard) to each trace.

It would be nice to allow a dynamic SendDelay #1184

Open kentquirk opened 4 months ago

kentquirk commented 4 months ago

Is your feature request related to a problem? Please describe.

In async systems, different parts of the system sometimes need different amounts of time for the rest of a trace to arrive. A global SendDelay setting forces every trace to wait for the worst case.

Customer:

Async tasks where the trace context is carried across a message queue often complete long after the global SendDelay has elapsed. It would be nice to catch these traces and wait longer for them to complete. (In many cases, for us, the root span is an async function that returns almost immediately after dispatching work.)

Describe the solution you'd like

If the root span has a numeric field called refinery.trace_send_delay, then instead of using the configured SendDelay, Refinery will wait for the number of seconds specified in that field before making its sampling decision on that trace.

Describe alternatives you've considered

Additional context

Slack thread in pollinators

bixu commented 4 months ago

Ideally, we'd be able to tune delays around the sampling decision per-rule, since the context we want the rule to evaluate in is usually enough for us to know if we want to wait longer than normal (or ignore the root span closing early).

But also, in our environment, asking users to add refinery.trace_send_delay to their trace doesn't feel like an excessive lift. The users most affected tend to understand tracing pretty well and the need for usable traces.

kentquirk commented 1 month ago

Came across a situation today where allowing SendDelay to be reset whenever a new span arrives would be helpful. (Long trace, variable number of async actions which arrive frequently but over many seconds; it would help to have a debounce model where the trace sends once spans stop arriving.)