honeycombio / beeline-ruby

Legacy instrumentation for your Ruby apps with Honeycomb.
Apache License 2.0

Trace sizes #102

Closed ajvondrak closed 2 years ago

ajvondrak commented 4 years ago

Per the discussion in pollinators #general, I'm opening an issue to track a feature request that would be valuable across beelines. I'm just more familiar with the Ruby beeline, so I'm opening it here.

Problem Statement

With deterministic sampling, you're generally either sending an entire trace to Honeycomb or no events at all.[1] So, to investigate usage details and fine-tune your sampling, it's helpful to know how big your traces usually are.
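To make the "entire trace or nothing" behavior concrete, here is a minimal sketch of a deterministic sampling decision keyed on the trace id. The `keep_trace?` helper and `MAX_UINT32` constant are names made up for this illustration, not the beeline's API, and the hashing details are an assumption about how such a sampler can be built.

```ruby
require "digest"

# Hypothetical helper, not the beeline's API: decide whether to keep a trace
# based only on its id, so every span in that trace gets the same answer.
MAX_UINT32 = 2**32 - 1

def keep_trace?(trace_id, sample_rate)
  return true if sample_rate == 1

  # Read the first 4 bytes of the SHA-1 digest as a big-endian unsigned int.
  bucket = Digest::SHA1.digest(trace_id)[0, 4].unpack1("N")
  bucket <= MAX_UINT32 / sample_rate
end

# Five spans of the same trace all reach the same decision.
decisions = Array.new(5) { keep_trace?("trace-abc", 10) }
puts decisions.uniq.length # prints 1
```

Because the decision depends only on the trace id, the cost of keeping a trace scales with how many spans it contains, which is why the size matters for tuning.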

You could figure out the size of traces by writing a query such as COUNT_DISTINCT(trace.span_id) GROUP BY trace.trace_id. But because you can't then query those results (as you could with a nested query or a HAVING clause), you can't do more sophisticated things such as generating a heatmap & using Bubble Up to identify traffic patterns that lead to big traces.

So, to get a better look at the trace sizes, we'd need a queryable field like trace.size. This would be the number of events that share the same trace id - the cardinality of the whole tree, not just (say) the number of direct children.
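As a rough illustration of that definition, the sketch below computes the size of each trace from a handful of toy events and contrasts it with the number of direct children of the root. The event data is invented; only the field names come from this issue.

```ruby
# Toy events (made-up data); field names follow the ones used in this issue.
events = [
  { "trace.trace_id" => "a", "trace.span_id" => "1", "trace.parent_id" => nil },
  { "trace.trace_id" => "a", "trace.span_id" => "2", "trace.parent_id" => "1" },
  { "trace.trace_id" => "a", "trace.span_id" => "3", "trace.parent_id" => "2" },
  { "trace.trace_id" => "b", "trace.span_id" => "4", "trace.parent_id" => nil },
]

# trace.size as proposed: every event sharing a trace id counts, at any depth.
trace_sizes = events.group_by { |e| e["trace.trace_id"] }.transform_values(&:size)

# For contrast: only the spans whose parent is the root of trace "a".
direct_children_of_a_root = events.count do |e|
  e["trace.trace_id"] == "a" && e["trace.parent_id"] == "1"
end

puts trace_sizes["a"]          # prints 3
puts trace_sizes["b"]          # prints 1
puts direct_children_of_a_root # prints 1
```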

Proof of Concept

Conceptually, every span that gets generated would increment the trace size. The simplest proof of concept could use the existing rollup fields feature:

diff --git a/lib/honeycomb/span.rb b/lib/honeycomb/span.rb
index 4214258..db48d18 100644
--- a/lib/honeycomb/span.rb
+++ b/lib/honeycomb/span.rb
@@ -34,6 +34,7 @@ module Honeycomb
       @sent = false
       @started = clock_time
       parse_options(**options)
+      add_rollup_field('trace.size', 1)
     end

     def parse_options(parent: nil,

Or, as a monkey-patch (for those who might want to play with it in their own code despite the hackiness):

module Sizing
  def initialize(trace:, builder:, context:, **options)
    super
    add_rollup_field('trace.size', 1)
  end
end

Honeycomb::Span.prepend(Sizing)

But because of the way rollup fields work, this would give every non-root span a trace.size of 1. You'd then have "gotcha" queries where you need to remember to specify WHERE trace.parent_id does-not-exist.
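The "gotcha" can be seen with a few toy events. This sketch assumes the behavior described above (the root span carries the total, every other span carries 1) and uses invented data; `average_size` is a helper made up for the illustration.

```ruby
# Toy events for one four-span trace, modeled on the description in this
# issue: the root carries the total, every other span carries 1.
events = [
  { "trace.span_id" => "1", "trace.parent_id" => nil, "trace.size" => 4 },
  { "trace.span_id" => "2", "trace.parent_id" => "1", "trace.size" => 1 },
  { "trace.span_id" => "3", "trace.parent_id" => "1", "trace.size" => 1 },
  { "trace.span_id" => "4", "trace.parent_id" => "3", "trace.size" => 1 },
]

# Hypothetical stand-in for an AVG(trace.size) query.
def average_size(events)
  events.sum { |e| e["trace.size"] }.to_f / events.length
end

# Equivalent of adding WHERE trace.parent_id does-not-exist.
roots_only = events.select { |e| e["trace.parent_id"].nil? }

puts average_size(events)     # prints 1.75
puts average_size(roots_only) # prints 4.0
```

Forgetting the filter drags the aggregate toward 1, even though the only trace in the data has four spans.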

Still, you could easily imagine manually incrementing a counter on the trace as spans get generated, then adding a line like add_field "trace.size", trace.size if root? to Honeycomb::Span#add_additional_fields.
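A minimal sketch of that counter idea, using toy classes rather than the beeline's own: `ToyTrace`, `ToySpan`, `increment_size`, and `send_span` are all hypothetical names, and the sketch assumes the root span is sent after its descendants and that spans are created on a single thread.

```ruby
# Toy stand-ins, not the beeline's Honeycomb::Trace / Honeycomb::Span.
class ToyTrace
  attr_reader :size

  def initialize
    @size = 0
  end

  def increment_size
    @size += 1
  end
end

class ToySpan
  attr_reader :fields

  def initialize(trace:, parent: nil)
    @trace = trace
    @parent = parent
    @fields = {}
    @trace.increment_size # every span counts, at any depth
  end

  def root?
    @parent.nil?
  end

  def create_child
    ToySpan.new(trace: @trace, parent: self)
  end

  # Only the root reports the total, so non-root spans get no trace.size.
  def send_span
    @fields["trace.size"] = @trace.size if root?
    @fields
  end
end

trace = ToyTrace.new
root = ToySpan.new(trace: trace)
child = root.create_child
grandchild = child.create_child

# Children finish before the root in this toy, so the count is complete.
[grandchild, child, root].each(&:send_span)

puts root.fields["trace.size"]       # prints 3
puts child.fields.key?("trace.size") # prints false
```

Since only the root carries the field here, queries on trace.size would not need the parent-id filter.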

Concerns

Footnotes

  1. This isn't quite true, since you could set different sample rates for different events in one trace. E.g., this happens in the forem/forem sampler discussed in a recent HoneyByte.

  2. It's kind of interesting to consider that distributed tracing headers work unidirectionally: upstream propagates to downstream. I wonder what other functionality a bidirectional protocol could open up?

  3. Depending on the sample hook's implementation, we actually needn't necessarily send every span of a trace. Even with deterministic sampling, we could have a case like footnote 1. Moreover, the sample hook is under no obligation to use the deterministic sampler.

MikeGoldsmith commented 4 years ago

Hey @ajvondrak - thanks for the considered request. We'll need to discuss internally to fully understand the issue and will get back to you.

vreynolds commented 2 years ago

Hello,

We will be closing this issue as it is a low priority for us. It is unlikely that we'll ever get to it, and so we'd like to set expectations accordingly.

As we enter 2022 Q1, we are trimming our OSS backlog. This is so that we can focus better on areas that are more aligned with the OpenTelemetry-focused direction of telemetry ingest for Honeycomb.

If this issue is important to you, please feel free to ping here and we can discuss/re-open.