Inferred spans can cause cycles in parent-child trace structure

JonasKunz commented 2 months ago

@AlexanderWert recently stumbled upon a problem where inferred-spans would cause a cyclic parent-child relationship in the generated trace. This cycle was formed because the child_ids of an inferred span would refer to a span which is actually a transitive parent of said inferred span.

The problem occured with GRPC due to the root span being activated multiple times nested.

However, I think this uncovered a more general flaw in our current approach of assigning child_ids. We basically always assume that when a span A is activated replacing span B as the current span, that A would be a child of B. This is usually the case: most of the time a span represents a single method invocation and is activated exactly for the duration of that method call. However, concept wise the span parent-child relationship is actually completely independent of any activation!

Consider the following example:

  Span spanA = tracer.spanBuilder("a").startSpan();
  Span spanB = tracer.spanBuilder("b").setParent(Context.root().with(spanA)).startSpan();

  void doWithA(Runnable r) {
    try(var scope = spanA.makeCurrent()) {
      r.run();
    }
  }

  void doWithB(Runnable r) {
    try(var scope = spanB.makeCurrent()) {
      r.run();
    }
  }

  void doStuff(Runnable r) {
      r.run();
  }

  doWithB( () -> doStuff( () -> doWithA( () -> sleep(1000))))

In this example, spanA is the parent of spanB when looking at the trace context. However, when looking at the stackframes we have:

doWithB is the root stack frame (and has spanB activated)
doStuff is the child frame of doWithB
doWithA is the child of doStuff (and has spanB activated)

Our current inferred spans algorithm would now generate an inferred span for doStuff which would be a child of spanB and would have spanA as a child (via child_ids). This is now a cyclic reference:

spanB is the child of spanA (per their construction trace context)
doStuff is a child of spanB
spanA is a child of doStuff (via child_ids)

Taking a step back, why do we actually have child_ids? The problem is that we sometimes need to insert inferred spans as parents of already sent non-inferred spans. However, we should only do this if the parent-child relationship of the non-inferred spans actually corresponds to the parent-child relationship of the activations when looking at the stack frames! Because otherwise the parent-child relationship represents something different than the callstack and therefore we should not mess with it.

An easy way to fix this is to only allow a non-inferred span to be placed in the child_ids of an inferred-span, if the tracecontext parent of the non-inferred is actually the parent of the inferred-span (or to be more precise: the first non-inferred transitive parent of the inferred span). This way, we will only introduce "middle-men" in the existing parent-child relationship of non-inferred spans and therefore are guaranteed to not introduce cycles.

Note that while this approach will guarantee us to not introduce cycles, it still doesn't guarantee that the parent-child relationships still form a tree. Instead, it may decay into a directed, acyclic graph. Consider the following example:


  //Thread 1
  doWithA( () -> doStuff1( () -> doWithB( () -> sleep(1000))))

  //Thread 2, 10 minutes later, but using the very same spans!
  doWithA( () -> doStuff2( () -> doWithB( () -> sleep(1000))))

This code will result in two inferred spans (doStuff1 and doStuff2), where both will have spanB in their child-Ids and both have spanA as parent. IMO the only meaningful way of solving this is to ensure that a span referenced only at most once in child_ids. In theory we could use a SpanValue for this, the problem is that we do the inference delayed and don't have access to the originally activated spans anymore. However, if https://github.com/async-profiler/async-profiler/issues/913 gets implemented, then we can do a live-stream based inference instead of spilling the span activations to disk. This would allow us to use SpanValues to preserve the tree structure.

jackshirazi commented 2 months ago

I think we don't need to put too much effort into edge cases. The backend should just ignore inferred spans children which are in the stack above

AlexanderWert commented 2 months ago

I think we don't need to put too much effort into edge cases. The backend should just ignore inferred spans children which are in the stack above

Unfortunately it's not an edge case but the first think that occurred when running that feature with the OTel demo.

elastic / elastic-otel-java

Inferred spans can cause cycles in parent-child trace structure #200