Event-driven and reactive systems

opentracing / specification

A place to document (and discuss) the OpenTracing specification. 🛑 This project is DEPRECATED! https://github.com/opentracing/specification/issues/163

Apache License 2.0

I don't know much about tracing. I am trying to apply tracing to event-driven/reactive code (both in-process and cross-process) in order to make event chains more visible and to provide profiling data for optimizations.

Event-driven and reactive systems have inverted call stacks. While normal traces look like A.produceX() -> B.produceY() -> C.getZ(), event-driven and reactive traces look like C.handleX() -> B.invalidateY() -> A.updateZ(). This looks odd, but all the information is still there.

The problem becomes apparent when one of the lower-level components decides that propagating some event upwards is unnecessary. The trace then looks like C.handleX() -> B.invalidateY(). Object A, the owner of the whole object graph, is left out, including all identifying information in its tags. Such headless traces are often quite useless.

As a workaround, I am currently propagating tags down the ownership tree (A's tags to B and C and B's tags to C). This works, but it's messy, results in repetitive tags in spans, and adds overhead.

I was thinking that maybe OpenTracing could model ownership relationships explicitly. Every span could have an owner and every owner could have a higher-level owner. Owners could have tags.

In event-driven and reactive programs, listeners subscribe to events before the backflow of events starts. This subscription is a good opportunity to capture and remember the ownership relationship, which can subsequently be added to all spans.
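To make the subscription idea concrete, here is a minimal sketch of a child component that remembers its subscriber's tags at subscription time and attaches them to every span, whether or not the event is propagated upwards. Everything in it is hypothetical: TracedSource, subscribe(), and handle() are illustrative names, not OpenTracing API, and the span is represented by a plain tag sink so that the block has no tracing dependency.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical child component that learns its owner's tags when the owner subscribes. */
final class TracedSource {
    private final List<Runnable> listeners = new ArrayList<>();
    // Tags captured from subscribers; available even when no event is propagated.
    private final Map<String, String> ownerTags = new LinkedHashMap<>();

    /** Registers a listener and remembers the subscribing owner's tags. */
    void subscribe(Map<String, String> subscriberTags, Runnable listener) {
        ownerTags.putAll(subscriberTags);
        listeners.add(listener);
    }

    /**
     * Handles an event. The span is stood in for by a tag sink (an assumption
     * of this sketch) that receives each key/value pair.
     */
    void handle(boolean propagate, BiConsumer<String, String> spanTags) {
        ownerTags.forEach(spanTags); // the owner is identified either way
        if (propagate) {
            listeners.forEach(Runnable::run);
        }
    }
}
```

With a real tracer, the sink could plausibly be a method reference such as span::setTag. The point of the sketch is only that the owner's identity reaches the span even when handle() is called with propagate set to false.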

The code in question is a bit too lengthy and complicated to post here. Consider a simple parent-child pair with a callback being called inside the child:

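import io.opentracing.Scope;
import io.opentracing.Span;
import io.opentracing.util.GlobalTracer;

class Parent {
    Child child = new Child();
    Parent() {
        // subscribe to events in child
    }
}
class Child {
    void someCallback() {
        Span span = GlobalTracer.get().buildSpan("Child.someCallback").start();
        try (Scope scope = GlobalTracer.get().activateSpan(span)) {
            // possibly propagate the event to parent via some listener
        } finally {
            span.finish(); // closing the scope does not finish the span
        }
    }
}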

If the child callback doesn't propagate the event to the parent, information about the parent is completely omitted from the trace.

As a workaround, I have a helper class OwnerTrace that is used like this:

class Parent {
    Child child = OwnerTrace.of(new Child())
        .parent(this)
        .tag("key1", "value")
        .target();
    Parent() {
        OwnerTrace.of(this).tag("key2", "value");
        // subscribe to events in child
    }
}
class Child {
    Child() {
        OwnerTrace.of(this).tag("key3", "value");
    }
    void someCallback() {
        Span span = GlobalTracer.get().buildSpan("Child.someCallback").start();
        OwnerTrace.of(this).fill(span); // fills in key1, key2, key3
        try (Scope scope = GlobalTracer.get().activateSpan(span)) {
            // possibly propagate the event to parent via some listener
        } finally {
            span.finish(); // closing the scope does not finish the span
        }
    }
}

The key point is that OwnerTrace.parent() creates the ownership hierarchy and OwnerTrace.fill() pulls tags from all ancestors.
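For illustration, a helper with this shape could be built roughly as follows. This is a hypothetical sketch, not the actual OwnerTrace code: the weak registry, the Node class, and the traversal order are all assumptions. Unlike the usage above, fill() here takes a tag sink instead of a Span so that the block compiles without the OpenTracing dependency. The sketch is not thread-safe beyond the registry lookup and does not guard against cycles in the ownership chain.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.BiConsumer;

/** Hypothetical sketch of an ownership helper; not part of OpenTracing. */
final class OwnerTrace<T> {
    /** Per-object record: the object's own tags and a link to its owner's record. */
    private static final class Node {
        final Map<String, String> tags = new LinkedHashMap<>();
        Node parent;
    }

    // Weak keys, so the registry alone does not keep traced objects alive.
    // Lookup uses equals()/hashCode(); the sketch assumes identity semantics.
    private static final Map<Object, Node> NODES =
            Collections.synchronizedMap(new WeakHashMap<>());

    private final T target;
    private final Node node;

    private OwnerTrace(T target) {
        this.target = target;
        this.node = nodeOf(target);
    }

    private static Node nodeOf(Object object) {
        return NODES.computeIfAbsent(object, key -> new Node());
    }

    static <T> OwnerTrace<T> of(T target) {
        return new OwnerTrace<>(target);
    }

    /** Declares the owner of the target object. */
    OwnerTrace<T> parent(Object owner) {
        node.parent = nodeOf(owner);
        return this;
    }

    /** Attaches a tag to the target object itself. */
    OwnerTrace<T> tag(String key, String value) {
        node.tags.put(key, value);
        return this;
    }

    T target() {
        return target;
    }

    /** Emits the tags of all ancestors and of the target, outermost owner first. */
    void fill(BiConsumer<String, String> sink) {
        Deque<Node> chain = new ArrayDeque<>();
        for (Node n = node; n != null; n = n.parent) {
            chain.push(n); // the root ends up at the head of the deque
        }
        for (Node n : chain) {
            n.tags.forEach(sink);
        }
    }
}
```

Because the outermost owner's tags are emitted first, a tag set closer to the target overrides an ancestor's tag with the same key when the sink overwrites on repeated keys. With a real span, the call could plausibly look like OwnerTrace.of(this).fill(span::setTag).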

There are many issues with this solution. The most basic one is that it is effectively a non-standard extension to OpenTracing that nothing else supports, which limits code sharing.

The above example is trivial and as such allows for many workarounds, but in practice I get ownership chains that are 5 levels deep, spanning several internal libraries, and child objects usually don't know anything about their current parent.

Event-driven and reactive systems #140