open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

Non-immediate and Non-causal Links between Spans #826

Open eyakimov-bbg opened 4 years ago

eyakimov-bbg commented 4 years ago

What are you trying to achieve?

The current specification (references in additional context) models relationships between Spans only at a very basic level: a parent and arbitrary links (possibly with attributes). Systems can exhibit various kinds of links, such as "child of"/"follows from" or "immediate"/"non-immediate", which are known to the system but cannot currently be expressed. This issue focuses on the latter distinction but opens up the wider area of defining the semantics of the types of relationships/links that can exist and how they can be handled by downstream systems.

Consider the following use-cases where a relationship exists between Spans but is not "causal" (i.e. the previous span didn't directly cause the second span) and may express a non-immediate linkage between them.

Although it is arguable whether this is a "Link" as it's not causal, it can add considerable value to the observability of a system and as such I'd like to prompt some consideration of whether the semantics of such a relationship should be supported.

This distinction between (non-)immediate/(non-)causal will likely have little impact at instrumentation time but may be necessary for downstream handling in a tracing system. For example, it's unlikely that I'd ever look at a timeline/Gantt chart for a non-immediate/non-causal link. However, I might want to easily navigate through all traces in a subscription, or likewise see all consumption of a record/cached value.

Given the need to handle such relationships differently upon consumption, and given that this context is known at instrumentation time, it would be desirable to capture it, ideally by having well-defined link semantics that can be expressed when adding the Link.
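As a rough sketch of what that downstream handling could look like: assume links carry two hypothetical boolean attributes, `link.causal` and `link.immediate` (neither is defined by the specification). A consumer could then split links into those worth drawing on a timeline and those kept only for navigation. Links are modelled here as plain dicts rather than any real SDK type.

```python
# Hypothetical attribute keys; the specification does not define these.
CAUSAL = "link.causal"
IMMEDIATE = "link.immediate"


def partition_links(links):
    """Split links into timeline-worthy ones and navigation-only ones.

    Each link is modelled as a plain dict with an "attributes" mapping.
    A link missing either attribute is treated as navigation-only.
    """
    timeline, navigation = [], []
    for link in links:
        attrs = link.get("attributes", {})
        if attrs.get(CAUSAL) is True and attrs.get(IMMEDIATE) is True:
            timeline.append(link)
        else:
            navigation.append(link)
    return timeline, navigation


links = [
    {"span_id": "a1", "attributes": {CAUSAL: True, IMMEDIATE: True}},
    # e.g. a later read of a cached value: related to the write, not caused by it
    {"span_id": "b2", "attributes": {CAUSAL: False, IMMEDIATE: False}},
    # no semantics recorded at instrumentation time
    {"span_id": "c3", "attributes": {}},
]
timeline, navigation = partition_links(links)
```

Treating untagged links as navigation-only is one possible default; a real backend might prefer the opposite for backward compatibility.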

Additional context.

The current specification allows Spans both to reference a parent and to have additional links. As described in

A related issue, https://github.com/open-telemetry/opentelemetry-specification/issues/65, focuses on the child/follows relationship, but I think many of the same considerations would apply here as well.

I don't have concrete suggestions for how to solve this, but I believe that some form of "link type" definition is necessary. This could consist of something like the following traits:

The above traits may also be somewhat related, i.e. I don't see how a parent or child concept would exist unless the link is causal.
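A minimal sketch of such a "link type", using the distinctions named in this issue (causal, immediate, child of/follows from) as assumed traits. The class, the enum, and the attribute keys are all hypothetical and not part of any OpenTelemetry API. The validation encodes the constraint mentioned above, that a child/follows concept only makes sense on a causal link.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Ordering(Enum):
    """Hypothetical trait mirroring the "child of"/"follows from" distinction."""
    CHILD_OF = "child_of"
    FOLLOWS_FROM = "follows_from"


@dataclass(frozen=True)
class LinkType:
    """Hypothetical link-type descriptor; not defined by the specification."""
    causal: bool
    immediate: bool
    ordering: Optional[Ordering] = None  # only meaningful for causal links

    def __post_init__(self):
        # Reject child_of/follows_from on a link that is not causal.
        if self.ordering is not None and not self.causal:
            raise ValueError("child_of/follows_from requires a causal link")

    def to_attributes(self):
        """Flatten the traits into link attributes (keys are assumptions)."""
        attrs = {"link.causal": self.causal, "link.immediate": self.immediate}
        if self.ordering is not None:
            attrs["link.ordering"] = self.ordering.value
        return attrs


# A subscription update: related to the original request, but neither
# directly caused by it nor immediately following it.
subscription_update = LinkType(causal=False, immediate=False)
async_follow_up = LinkType(causal=True, immediate=False,
                           ordering=Ordering.FOLLOWS_FROM)
```

Flattening to attributes keeps the sketch compatible with the existing "link with attributes" model; a dedicated field on the Link would be the alternative design.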

Oberon00 commented 4 years ago

One problem I can see is that the parent-child relationship is a special kind of link that does not support attributes. However, they could be added as span attributes instead. Another problem is that both links and the parent span must be fully specified when creating the span already, so you can't add links later. Are you aware of this limitation? It seems like non-immediate links are exactly the links that would be added while the span is already started. There is a related issue, #454 "Please (re)-allow recording links after Span creation time". Currently you could work around this by creating child spans with the desired link.
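The workaround can be illustrated with a toy model. This is not the OpenTelemetry API: `Span` and `link_late` are hypothetical stand-ins that only mimic the constraint that parent and links are fixed at creation time.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional, Tuple

_ids = itertools.count(1)  # simple id generator for the toy model


@dataclass(frozen=True)
class Span:
    """Toy span: parent and links are fixed at creation and cannot change."""
    name: str
    parent_id: Optional[int] = None
    links: Tuple[int, ...] = ()
    span_id: int = field(default_factory=lambda: next(_ids))


def link_late(running: Span, target: Span) -> Span:
    """Record a link that is discovered after `running` has started.

    `running` cannot be mutated, so a child span is created at the moment
    the relationship becomes known, and that child carries the link.
    """
    return Span(name=f"{running.name}.link",
                parent_id=running.span_id,
                links=(target.span_id,))


producer = Span("publish")
consumer = Span("consume")  # started before the producer is known
carrier = link_late(consumer, producer)
```

The cost of this approach is an extra span per late link, and a consumer has to know to look at child spans to find the relationship.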

eyakimov-bbg commented 4 years ago

Thanks for the input Christian,

> One problem I can see is that the parent-child relationship is a special kind of link that does not support attributes

In some ways, I see the parent relationship as a constraint on the other traits (i.e. it's probably causal and immediate). However, the child/follows trait probably does make sense to set on the parent relationship, as per the illustration on https://github.com/open-telemetry/opentelemetry-specification/issues/65#issuecomment-595924506

> Another problem is that both links and the parent span must be fully specified when creating the span already

Funnily enough, it seems that I added that issue under a different GitHub account too ^^. I do think that it is valuable, as it gives developers more flexibility to discover relationships during span processing rather than beforehand. However, IMO it doesn't change "how" you describe these relationships, which is what this issue is focusing on.

(Apologies for the deleted comment, wrong account)