open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

provide optional access to root span #2109

Open brettmc opened 2 years ago

brettmc commented 2 years ago

What are you trying to achieve? If I create an active root span immediately on application start (in my case PHP, so starting from nothing on each incoming HTTP request), I might not yet have all of the information I need to create a meaningful low-cardinality root span name, add attributes, etc.

What did you expect to see? Later on (say, in some middleware, after I have bootstrapped an application and it has done some routing), I would like to do things like: tracer->getRootSpan()->updateName('simplified_route_name_with_placeholders')->addAttribute('user', 'username_extracted_from_token')

Additional context.

I've previously worked with a couple of APM tools, where the root span is called a transaction, and is accessible via an API method. The spec does say that a trace will have a single root span:

Each trace includes a single root span, which is the shared ancestor of all other spans in the trace.
Implementations MUST provide an option to create a Span as a root span...

That implies to me that root spans are special, but it's unclear to me whether it further implies that the root span should be accessible. I would like to see something in the spec to the effect of "A tracer MAY provide API access to the root span" (I imagine that it might not be applicable in all languages).

dimaqq commented 1 year ago

I too would like to set attributes on the “root-most” span in this process.

I see tracing as distributed, thus the real root span is just a short context reference.

Given a bunch of nested spans, it would be handy to be able to bubble some attributes up to the highest available span. I think this means the highest span that is in this process and has not ended yet.

My use case, too, is that the acting username or low control bucket is resolved in some descendant span, shortly after starting to process a request but not immediately.

I feel that it’s rather common to progressively populate the current context (in the general sense, not OTEL) as a request is being processed. Some bits I can think of: region, client IP, login user, permission level, affected business object id (think cart id), db cache key…

The alternative is to tag those in descendant spans, and rely on the trace processing or trace query engine to effectively support descendant span attributes, e.g. to summarise request processing time broken down by acting user.
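A rough sketch of what that query-side alternative could look like, in Python. The span records are plain dictionaries, and the field names (`trace_id`, `parent_id`, `duration_ms`, `attributes`) and the `user` attribute key are assumptions made for illustration, not the schema of any particular backend:

```python
from collections import defaultdict


def duration_by_user(spans):
    """Sum root-span durations per user, where the user attribute
    may have been set on any span of the trace (hypothetical schema)."""
    user_by_trace = {}
    root_duration_by_trace = {}
    for span in spans:
        trace_id = span["trace_id"]
        user = span.get("attributes", {}).get("user")
        if user is not None:
            # Keep the first user seen for this trace.
            user_by_trace.setdefault(trace_id, user)
        if span.get("parent_id") is None:
            # A span without a parent is treated as the root of its trace.
            root_duration_by_trace[trace_id] = span["duration_ms"]

    totals = defaultdict(int)
    for trace_id, duration in root_duration_by_trace.items():
        totals[user_by_trace.get(trace_id, "unknown")] += duration
    return dict(totals)
```

This works only when the processing side sees every span of a trace, which is the cost of leaving the attribute on a descendant span.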

If that’s the case, maybe a call for such functionality is all that’s needed?

lordpixel23 commented 1 year ago

I find myself frequently wanting some way to add attributes to the root span. For example, given how tracing relates to timing things, I want to know how many items were processed or created as part of the trace, which we do not know when the root span is created. But having to navigate up and down the traces and remember where the various attributes were added makes the traces very hard to use. One basically needs pen and paper to jot down the attributes.

Generalizing, one would probably want to think about process boundaries given the "true" root might have been created remotely.

tedsuo commented 6 months ago

This is a valid concern, and we have various solutions for this in different languages and instrumentation packages.

However, having access to the root span is problematic. Even though it is very common for the root span to be available, it is invalid to assume that the root span is still active. The span may have already ended, or even been flushed out of memory.

arielvalentin commented 5 months ago

However, having access to the root span is problematic. Even though it is very common for the root span to be available, it is invalid to assume that the root span is still active. The span may have already ended, or even been flushed out of memory.

I think others have laid out our use case for it as part of OTel Ruby.

We currently have helpers for specific use cases that create special context keys for the ingress server spans. That way we are able to enrich those spans with attributes like http.route, which are resolved by different frameworks after the server span was activated.

We do not want to rely on the "active span" because anyone could create an internal span along the way. Though I do understand the risk of a span being inactive; couldn't the active span become inactive at "any time"?

Am I thinking about this the wrong way? Is there a better option for this?

trask commented 5 months ago

Java Instrumentation has this concept: LocalRootSpan

brettmc commented 5 months ago

Java Instrumentation has this concept

PHP will likely follow Java's lead here. I think the term "local root span" resolves any ambiguity w.r.t. distributed traces and remote parents, and returning an invalid span covers the case where the local root span is not found or is no longer active.

dmathieu commented 5 months ago

This goes in the direction of "wide events", where the local root span is considered special and holds most of the available attributes, rather than having them split across many spans.

austinlparker commented 3 months ago

TC, any thoughts/objections to this?

carlosalberto commented 2 months ago

We discussed this at the TC call yesterday, and we are fine with exploring this. We will need prototypes in a couple of languages so we can consider our options regarding:

1) Implementation: Do we implement a traversal? Do we store the root-most Span directly in the Context?
2) Exposition: Do we expose this at the Context/API? In the SDK?
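To make the traversal option in question 1 concrete, here is a minimal Python sketch. It uses a toy `Span` class, not the OpenTelemetry API, and it assumes each span keeps a reference to its in-process parent object (the hypothetical `local_parent` field). A real implementation would need some equivalent link, which is part of what a prototype would have to show:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Toy span used only for this sketch."""
    name: str
    # Hypothetical link to the parent span object in this process.
    # None means the parent is remote or there is no parent at all.
    local_parent: Optional["Span"] = None
    attributes: dict = field(default_factory=dict)


def find_local_root(span: Span) -> Span:
    """Walk up the in-process parent links until none is left."""
    current = span
    while current.local_parent is not None:
        current = current.local_parent
    return current
```

Traversal costs a walk proportional to the nesting depth on every lookup and keeps parent spans reachable from their children, whereas storing the root-most span in the Context makes the lookup a single read.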

dmathieu commented 2 months ago

I am using this Go package, which adds a "main span" within the context, to be retrieved at any moment. https://github.com/dmathieu/owe

I'm not using the term "root", because it's not necessarily the root span, but rather "main" because it's intended to be the first span within the service. Hence the included HTTP and gRPC middlewares.

brettmc commented 2 months ago

PHP's implementation is heavily influenced by Java's implementation.

We've taken the approach that when a span is activated, we check if its parent is remote or invalid. If yes, then it's a "local root span" and is stored in context. It can later be retrieved from context, via LocalRootSpan::current(). If there is no active span, then an invalid span is returned from this function.

Implementation: Do we implement a traversal? Do we store the root-most Span directly in the Context?

We have stored it directly in context.

Exposition: do we expose this at the Context/API? In the SDK?

We have exposed and implemented it at the API level, so that instrumentation libraries can use the feature without depending on the SDK.