Open JonasKunz opened 2 months ago
What is the benefit of onEnding()
compared to pass the read/write span to onEnd()
?
Both are sync, both are called at the same time (within the Span.End()
api).
To my understanding both changes are breaking for SpanProcessor
implementers. Either they have to add a method or change the type.
What is the benefit of onEnding() compared to pass the read/write span to onEnd()?
One benefit is that while order of span processor registration matters for onEnding()
, it doesn't today and wouldn't with this change with onEnd()
. Implementers of onEnd()
can continue to rely on semantics where the they get an immutable span which cannot be impacted by other span processors.
To my understanding both changes are breaking for SpanProcessor implementers. Either they have to add a method or change the type.
We're working out the details of that in #4030, but I disagree that adding a new method to an SDK plugin interface is a breaking change. Languages vary in terms of how they expose these plugin interfaces and the language features for evolving interfaces, but there it should be possible for all SDKs to accommodate this type of change, albeit with some creativity in some cases. For example, its simple in java to add a new default method to an interface which existing implementers don't need to implement. This doesn't exist in some language like go, but there you can offer a new ExtendedSpanProcessor
interface which includes the new method, and provide a new mechanism for registering that new span processor interface with TracerProvider
.
@JonasKunz I am curious to know whether you think it is essential for the on-end handler being discussed in this PR to make changes that are visible to other processors. In my opinion, it would be better if the changes made on-end applied to only the exporter associated with the processor making the change.
In order to satisfy the discussion in #4010, I feel that we should not call the object passed to the on-end handler a "read write span". The span object which is newly live when passed to the OnStart handler is a real span ("writable") and it is a span data object ("readable"). I do not think that processors should receive a real span on end (with which they could use real span APIs, for example, e.g., place the span's context back into a context and invoke a new child span).
I think it makes sense for span processors to receive span data objects on-end (i.e., not real spans, but data objects). If at that point a processor wants to modify the span, they can modify the span data object which they pass to their own exporter, but their changes will not impact other processor/exporters.
In my opinion, it would be better if the changes made on-end applied to only the exporter associated with the processor making the change.
I agree that, in the spirit of keeping consistency around the expected behavior of the SDK, it does make sense to follow a similar approach for whether to make processors affect other processors vs only their associated exporter. And I think #4010 has good points on it, to be honest, the existing behavior is quite strange so I think it's great that it's being revisited there.
However, I'm wondering if this PR is the right place to have this discussion, as it seems to affect SpanProcessors in a broader way, given that, should we decide in #4010 to switch to processors with independent pipelines, wouldn't it mean that (for SpanProcessors) we should address OnStart()
as well? At least if we wanted to keep consistency in the overall SDK behavior anyway.
My understanding is that this PR is focusing on adding some extra functionality while keeping consistency with the existing patterns used in the rest of SpanProcessor
, hence the "read write span" and the fact that a processor affects others down in the chain, even if it's not ideal, I think it's important to keep a consistent behavior. Now, this would also mean that, if in the future we decide to go with the changes proposed in #4010, then both OnStart()
and OnEnding()
should adapt to it imo.
I think there are usecases for both variants
I guess something like a processing pipeline would be more flexible to allow putting one processor after the other.
This would avoid that meed to implement a BatchEnrichingProcessor
next to a SimpleEnrichingProcessor
just to get enrichment and batching/non batching. If there are more orthogonal functionalities this can get quite cumbersome.
There are already SpanProcessors which mutate in onStart
without exporting, e.g. here.
@JonasKunz I am curious to know whether you think it is essential for the on-end handler being discussed in this PR to make changes that are visible to other processors. In my opinion, it would be better if the changes made on-end applied to only the exporter associated with the processor making the change.
@jmacd In my opinion, this would remove the benefits this PR is intending to add: Making it easier for users to enrich spans in span processors before they are exported.
There is already a workaround for doing this with the existing onEnd()
callback: Users can decorate/wrap the SimpleSpanProcessor
/ BatchSpanProcessor
and wrap the incoming ReadableSpan
s in order to pass the mutated ones to the span processor invoking the exporter.
We actually implemented this exact mechanism to provide a span processor which collects stacktraces as attributes for long running spans here. The problem with this approach that it requires a big amount of boilerplate code and doesn't play well with SDK autoconfiguration. This PR was an example use-case for why we wanted to drive #1089 forward for providing a simpler, SDK-native approach for enriching spans on end.
So the problem I see with making changes in onEnding
not visible to other users is that it again pushes the responsibility to setup bootstrap pipelines to the user.
My take here is that there are two viable classes of solutions for allowing easy enrichment of spans via the SDK:
onStart
)This PR specs out solution category 1. In this comment I discussed both solution classes and also provided backwards-compatible PoC implementations for Java for both. However, the arguments for 2.) seemed to not be convincing enough (this was also discussed in a spec-SIG Meeting), I went for 1. in this PR.
We can definitely revisit this decision! However, if we do go for a pipelining approach, I'd propose to rather enhance the onEnd
callback for it instead of introducing a new onEnding
callback.
However, if we go for the pipeline approach, I'd propose to separate the construction of processors from the assembly of the pipeline by either:
Function
to onEnd
init
callback which receives the next pipeline stage as a functionThe reason why I'm suggesting is that if we leave the pipeline assembly to the user via decoration/wrapping, this makes the pipeline structure a blackbox for the SDK.
If instead the pipeline assembly is managed by the SDK, it has more control over it: e.g. it could insert logging or telemetry between the pipeline stages if necessary. It also plays much better with autoconfiguration.
However, I'm wondering if this PR is the right place to have this discussion, as it seems to affect SpanProcessors in a broader way, given that, should we decide in #4010 to switch to processors with independent pipelines, wouldn't it mean that (for SpanProcessors) we should address OnStart() as well? At least if we wanted to keep consistency in the overall SDK behavior anyway.
I'm not sure whether onStart
is a good candidate for pipelining. Pipelines make sense if you have some consumer at the end. While onStart
might have such consumers (e.g. creating a metric about started spans), it usually is used to mutate spans before they are visible to the user/application code. I don't see how the second aspect fits nicely into a pipeline model vs just invoking onStart
on all SpanProcessor
s in a loop.
Thank you @JonasKunz. I understand the motivation and agree with this change. Also, you've helped me understand how the change I'm looking for can be made orthogonally
@jmacd so if my understanding is correct, you are planning to add some kind of pipelining / chaining capabilities to OnEnd
(or renamed OnExport
), right?
I wonder if the addition of such pipelines would render the OnEnding
callback proposed here useless and whether we should flesh out the pipelining first before moving ahead with this PR.
The main differences in terms of capabilities between OnEnding
and the pipelining I can think of are the following:
OnEnding
mutates the original span, these changes are therefore visible to anyone holding a reference to that span. I don't know if this has any effect though: The user application might be holding onto API-Spans
, but those are write-only anyway to my understanding. So there wouldn't be any way for the application to actually observe those changes made in OnEnding
OnEnding
is guaranteed to be executed on the application thread ending the span. We might not guarantee this for the pipeline stagesOnEnding
is guaranteed to be invoked for every span (unless there is a bug in the instrumentation with dangling spans of course), this might not be the case for pipeline stages if we allow filtering thereSo what do you think? Should we move ahead with this PR and get it merged or should we wait and check the overlap with the pipeline approach you are coming up with?
@JonasKunz I assumed that you mean for OnEnding()
to receive a "read/write span object" the same as is passed to OnStart()
. I think it means that processors can read (the data object) and write (the span object).
OnEnding mutates the original span, these changes are therefore visible to anyone holding a reference to that span.
The requirements, as written, I think ensure that callers are not permitted to use the span reference after End()
is called, though the processors are still permitted to via direct access from OnEnding()
. There's a bit of conflict with:
It SHOULD be possible to keep a reference to this span object and updates to the span SHOULD be reflected in it.
I'd probably tack on "SHOULD be reflected in it until End() is called on the Span".
For the last two bullets, I don't see a problem. I expect the OnEnd() callbacks all to execute before the export begins (a.k.a. OnEnd()). Nothing is stated about when the OnEnd happens, but after your change it should be clear that pipelining effects (whatever they are) begin with the OnEnd call. I think OnEnding() makes sense the way you have it -- and there are real use cases so we don't need to block for future design work.
Discussed this in the context of #4062
@pellared Would you say that if #4030 is accepted, this PR should be approached differently? (To me, it seems like the answer is yes.) I've speculated about what the solution might look like in https://github.com/open-telemetry/opentelemetry-specification/pull/4062#pullrequestreview-2086466483, which is briefly for SDKs to support a "FanoutProcessor" that gives users control over whether mutations are private to their export pipeline and/or visible to the next processor.
@pellared Would you say that if #4030 is accepted, this PR should be approached differently?
I think this PR is fine and it is a logical extension. We should only highlight that processor does not need to implement the newly added method for backwards compatibility.
Fixes #1089.
In addition to the comments on the issue, this was discussed in the spec SIG Meeting on 2024/23/04:
SpanProcessor
s due to better conceptual fitSpanProcessor
s in a chaining fashion during the initial SDK spec design and it was actively decided against it. However, no one could recall the reason why.Based on this input, I decided to not move the chaining-based solution forward and stay with the original proposal of adding a new callback to be invoked just before a span is ended.
I also decided to name the new callback
OnEnding
instead ofBeforeEnd
as suggested in this comment. The nameOnEnding
is more precise about when the callback is invoked.A big discussion point in the SIG Meeting on 2024/23/04 also was the problem of evolving SDK plugin interfaces without breaking backwards compatibility. This PR contains a proposal to clarify how this should be handled: If the language allows it, interfaces should be directly extended. If not possible, implementations will need to introduce new interfaces and accept them in addition to the existing ones for backwards compatibility. I feel like this allow every language to implement changes to interfaces in the best way possible. Of course, changes to interfaces should still be kept to a necessary minimum.
I also wasn't sure whether this change warrants an addition to the spec-compliance-matrix, so I've left it out for now. Please leave a comment if you think this is required, then I'll add it.
Changes
OnEnding
callback toSpanProcessor
beforeEnd
in that PoC)CHANGELOG.md
file updated for non-trivial changesspec-compliance-matrix.md
updated if necessary