Closed Malandril closed 2 hours ago
/cc @Ladicek (arc), @brunobat (opentelemetry), @manovotn (arc), @mkouba (arc), @radcortez (opentelemetry)
I cannot say what the correct behavior is but ArC is definitely not instrumented with OpenTelemetry tracing; see also the list of instrumented extensions: https://quarkus.io/guides/opentelemetry-tracing#quarkus-extensions-using-opentelemetry. So this is not a bug but more like a feature request.
Well, as the image of the log shows, the trace context is propagated correctly. It's just that it is ended before the asynchronous observer completes, which should be expected -- it's the end of the HTTP request that ends the trace, and there's no coordination facility for asynchronous observers (to be able to wait until all of them have finished). This is a classic context propagation problem -- when you fire and forget an action to which the context is propagated, how does that affect the ending of the context?
Well, as the image of the log shows, the trace context is propagated correctly. It's just that it is ended before the asynchronous observer completes, which should be expected -- it's the end of the HTTP request that ends the trace, and there's no coordination facility for asynchronous observers (to be able to wait until all of them have finished). This is a classic context propagation problem -- when you fire and forget an action to which the context is propagated, how does that affect the ending of the context?
Ladislav is right on the money.
If you need the context state preserved for the entire duration of observer method execution, then using @Observes
instead of @ObservesAsync
should do the trick.
We should not wait for independent async tasks that are spawn from the HTTP request. Not sure what work is being done, but if the task is independent, they must start their own trace and have it linked to the parent HTTP span, not a parent/child relationship. If the task is not independent and is needed for the request, @manovotn suggestion can do the trick. In any case, I suggest closing this issue.
Well, as the image of the log shows, the trace context is propagated correctly.
No, it is not propagated by Quarkus. In the reproducer, the io.opentelemetry.context.Context
is part of the event payload. So it's more like a "manual propagation". But I agree that it's expected that it could be ended before the CDI async observer completes.
BTW I'm not quite sure it is safe to use the io.opentelemetry.context.Context
on a different thread.
No, it is not propagated by Quarkus. In the reproducer, the
io.opentelemetry.context.Context
is part of the event payload.
Ah, OK, I didn't really look into the code :-) And looking at this issue, that seems correct.
Also, agree with Bruno that parent/child relationship is most likely incorrect in this case.
BTW I'm not quite sure it is safe to use the
io.opentelemetry.context.Context
on a different thread.
It should be, based on the javadoc:
A context propagation mechanism which can carry scoped-values across API boundaries and between threads. ... Context objects are immutable ...
Ok thanks for the replies, i think it should be documented then, that the @ObservesAsync does not correctly create Otel context as we encountered this in production a bit too late, as the simple cases where the async treatments where short we had the correct context.
Ok thanks for the replies, i think it should be documented then, that the @ObservesAsync does not correctly create Otel context as we encountered this in production a bit too late, as the simple cases where the async treatments where short we had the correct context.
Well, as I already mentioned it's documented in the OTel docs, more precisely it's not listed in the supported extensions/functionalities. And that's IMO the right place 🤷 .
Also, part of the openTelemetry specification, async tasks out of the main trace must specify links: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/overview.md#links-between-spans
The new Trace and the link to the parent span need to be specified manually in the user code.
Closing this as per the excellent replies already provided
Describe the bug
When using asynchronous CDI events, the opentelemtry contextis not correctly propagated to the async observer. If the request triggering the event finishes before the observer, the observer loses its trace and span.
Expected behavior
The
@ObservesAsync
method should have the same opentelemetry trace as the one who fired the event until the end.Actual behavior
The
@ObservesAsync
method can "lose" the opentelemetry context.How to Reproduce?
Reproducer here: https://github.com/Malandril/quark-otel-cdi
The reproducer demonstrates the loss of the traceId in the logs after the
Thread.sleep(1000)
.Output of
uname -a
orver
No response
Output of
java -version
21
Quarkus version or git rev
3.16.3
Build tool (ie. output of
mvnw --version
orgradlew --version
)No response
Additional information
No response