Closed ppatierno closed 2 years ago
Hey @ppatierno ,
NOTE: The main point to highlight is that the HTTP request span is created in a thread, but the Kafka operation one is created in a different thread
The "OpenTelemetry way" to solve this would be to attach the current context (at the time of the HTTP request processing) to the async task that you're adding to a queue/executor. If you're using an Executor
or ExecutorService
consider wrapping them with Context.taskWrapping()
so that they automatically attach context to submitted tasks; alternatively, you can wrap the tasks (Runnable
, Callable
, etc) themselves using Context.wrap()
.
With OpenTracing/Jeager client everything works fine because it injects an
uber-trace-id
in the Kafka record headers which is used in the producer interceptor thread (using just the default TracingProducerInterceptor class) to get the span context of the HTTP request and make the new span as child of it (more details here).
The OpenTelemetry Context is much more than just a traceId&spanId pair (e.g. you can put baggage there, or completely custom things that you want to propagate down), and we strongly recommend propagating the entire context within a single application.
Thanks for the explanation @mateuszrzeszutek !
The problem I see is when you don't have access to such an executor or thread because it's inside the library you are using. Still making the point with Kafka client ...
When on a KafkaProducer
you call the send
method, what it does is actually putting the message into an internal buffer and then another internal thread in the library will take care of it for sending. This is the thread which executes the producer interceptor.
So I have an HTTP span in the thread I have control and I call the KafkaProducer.send
, but the Kafka tracing will happen in the interceptor (on a different thread) and context will be different without a way for me to "inject" the right context because a "reference" to that thread is not available somehow.
It's for Kafka but could be for any other library? Is there anything I am missing?
Hmm, KafkaProducer
calls the interceptor as the first thing when send()
is called -- and our interceptor creates the producer span onSend()
. Are you sure you're not using some kind of async wrapper over it?
Hey @mateuszrzeszutek I actually wrote this right after my previous comment but forgot to click on "Comment" button! LOL
Yeah it's in the same thread, I got confused by the underneath network thread for reading batch and sending messages.
So the problem is the Vert.x wrapper usage around it.
In this case we have a workaround about injecting an ExecutorService
wrapped with the right context.
Even if I will investigate why at least the interceptor part in Vert.x is not done in the right thread but IIRC it just wraps around the native send in an executeBlocking
so using a separate worker thread :-(
Anyway I can close this because I have got what I needed, thanks!
Ah, I see now. Yeah, you'll need to somehow pass the context along with the record; or implement context propagation in vertx. We don't have a library instrumentation for that, but I see that vertx has a Context
concept itself - perhaps it is possible pass the OTel context using it.
My feeling is that we should instrument our application by using the "official" vertx-tracing support https://github.com/eclipse-vertx/vertx-tracing I will investigate but not sure how much it's completed.
Vert.x seems to have a TP support for tracing here https://github.com/eclipse-vertx/vertx-tracing Of course refactoring everything is not feasible on our side and the feature is TP but we could try to use the VertxContextStorageProvider instead of the default one which is based on thread local.
This Vert.x storage seems to be able to correlate Vert.x context with OpenTelemetry context and that could help. Right now I am not able to set this storage yet :-(
Try setting the system property io.opentelemetry.context.contextStorageProvider
to the full name of the VertxContextStorageProvider
class somewhere right at the start of your application, before you first use/load OpenTelemetry.
@mateuszrzeszutek actually just adding the dependency to the vertx-opentelemetry
library seems to work with a kind of auto-discovery. Yesterday, I didn't notice that that dependency was already there but with test
scope so not available at runtime.
Now it works out of box. Does it sound right to you? Or we should set the system property above anyway?
Or we should set the system property above anyway?
Looking at the code, you only need to setup sys prop if you have multiple providers on the classpath -- and you need to pick an explicit one. We don't have that (for now), so I wouldn't bother -- it will show if we need it, as it will fail at startup. :-)
Looking at the code, you only need to setup sys prop if you have multiple providers on the classpath -- and you need to pick an explicit one.
Yeah, that's correct, I forgot about that.
Does it sound right to you? Or we should set the system property above anyway?
That look alright, you don't have to set the property after all.
Glad that it's working for you!
I was comparing how OpenTracing (with Jaeger client) and OpenTelemetry work when it comes to share context across threads in the same application.
We are currently using OpenTracing in our Strimzi HTTP - Kafka bridge, where we create a span when an HTTP request comes in for sending a message to Kafka and then leaving the Kafka instrumented library to create the corresponding Kafka operation span which is set as child of the HTTP request span. NOTE: The main point to highlight is that the HTTP request span is created in a thread, but the Kafka operation one is created in a different thread (because it's done through the Kafka producer interceptor mechanism).
With OpenTracing/Jeager client everything works fine because it injects an
uber-trace-id
in the Kafka record headers which is used in the producer interceptor thread (using just the default TracingProducerInterceptor class) to get the span context of the HTTP request and make the new span as child of it (more details here).When moving to OpenTelemetry, we noticed that the default
TracingProducerInterceptor
doesn't work out of the box. It gets the current context (by using Context.current()) but as per my understanding the context is handle by aThreadLocal
based storage so what is set as context (in the thread where the HTTP request span is created) is not available in the thread executing the producer interceptor code.The only way we found until now was creating a custom
ContextAwareTracingProducerInterceptor
(inheriting fromTracingProducerInterceptor
) and getting the HTTP request span somehow by making it as part of the current context and then invoking the base class. Something like this ...I was wondering if this is the only "hacky" way to do such a thing (context across threads) with OpenTelemetry and why it isn't implemented as OpenTracing, or maybe there is something obvious we are missing.
The question could be more general: how to have context propagated across threads in the same application?