opentracing-contrib / java-kafka-client

OpenTracing Instrumentation for Apache Kafka Client
Apache License 2.0
125 stars 64 forks source link

Discrepancy span duration between interceptor and wrapper mode #57

Open ppatierno opened 5 years ago

ppatierno commented 5 years ago

Hi, I found a discrepancy in when a span is finished using the interceptor approach versus the wrapper one. As far as I understood, the duration time of a span is the time going from span.start to span.finish (defining the duration of the unit of work), so this discrepancy could give different durations. For example, let's consider the producer side with the interceptor.

https://github.com/opentracing-contrib/java-kafka-client/blob/master/opentracing-kafka-client/src/main/java/io/opentracing/contrib/kafka/TracingProducerInterceptor.java#L26

From the line above you can see that when the onSend is called, the span is started and finished without doing nothing Kafka related (if you dig into the buildAndInjectSpan method). The message is not sent yet, it happens after the onSend return the producerRecord (see [1] kafka producer source code). I would expect that this span duration is quite close to 0 every time.

Let's take a look at how the wrapper way works. You can see here the producer wrapper where the span is started.

https://github.com/opentracing-contrib/java-kafka-client/blob/master/opentracing-kafka-client/src/main/java/io/opentracing/contrib/kafka/TracingKafkaProducer.java#L117

After that, the span is passed to a TracingCallback instance and the actual send method is called. When the span is finished then? It happens in the callback code here.

https://github.com/opentracing-contrib/java-kafka-client/blob/master/opentracing-kafka-client/src/main/java/io/opentracing/contrib/kafka/TracingCallback.java#L38

As you know the callback is called when the message is acked in the way we set (acks=0, 1, all). It sounds to me that if setting 0, the span duration could be close to 0 ms as it happens for the interceptor approach, using 1 or worst "all" could mean a higher duration because of the latency and the real ack from the broker (or even followers involved for "all").

My question is why they didn't finish the span in the onAcknowledgement method in the interceptor approach, so here?

https://github.com/opentracing-contrib/java-kafka-client/blob/master/opentracing-kafka-client/src/main/java/io/opentracing/contrib/kafka/TracingProducerInterceptor.java#L31

[1] https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L853

IndeedSi commented 5 years ago

I had the same question on why not use onAcknowledgement. But after I tried in my local, I found that it's not easy to pass span from onSend to onAcknowledgement.

onAcknowledgement This method will generally execute in the background I/O thread, so the implementation should be reasonably fast. Otherwise, sending of messages from other threads could be delayed.`

As they are called in different threads, so we cannot use the thread local to store the span. And the InterceptorCallBack doesn't have a way to pass the span in, interceptors intances are shared between messages. @ppatierno I'm pretty new to opentracing. Could you point out a way in this case? Thanks