Closed edwardsoo-ss closed 7 years ago
trace example https://tracingd.smartthingstlm.com/traces/fba876dc8882fa32 producer usage: https://git.vandevlab.com/iot/bouncer/merge_requests/72 consumer usage: https://git.vandevlab.com/iot/pusher/merge_requests/61
still has some problems:
@Bijnagte please have a look
Over all this looks great. The whole config map thing leave a bit to be desired but I don't see a way around that since its the way Kafka interceptor configuration works.
I'm a bit concerned that there isn't a way to pass in the current trace context into the ProducerInterceptor. If that isn't possible the whole interceptor pattern is going to be a non starter since that is the only way to create a new child span rather than starting a whole new trace.
I think it is nice to have this more generic approach. How will we migrate? I would think that we need to actually create a second topic in order to do this as there is not a good way to sniff the message to know when the cutover happened. I will comment on the actual usage PRs as well
@llinder updated producer interceptor to create span as child of current span managed by the kristofa brave lib, producer interceptor code runs in the same context so that wasn't a problem. However for some reason sometimes I will see 2 "sr" annotations e.g. https://tracingd.smartthingstlm.com/traces/437f35c7a4468fce I have tried using "ws" & "wr" instead but it still occurs (2 "wr").
I cannot pass the consumer received span to the server tracer because kafka client receives in batch, I think the consumer service may just have to know the envelope schema and do that in a non-transparent way.
@Bijnagte agreed, I will put some thoughts into how we can migrate (maybe another topic like you said)
bouncer -> kafka -> pusher -> paperboy & dove (no server spans): https://tracingd.smartthingstlm.com/traces/b55a0e715a1feea5
The kafka consumer code to join the trace from the kafka message envelope has become quite gross because of the batch receive issue. See https://git.vandevlab.com/iot/pusher/merge_requests/61
latest example: https://tracingd.smartthingstlm.com/traces/06636f5e85cf4a44
the consumer now returns TracedConsumerRecord
which extends ConsumerRecord
. consumer code can optionally check for tracing context and propagate them for spans caused by the kafka "wr" span.
tracing context injection/extraction in interceptors are now abstract, an implementation using a protobuf envelope message is provided as an example/default. "ws" & "wr" are still annotated and flushed in the interceptors.
Looking good!
Funny that the SQL query of all things has some rendering issues. I think I've seen issues like this running locally before so maybe its nothing.
Probably should be using "cs" and "sr" annotations though. This is the official way that Zipkin handles one way async spans like this so it has proper tests and is less likely to break the UI.
Looks like we have something amiss in the last sample trace.
The Server Receive/Send shouldn't be included with the Client Send/Receive. Also the timestamp seems to be absent on the SR/SS. This could be related to using the WS/WR annotations where the SR isn't correctly initialized in Pusher and the serverTracer.setServerSent()
is never called.
@llinder you are right about "ws" "wr" causing issues. I switched back to "cs" "ss": https://tracingd.smartthingstlm.com/traces/550db9e6f6f26fd3
example of how traces look if we treat kafka as just a transport (i.e. not naming any endpoint/service as "kafka"): https://tracingd.smartthingstlm.com/traces/c004e3574a476446
looks like the sql query time is still a bit wonky sometimes, even with "cs" and "sr": https://tracingd.smartthingstlm.com/traces/77b257bccebda8e2
I also added some tests.
We might want to consider additional annotations to help searching but those can come later once we learn from real world use.
Great job on this by the way! Really looking forward to having this in production. Feel free to merge when you feel your ready.
thanks! :smile:
This PR is just a place to get some comment, not meant to be final @llinder take a look if you have some time please