Closed: IGx89 closed this 1 day ago
Tagging component owner(s).
@g7ed6e
Hi @IGx89, this is by design according to the OTel messaging specs, which is why we introduced process span support too. Please have a look at the discussions in the original PR and check the PR below, which introduced process span support: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/pull/1937
Thanks for the fast reply! For anyone else reading this, they have a ConsumeAndProcessMessageAsync IConsumer extension method that you can call in place of consumer.Consume, which starts the activity, executes a callback in which you process the message, and then stops the activity.
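For readers who want to see the shape of that start/callback/stop pattern, here is a minimal sketch using only System.Diagnostics. It is not the library's ConsumeAndProcessMessageAsync implementation or signature: the ProcessSpanSketch class, the ConsumeAndProcessAsync helper, the delegate-based consume parameter, and the "Sketch.Kafka.Process" source name are all hypothetical and chosen for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper, NOT the OpenTelemetry.Instrumentation.ConfluentKafka API.
// It only illustrates "start activity, run callback, stop activity".
public static class ProcessSpanSketch
{
    // Assumed source name; a real setup would register it with the tracer provider.
    private static readonly ActivitySource Source = new ActivitySource("Sketch.Kafka.Process");

    public static async Task<T> ConsumeAndProcessAsync<T>(Func<T> consume, Func<T, Task> process)
    {
        // Stand-in for consumer.Consume(...).
        T message = consume();

        // StartActivity returns null when nothing is listening, so 'using' is null-safe here.
        using var activity = Source.StartActivity("process", ActivityKind.Consumer);

        // While the callback runs, Activity.Current is the process activity,
        // so logs, HTTP calls, SQL, etc. made inside it are correlated to the message.
        await process(message);

        // Leaving this scope disposes (stops) the activity.
        return message;
    }
}
```

The trade-off is the one raised in this thread: the processing code has to be handed over as a callback, so the consume loop itself changes.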
I'm not sure if that'll work for us since it requires OpenTelemetry-specific modifications of our business logic, but being able to see the proper way to handle this per OpenTelemetry spec is very helpful.
Since almost everyone using this instrumentation will probably want to use this extension method, it might be beneficial to document it in your README. It is hard for people to find otherwise (I spent half an hour reading through issues and the code before creating this issue and still didn't find it).
Component
OpenTelemetry.Instrumentation.ConfluentKafka
Is your feature request related to a problem?
After I call consumer.Consume, Activity.Current is null and so any logs, HTTP requests, SQL updates, etc. I make while processing the message are not tied to the message that triggered them. That prevents me from easily pulling up a trace of a message and seeing if it was successfully processed or not by consumers.
What is the expected behavior?
The current implementation of this component both starts and stops the Activity inside the consumer.Consume method call, preventing any business logic that processes the message (logs, Redis, SQL, HTTP, etc.) from being correlated to that message. That seems to go against the goals of distributed tracing, ending the trace in the middle of the work.
Datadog's tracer achieves that correlation by leaving the activity open after Consume is called and closing it the next time Consume is called, which in a typical consume loop is immediately after the message is processed. You can look at their code here: https://github.com/DataDog/dd-trace-dotnet/blob/0070285865b391ac1db44682aa24ead9c903dad1/tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/Kafka/KafkaConsumerConsumeIntegration.cs#L66
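To make the "leave it open until the next Consume" idea concrete, here is a loose sketch written against System.Diagnostics. It is not Datadog's code and not something this component offers: the OpenUntilNextConsume class, its delegate-based Consume method, and the "Sketch.Kafka.Consume" source name are hypothetical. It also assumes a synchronous consume call on a single thread, since Activity.Current is async-local and would not flow back out of an async method.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper illustrating the idea only; names are assumptions.
public sealed class OpenUntilNextConsume : IDisposable
{
    // Assumed source name for the sketch.
    private static readonly ActivitySource Source = new ActivitySource("Sketch.Kafka.Consume");

    // Activity left open for the message that is currently being processed.
    private Activity _open;

    public T Consume<T>(Func<T> consume)
    {
        // Close the activity of the previous message, if any.
        _open?.Dispose();
        _open = null;

        // Stand-in for consumer.Consume(...).
        T message = consume();

        // Start a new activity and leave it open: it stays Activity.Current
        // for the caller until the next Consume or Dispose.
        _open = Source.StartActivity("consume", ActivityKind.Consumer);
        return message;
    }

    public void Dispose()
    {
        // Close the last open activity when the loop ends.
        _open?.Dispose();
        _open = null;
    }
}
```

In a typical loop the caller would invoke Consume, process the message, and loop; the activity for each message then ends at the start of the next iteration, which also means its duration includes any idle time before that next call.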
Which alternative solutions or features have you considered?
I found https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/#consumer-spans which appears to document how things should work here. This section here seems to suggest an alternative solution:
It's not clear how one would do that here though, without effectively re-implementing all the logic of this component (parsing headers, adding tags, etc.). If I'm going to do all that I may as well not use this component at all.
Additional context
No response