Open lmolkova opened 7 months ago
I think kafka instrumentation should record cluster-id as a server.address or node host if it can retrieve them. Will bring it up in messaging semconv to decide which one.
The description of server.address in https://opentelemetry.io/docs/specs/semconv/attributes-registry/server/ reads "Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name."
As far as I can tell, cluster id does not fit that description.
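To illustrate what "without reverse DNS lookup" can look like on the JVM, here is a minimal sketch using only the standard library. The class name ServerAddressSketch and the broker host names are made up for illustration, and this is not how the Kafka instrumentation currently works. It relies on InetSocketAddress.getHostString(), which returns the host name the address was created with, or the IP literal, without attempting a reverse lookup.

```java
import java.net.InetSocketAddress;

public class ServerAddressSketch {
    /**
     * Returns the host name the address was created with, or the textual IP
     * when no host name is known. getHostString() never does a reverse lookup.
     */
    public static String serverAddress(InetSocketAddress address) {
        return address.getHostString();
    }

    public static void main(String[] args) {
        // Hypothetical broker addresses, used only for illustration.
        InetSocketAddress byName =
            InetSocketAddress.createUnresolved("broker-1.example.com", 9092);
        InetSocketAddress byIp = new InetSocketAddress("10.0.0.5", 9092);

        System.out.println(serverAddress(byName));
        System.out.println(serverAddress(byIp));
    }
}
```

A cluster id is an opaque identifier rather than a host name or IP, so it would not come out of a helper like this.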
Kafka instrumentation should do the best effort collecting server.* attributes and if it's not possible should collect some network.* attributes.
Unfortunately, at least to me, it seems that neither of these is easily collectible. For the javaagent it is probably possible to get access to the network peer with some extra instrumentation, but for library instrumentation this may very well be unattainable.
Thanks @laurit !
This is great feedback for the messaging SIG. I agree that it's not trivial. A possible approach could be to let users opt in to getting some info via a describe cluster API call or a DNS lookup (if we manage to get at least an IP).
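A rough sketch of what such an opt-in could look like, using only the standard library. Everything here is hypothetical: the class OptInClusterInfo, the enabled flag, and the Supplier that stands in for the real describe cluster call (or DNS lookup) are assumptions for illustration, not existing instrumentation APIs. The idea is that the lookup never runs unless the user opted in, runs at most once, and a failure degrades to "no value" instead of breaking the traced operation.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class OptInClusterInfo {
    private final boolean enabled;
    // Stand-in for the real lookup (e.g. a describe cluster call); hypothetical.
    private final Supplier<String> lookup;
    private boolean resolved;
    private String clusterId;

    public OptInClusterInfo(boolean enabled, Supplier<String> lookup) {
        this.enabled = enabled;
        this.lookup = lookup;
    }

    /** Returns the cluster id if the user opted in and the lookup succeeded. */
    public synchronized Optional<String> clusterId() {
        if (!enabled) {
            return Optional.empty(); // never call the lookup without opt-in
        }
        if (!resolved) {
            try {
                clusterId = lookup.get();
            } catch (RuntimeException e) {
                clusterId = null; // best effort: a failed lookup yields no attribute
            }
            resolved = true; // cache the outcome so the lookup runs at most once
        }
        return Optional.ofNullable(clusterId);
    }
}
```

Caching the result keeps the extra call off the hot path of publish/consume operations, which seems important if this is ever enabled in production.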
Hi, @lmolkova . How is the progress going?
Hi @huange7! We'd welcome a PR for this issue if it's something you're interested in working on.
Hi, @trask ! I am interested in this issue, and I will attempt to work on it. Please assign to me.
Is your feature request related to a problem? Please describe.
Existing Kafka instrumentation does not capture any server/network details.
So in case I have multiple kafka clusters in my application, I cannot differentiate between them. I also don't know which node an operation was performed against.
I think kafka instrumentation should record cluster-id as a server.address or node host if it can retrieve them. Will bring it up in messaging semconv to decide which one.
Describe the solution you'd like
Kafka instrumentation should do the best effort collecting server.* attributes and if it's not possible should collect some network.* attributes.
Publish/consume spans should (as they already are) be created on the public api surface and cover duration of logical operation (with all retries).
In case multiple server.addresses (or network.peer.addresses) are available (e.g. tried one node and fell back to a different one), we want to report only the last node contacted on the logical operation.
Network-level kafka instrumentation is not covered by messaging semconv and there is no guidance on how/if to instrument it at this point.
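The rule described above (prefer server.* attributes, fall back to network.* attributes, and report only the last node contacted) could be sketched roughly as follows. The LastNodeAttributes class and its Attempt type are hypothetical and only model the idea; they are not part of the existing instrumentation. The attribute keys are the semconv names server.address, server.port, network.peer.address and network.peer.port.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LastNodeAttributes {
    /** Hypothetical record of one attempt; host is null when only an IP is known. */
    public static final class Attempt {
        final String host;
        final String ip;
        final int port;

        public Attempt(String host, String ip, int port) {
            this.host = host;
            this.ip = ip;
            this.port = port;
        }
    }

    /** Builds span attributes from the last attempt of one logical operation. */
    public static Map<String, Object> attributesFor(List<Attempt> attempts) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        if (attempts.isEmpty()) {
            return attrs; // nothing was contacted, so nothing is reported
        }
        // Earlier attempts (retries, fallbacks) are intentionally ignored.
        Attempt last = attempts.get(attempts.size() - 1);
        if (last.host != null) {
            attrs.put("server.address", last.host);
            attrs.put("server.port", last.port);
        } else if (last.ip != null) {
            // Best-effort fallback when no host name is available.
            attrs.put("network.peer.address", last.ip);
            attrs.put("network.peer.port", last.port);
        }
        return attrs;
    }
}
```

With attempts against broker-1 and then broker-2, only broker-2 would end up on the publish/consume span, which matches the "last node contacted" wording.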
Describe alternatives you've considered
No response
Additional context
No response