open-telemetry / oteps

OpenTelemetry Enhancement Proposals
https://opentelemetry.io
Apache License 2.0

Automatic propagation of peer.service #247

Open carlosalberto opened 6 months ago

carlosalberto commented 6 months ago

Knowing the service name on the other side of a remote call is valuable troubleshooting information. The semantic conventions represent this via peer.service, which needs to be manually populated.

This information can be effectively derived in the backend using the Resource of the parent Span, but is otherwise not available at Collector processing time, where it could be used for sampling and transformation purposes.

Defining (optional) automated population of peer.service will greatly help adoption of this attribute by users and vendors explicitly interested in this scenario.

Based on https://github.com/open-telemetry/semantic-conventions/issues/439

jmacd commented 5 months ago

Related work, cc @bogdandrutu @kalyanaj https://github.com/w3c/trace-context/issues/550

carlosalberto commented 5 months ago

Hey @yurishkuro

Added sampling scenarios that may shed light on how useful this feature could be. Please review.

jmacd commented 4 months ago

@carlosalberto I think we should try to build in more protection against accidental propagation of the peer service information. Also, I'm afraid "upstream" can lead to confusion, although unless we do something to avoid accidental propagation, it's literal and true: the upstream service name would be the nearest ancestor context that happened to set it.

I want us to consider a mechanism that helps us scope tracestate variables to limit their impact in the future. I'm thinking of an entirely new tracestate vendor code for information (e.g., "to" for "Transient OpenTelemetry") that would purposefully terminate after a propagation event. (I think of this approach as complementary to the idea in https://github.com/open-telemetry/oteps/pull/207, which is for scoping state until the following propagation event.)

That said, this seems like an opportunity to allow peers to exchange more than only their service name. If we had a tracestate field for exchange of arbitrary peer-related variables, then the new SDK configuration knobs would be:

Then, a tracestate could be formed to exchange arbitrary attributes, like:

tracestate: to=peer.service:myservice;some.property:somevalue;etc:etc

Receivers would apply these variables to the context and drop them from the tracestate before creating a new context.
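
To make the send/receive behavior concrete, here is a minimal sketch in plain Python, not an existing SDK API. It assumes the "to" key and the `name:value;name:value` encoding shown in the example above; the function names (`inject`, `extract`, and the helpers) are hypothetical, and it assumes attribute names and values contain none of `,`, `=`, `;`, or `:` (no escaping is attempted).

```python
from typing import Dict, List, Tuple

# Hypothetical vendor key taken from the example above ("Transient OpenTelemetry").
TRANSIENT_KEY = "to"


def parse_tracestate(header: str) -> List[Tuple[str, str]]:
    """Split a tracestate header into ordered (key, value) pairs."""
    entries = []
    for member in header.split(","):
        member = member.strip()
        if not member:
            continue
        key, sep, value = member.partition("=")
        if sep:  # skip malformed members that have no "="
            entries.append((key, value))
    return entries


def format_tracestate(entries: List[Tuple[str, str]]) -> str:
    """Join (key, value) pairs back into a tracestate header value."""
    return ",".join(f"{k}={v}" for k, v in entries)


def inject(header: str, attrs: Dict[str, str]) -> str:
    """Sender side: put the peer attributes in a leading "to" entry."""
    # Drop any stale transient entry so it cannot leak past this hop.
    entries = [(k, v) for k, v in parse_tracestate(header) if k != TRANSIENT_KEY]
    encoded = ";".join(f"{name}:{value}" for name, value in attrs.items())
    entries.insert(0, (TRANSIENT_KEY, encoded))
    return format_tracestate(entries)


def extract(header: str) -> Tuple[Dict[str, str], str]:
    """Receiver side: return the peer attributes and the tracestate
    with the transient entry removed, ready for the new context."""
    attrs: Dict[str, str] = {}
    remaining = []
    for key, value in parse_tracestate(header):
        if key == TRANSIENT_KEY:
            for item in value.split(";"):
                name, sep, val = item.partition(":")
                if sep and name:
                    attrs[name] = val
        else:
            remaining.append((key, value))
    return attrs, format_tracestate(remaining)
```

A sender would call `inject(existing_tracestate, {"peer.service": "myservice"})` before the outgoing request; the receiver would call `extract` on the incoming header, apply the returned attributes to its context, and propagate only the returned tracestate, so the peer information ends after one hop.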