Propose Context-scoped attributes.

Oberon00 commented 2 years ago

Add Context-scoped telemetry attributes which typically apply to all signals associated with a trace as it crosses a single service.

This OTEP aims to address various related demands that have been brought up in the past, where the scope of resource attributes is too broad, but the scope of span attributes is too narrow. For example, this happens where there is a mismatch between the OpenTelemetry SDK’s (and thus TracerProvider’s, MeterProvider’s) process-wide initialization and the semantic scope of a (sub)service.

The context-scoped attributes allows you to attach attributes to all telemetry signals emitted within a Context (or, following the usual rules of Context values, any child context thereof unless overridden). Context-scoped attributes are normal attributes, which means you can use strings, integers, floating point numbers, booleans or arrays thereof, just like for span or resource attributes. Context-scoped attributes are associated with all telemetry signals emitted while the Context containing the Context-scoped attributes is active and are available to telemetry exporters. For spans, the context within which the span is started applies. Like other telemetry APIs, Context-scoped attributes are write-only for applications. You cannot query the currently set Context-scoped attributes, they are only available on the SDK level (e.g. to telemetry exporters, Samplers and SpanProcessors).

Context-scoped attributes should be thought of equivalent to adding the attribute directly to each single telemetry item it applies to.

Rendered: https://github.com/dynatrace-oss-contrib/oteps/blob/feature/context-scoped-attributes/text/0207-context-scoped-attributes.md Note: A table of contents can be found under the icon at the top left corner, if you go to the rendered view.

Oberon00 commented 2 years ago

(Note that the check-errors step seems stuck at "Waiting for status to be reported")

Oberon00 commented 2 years ago

There are many possible design variations on these attributes. What you suggest as workaround with "setting the attributes on the span and then on the Context" could maybe be avoided with Alternative: Associating Context-scoped attributes with the span on end and setting attributes on parent Contexts, in addition to generally using the span as primary/only storage as you say (I think this is Alternative: Associating attributes indirectly through spans). Then you would just set the attributes on the Context and activate it, and they would be associated with the span on end. Less work, but also less intuitive and the attributes are not available to samplers / OnStart callbacks.

Oberon00 commented 2 years ago

Mentioned briefly in spec meeting today. @carlosalberto suggests to implement a prototype in some language.

I think for spans this should even be doable without changing core API/SDK code, just by using the SpanProcessor's OnStart callback.

fbogsany commented 2 years ago

In Ruby, we have implemented a similar mechanism using Context to propagate attributes to HTTP client spans. We've also implemented this for Redis client spans.

An example use from the Koala instrumentation, which instruments a Facebook API client, is:

          def graph_call(path, args = {}, verb = 'get', options = {}, &post_processing)
            OpenTelemetry::Common::HTTP::ClientContext.with_attributes('peer.service' => 'facebook', 'koala.verb' => verb, 'koala.path' => path) do
              super
            end
          end

Faraday is a HTTP client library. It's instrumentation merges in the attributes from the HTTP::ClientContext with:

          def span_creation_attributes(http_method:, url:)
            instrumentation_attrs = {
              'http.method' => http_method,
              'http.url' => url.to_s,
              'net.peer.name' => url.host
            }
            config = Faraday::Instrumentation.instance.config
            instrumentation_attrs['peer.service'] = config[:peer_service] if config[:peer_service]
            instrumentation_attrs.merge!(
              OpenTelemetry::Common::HTTP::ClientContext.attributes
            )
          end

Honeycomb has also implemented a similar mechanism in their "hello world" examples, called CarryOn

  module CarryOn
    extend self

    # the key under which attributes will be stored for CarryOn within a context
    CONTEXT_KEY = ::OpenTelemetry::Context.create_key('carry-on-attrs')
    private_constant :CONTEXT_KEY

    # retrieve the CarryOn attributes from a given or current context
    def attributes(context = nil)
      context ||= ::OpenTelemetry::Context.current
      context.value(CONTEXT_KEY) || {}
    end

    # return a new Context with the attributes given set within CarryOn
    def with_attributes(attributes_hash)
      attributes_hash = attributes.merge(attributes_hash)
      ::OpenTelemetry::Context.with_value(CONTEXT_KEY, attributes_hash) { |c, h| yield h, c }
    end
  end

  # The CarryOnSpanProcessor reads attributes stored under a custom CarryOn key
  # in the span's parent context and adds them to the span.
  #
  # Add this span processor to the pipeline and then keys and values
  # added to CarryOn will appear on subsequent child spans for a trace only
  # within this service. CarryOn attributes do NOT propagate outside this process.
  class CarryOnSpanProcessor < OpenTelemetry::SDK::Trace::SpanProcessor
    def on_start(span, parent_context)
        span.add_attributes(CarryOn.attributes(parent_context))
    end
  end

Oberon00 commented 2 years ago

Yes, CarryOn looks exactly like what this proposes (though for Spans only). Does that satisfy the prototype condition, @carlosalberto?

Oberon00 commented 2 years ago

(close & reopen because CI still seems stuck)

garthk commented 2 years ago

G'day! I suspect the name Scoped Resources might better match your intent. The word "context" strongly suggests the execution path to Go programmers, and "attributes" suggests to me a shorter lifespan than you've proposed for these key-value pairs eg. the name of a sub-service for the lifetime of the process.

Separately, I can't see how this proposal satisfies the needs for which I proposed a BeforeEnd callback, eg. adding a span attribute for the CPU work done by the thread during the span, or recording ambient but changing information about the system eg. RAM usage. Perhaps I'm missing something. Could you clarify how your proposal would satisfy those use cases?

garthk commented 2 years ago

I had a nagging thought I'd somehow mis-read the spec, came back, and confirmed it. I think I got confused by the scope attributes comparison, but that's on me; I have no suggested edits for clarity. I've struck out my first paragraph above. Do set me straight if I've similarly blundered on the second.

Under OTLP protocol changes, I'd prefer not to treat these attributes as resources. That might seem appropriate for the http.app example but I'm more tempted to use this feature for enduser.id et al.

Oberon00 commented 2 years ago

eg. adding a span attribute for the CPU work done by the thread during the span, or recording ambient but changing information about the system eg. RAM usage. Perhaps I'm missing something. Could you clarify how your proposal would satisfy those use cases?

It doesn't satisfy these, the BeforeEnd callback would still be useful for these. But it solves problems closely related to the original problem the BeforeEnd callback was proposed for, namely attaching attributes to (nearly) all spans of a trace. I also doesn't do that 100% in the way the original author of https://github.com/open-telemetry/opentelemetry-specification/issues/1089 proposed (solutions closer to that are discussed in the alternatives section of the OTEP though).

Oberon00 commented 2 years ago

Under OTLP protocol changes, I'd prefer not to treat these attributes as resources.

Right, that doesn't make sense. You probably read one of the alternatives, because the main text suggests to implement the attributes such that the Tracers/Meters/... would just attach the attributes directly to the Spans/Metrics as normal attributes, so that exporters won't even be able to tell the difference, so there is no protocol change required. Merging context-scoped attributes with resources is only mentioned as alternative implementation to an optional addition to an alternative 😃

garthk commented 2 years ago

Indeed I did, and I endorse your conclusion there. Good luck with your OTEP!

jmacd commented 2 years ago

For the record, Lightstep internally uses a span+metrics instrumentation library that predates OpenTelemetry. In that library we have a Scope construct that is not quite like any of the ones you described, but substantially similar. Briefly described:

A primitive method exists to start a new Span/Scope. The initial Scope variables are configured as attributes of a new Span. During the span's operation, any metric operation automatically inherits the attributes of the scope.

Like the proposal here, these attributes are independent of baggage context, which propagate unconditionally. We did not automatically translate scope attributes to internal, non-service-boundary-crossing span contexts, although a study of the internal code suggests that most of the time, this would be desirable.

There is an Scope.Add() operation to append new scope attributes--new scope attributes are added to the span when it ends, and they impact future metric operations in the current scope. This is somewhat like Associating Context-scoped attributes with the span on end and setting attributes on parent Contexts.

Oberon00 commented 2 years ago

@bogdandrutu

Baggage is both readable and writable for the application, while execution-scoped attributes, like all other telemetry attributes, are write-only.

Clearly baggage is a super set of this proposal.

That's the "Rectangle extends Square"-kind of flawed superset. I believe, we intentionally designed Span as write-only for the application to not make applications rely on data added to the span. The same issue exists with Baggage vs. Context-scoped attributes.

Baggage only supports string values.

Do we really care about other values?

I do, and I believe many others will. Besides, having limited attribute types would simply be inconsistent, and may cause issues e.g. with defining semantic conventions.

Baggage is not available to telemetry exporters (although e.g., a SpanProcessor could be used to change that), while that's the whole purpose of execution-scoped attributes.

This is a long time request, and can be added.

I guess it can. I would say, defining the precise association (the attributes at which point counts for a Span for example) is maybe the main issue of this proposal. If we define such an association between telemetry items and baggage, and also allow non-propagation and access from span processors, and we add additional metadata to define which of the baggage attributes should be copied to telemetry item's attributes (as we don't want to send potentially sensitive application data to telemetry backends), and if we can live with leaking telemetry data back to applications, then we don't need the additional Context-scoped-attributes concept.

All in all, it seems kind of forced to me to burden the baggage spec with this additional use case. In total it may be more complicated than adding this new context-scoped attributes concept.

yurishkuro commented 2 years ago

I agree that baggage is not the solution to the problem presented, because baggage is propagated between processes, while context stays within a process.

But I don't like the proposal for a different reason. Context already supports typed attributes. This PR adds a mechanism to elevate some of them to be force-included in the telemetry. I'd argue that it's not the instrumentation's job to make such decision, but a job for configuration mechanism. The same mechanism can select baggage attributes to affect telemetry, if desired.

Oberon00 commented 2 years ago

@yurishkuro

This PR adds a mechanism to elevate some of them to be force-included in the telemetry. I'd argue that it's not the instrumentation's job to make such decision

Interesting, I don't see this PR like that at all. In my view, this PR adds a mechanism for instrumentations to add telemetry attributes that are scoped not to the whole application (like resources) an instrumentation scope, or a single item, but to a context. The use case I used in the diagram is a very concrete one for example, that I think could be included in a Java web server / servlet instrumentation. Making something configurable is always possible (e.g. also for span attributes) but might not always make sense.

Context already supports typed attributes. This PR adds a mechanism to elevate some of them to be force-included in the telemetry.

Maybe I should add a section to distinguish this from Context as well. Some points that I can come up with right now:

Context keys are not strings but unique objects returned by CreateKey
Context values are not attributes but arbitrary objects
Contexts are not enumerable
Contexts entries are meant to be set & interpreted in-process for particular use cases depending on the entry. They are not "key-value pairs"/attributes (see also previous point)
OpenTelemetry recommends language library authors use existing Context concepts instead of creating an otel-specific one. So even if you could enumerate the context, there could be arbitrary not telemetry related stuff there.

So what we could add would be a an API to enlist certain context keys as attributes to be set on telemetry items (if available). This would nearly work as a behind-the-scenes change to the currently proposed API, except that you would need to pre-allocate the keys with a new API, and use the key objects in the AddContextScopedAttributes call. That seems like a more awkward API to support fewer use cases with a more complex implementation.

yurishkuro commented 2 years ago

This PR adds a mechanism to elevate some of them to be force-included in the telemetry.

Interesting, I don't see this PR like that at all. In my view, this PR adds a mechanism for instrumentations to add telemetry attributes that are scoped not to the whole application (like resources) an instrumentation scope, or a single item, but to a context.

I think you're saying the same thing. The key part in your statement is "telemetry attributes". If this was just about "attributes" that stay in the context only within the process, then it's the definition of the Context. When you add "telemetry" I read it as "force-included in the telemetry".

Maybe I should add a section to distinguish this from Context as well. Some points that I can come up with right now:

No objections to this, but none of these features are required for the goal that you're after. String keys are also valid keys for the context. If we agree that force-including attributes in telemetry is bad and it should be an opt-in configuration, then the existing Context seems perfectly adequate for the goal. The configuration can list a collection of attribute keys, which, if found in the Context, should be included in the telemetry. Lack of enumerability and your other points about the Context do not prevent doing that.

Oberon00 commented 2 years ago

@yurishkuro

The key part in your statement is "telemetry attributes"

Indeed. The general context is not meant for (directly storing) telemetry attributes.

The configuration can list a collection of attribute keys, which, if found in the Context, should be included in the telemetry. Lack of enumerability and your other points about the Context do not prevent doing that.

I described how that could look like in my last comment, and as said I believe it would be possible but "a more awkward API to support fewer use cases with a more complex implementation". I can see no advantage at all. Configurability could be later added as an add-on to the currently proposed API as well.

If we agree that force-including attributes in telemetry is bad and it should be an opt-in configuration

No, I don't agree to that and I don't understand what you mean by "force" here (EDIT: So maybe I don't agree because I don't fully get the motivation here). It's "forced" in the same way that span attributes are force-included into exported telemetry data.

Oberon00 commented 2 years ago

Regarding Baggage (continuing from https://github.com/open-telemetry/oteps/pull/207#issuecomment-1205185260) @bogdandrutu

I thought about it again and I think that depending on the language/client library and the future evolution of baggage, a baggage-based implementation of Context-scoped attributes might indeed make sense. I added that in the alternatives section in 94d344d

bogdandrutu commented 2 years ago

@Oberon00 my main worry here is that we are confusing the users more with adding more concepts. I may be the only one who has this concern, which in that case I am an outlier, but personal preference is to avoid adding new concepts, when we can work on extending already existing once by enhancing them.

tigrannajaryan commented 2 years ago

@Oberon00 my main worry here is that we are confusing the users more with adding more concepts. I may be the only one who has this concern, which in that case I am an outlier, but personal preference is to avoid adding new concepts, when we can work on extending already existing once by enhancing them.

I echo this. My first comments was because of the confusion I had.

yurishkuro commented 2 years ago

I agree with @bogdandrutu and @tigrannajaryan .

No, I don't agree to that and I don't understand what you mean by "force" here (EDIT: So maybe I don't agree because I don't fully get the motivation here). It's "forced" in the same way that span attributes are force-included into exported telemetry data.

@Oberon00 These scenarios are not equivalent. Span attributes exist for users to store values in them, that's not "forcing". And by recording an attribute in the span I do not affect other telemetry signals simultaneously. But these context-scoped attributes are meant to affect all telemetry, which is why I refer to them as "forced". Say I import a library that decided to add some context-scoped attributes - suddenly all my telemetry gains extra dimensions, especially the metrics where the cardinality has direct correlation on the ingestion/storage costs.

I guess another way to frame it is like this: the PR proposes an opt-out mechanism, and I prefer opt-in. Context already supports attributes, anyone can add them. An opt-in approach would be a configuration mechanism where the service owner can say "I want these context attributes to be automatically included into these types of telemetry". It would extend the SDK configuration surface, but keep the OTEL API surface (the Context API) intact.

Oberon00 commented 2 years ago

@yurishkuro OK, I think I understand your concern with metrics & configurability. The proposed Context-scoped attributes API is giving libraries great power with which comes great responsibility. Adding configurability allows the user to share that responsibility and work around broken instrumentation libraries that add too many attributes.

I do think that the Context API is not well-suited for this as-is though. Configurability is independent from the API used to supply the attributes.

What could be made configurable? Thinking out loud:

Either an include-list or exclude-list of attribute keys (user should have the choice which one to use; exclude-list is only possible if attributes are marked as candidates for being telemetry-attributes)
There should be a default include/exclude-list and an optional per-signal list that work together (i.e. attributes are filtered by the default list and then after that by the per-signal list).
An empty include-list should be possible and mean that the feature is disabled (generally, or for particular signals)
Read-access for a custom span/other signal processor could be added so that it could do more complex logic for picking attributes, which would make sense in combination with restrictive include/exclude lists.

I think this is all a nice-to-have though. What would be easier & possible with the currently proposed API: Recommend all libraries (1) use context-scoped attributes very sparingly (2) provide a way of disabling them or even make them opt-in if they are not critical.

I.e., I would rather add the configurability to the libraries and not the SDK, this is also what we have done with span attributes with the Opt-in level and partially the Recommended level in #2522.

And by recording an attribute in the span I do not affect other telemetry signals simultaneously. But these context-scoped attributes are meant to affect all telemetry, which is why I refer to them as "forced".

Actually I think that just spans would be most important to me. I just thought that affecting all telemetry signals would give the most logical & easy concept of "Context-scoped attributes". But I would be OK with defining them to just affect spans if that would be considered less "forced". Would this be preferable?

Oberon00 commented 2 years ago

@bogdandrutu

my main worry here is that we are confusing the users more with adding more concepts

OK. But wouldn't two simple concepts and one complex concept be about the same mental overhead in this case?

Actually, I just now looked at the Baggage spec and found it described as:

Baggage is used to annotate telemetry, adding context and information to metrics, traces, and logs.

But is that actually true today? That would be exactly the use case here, but I don't think it's actually covered.

yurishkuro commented 2 years ago

Baggage is used to annotate telemetry, adding context and information to metrics, traces, and logs.

I think ^ is wrong and quite different from W3C Baggage definition.

jmacd commented 2 years ago

suddenly all my telemetry gains extra dimensions, especially the metrics where the cardinality has direct correlation on the ingestion/storage costs.

Let's not keep repeating this point. This "inflation of cardinality effect" should not be the case.

It's certainly not the case for Prometheus and OpenMetrics-based systems, and the reason is that creating a new scope attribute adds one attribute to one scope, not to every piece of telemetry in the scope. We are still working out the specification, but as shown in https://github.com/open-telemetry/opentelemetry-specification/pull/2703, users are meant to join scope attributes with metrics if they want to at query time. The PromQL group_left operation (e.g.,) specifically supports this expansion, so we do not need to worry about cardinality inflation due to scope attributes; this follows from the OpenMetrics-documented approach to handling target information (which must not inflate cardinality).

Baggage is used to annotate telemetry, adding context and information to metrics, traces, and logs. I think ^ is wrong and quite different from W3C Baggage definition.

Exactly, this is why the work @Oberon00 is doing is important. Somehow, we settled on APIs where there is no OpenTelemetry way to control telemetry attributes scoped to a span or a group of spans and the logs and metrics that happen within. This appears to be a common pattern, and we should try to standardize something to help users IMO.

yurishkuro commented 2 years ago

I.e., I would rather add the configurability to the libraries and not the SDK, this is also what we have done with span attributes with the Opt-in level and partially the Recommended level in #2522.

@Oberon00 in contrast, I would much rather add it to SDK once as a standard mechanism than to learn different configurations for different libraries that might decide to spam the context with undesired attributes.

Somehow, we settled on APIs where there is no OpenTelemetry way to control telemetry attributes scoped to a span or a group of spans and the logs and metrics that happen within. This appears to be a common pattern, and we should try to standardize something to help users IMO.

@jmacd This isn't an argument that helps choosing between this proposal and the SDK configuration alternative - they both end up resolving the user issue.

Oberon00 commented 2 years ago

@yurishkuro you cannot solve this problem with just configuration. You still need an API to supply these attributes. For me the question is only if we extend the baggage API or add a new API. I don't think the context API is suitable for this. And I also don't think the baggage API can be used as-is, we at least need a baggage API that excludes an attribute from the actual baggage. Otherwise, in the absence of configuration, we would fill the the baggage with telemetry attributes that the user sees when enumerating the baggage and that would be propagated.

@jmacd

the reason is that creating a new scope attribute adds one attribute to one scope, not to every piece of telemetry in the scope.

This proposal suggests to add the attribute to every piece of telemetry in the scope though (mainly to make the concept & the implementation simpler).

Maybe we now need to list more concrete use cases to find what is better suited. The two I know and would like to solve are already listed in the OTEP:

http.app attribute to determine which application handles a request: Would be set by certain HTTP server instrumentations on the Context containing the HTTP sever root span.
faas.id, faas.name: In Azure Functions, a function application hosts multiple functions, and something that normally would be a resource attribute should be set on every span from that particular function. An Azure Functions instrumentation would set it on the Context containing the FaaS root span.

In both cases:

The only thing that stops the attributes from being resources is that there could be multiple different values in the scope of a resource
The attributes could semantically also be set on a local root span (as is the case currently for faas.id/faas.name), but this has certain limitations, see "Alternative: An implementation for traces on the Backend".

yurishkuro commented 2 years ago

@Oberon00 I agree that baggage API is not suitable for this purpose, but the Context API is exactly the scope that you're targeting. I agree that there are some issues with using Context API (e.g. it allows all key & value data types), they but seem quite workable, and to me it's still much more preferable to use it and keep it small than to add another API surface and the mental complexity that comes with it.

Oberon00 commented 2 years ago

I agree that baggage API is not suitable for this purpose, but the Context API is exactly the scope that you're targeting.

The Context API is missing enumeration. That's a deal-breaker, unless you only want to support explicit include-lists, which would be undesirable. This feature also cannot be added to the Context API since we allow language libraries to use their language's built-in context APIs and those might not support enumeration and cannot be changed by us.

Oberon00 commented 2 years ago

@bogdandrutu Is there any roadmap or relevant open PRs issues for the Baggage API? Maybe in scope of the OpenCensus transition guidance? Would be interesting to see how extensions that would allow Context-scoped attributes could fit in there.

Oberon00 commented 2 years ago

So I'll try to summarize the status here:

The use cases seems common enough https://github.com/open-telemetry/oteps/pull/207#pullrequestreview-1006473604, https://github.com/open-telemetry/oteps/pull/207#issuecomment-1222757697
Implementability is rather trivial, it is even possible to provide an implementation for spans on top of the current API, e.g. as done at https://github.com/open-telemetry/oteps/pull/207#issuecomment-1162017633
The particular API, however, raised concerns as it introduces yet another concept.
1. @bogdandrutu thinks that this should be implemented as an extension of the baggage concept instead of a completely new concept. I added this as a possible implementation in https://github.com/open-telemetry/oteps/pull/207#issuecomment-1212220401 but this does not address that the OTEP language still introduces a new concept https://github.com/open-telemetry/oteps/pull/207#issuecomment-1220772595. @tigrannajaryan https://github.com/open-telemetry/oteps/pull/207#issuecomment-1220793830 shares this worry.
2. @yurishkuro agrees with my initial assesment in the OTEP that the baggage API is not suited for the use case this OTEP proposes to solve and suggests to instead direct users to use the raw Context API. https://github.com/open-telemetry/oteps/pull/207#issuecomment-1205280150
@yurishkuro also thinks that it is absolutely vital that there is a way for the user to configure which "offered" Context-scoped attribute is actually used because the current proposal "forces" all attributes set using the API to actually be set on all the associated telemetry items. https://github.com/open-telemetry/oteps/pull/207#issuecomment-1221114221

Oberon00 commented 2 years ago

I'd prefer not to change the OTEP as either proposed in 3.i or 3.ii, not because I'm lazy, but because I actually think the current proposal is better than any of the proposed alternatives. The exception would be if a more powerful baggage API is on the roadmap anyway, then it wouldn't make sense to actually use that API.

Using the Context API directly is IMHO far too cumbersome and low-level (it requires explicit registration of keys, does not restrict attribute types, is not enumerable so would require some explicit list of attributes, not allowing for exclude-list based configurations even if you want to implement the configuration from point 4).

As for point 4, I think configurability is a nice-to-have but should not be required for an initial specification / release. I could be convinced to add a simple per-signal (per TracerProvider/MeterProvider/...) on-off switch to the proposal though since it seems somewhat useful and probably trivial to implement.

Oberon00 commented 2 years ago

It would be nice that this proposal include how to propagate context with messages

I think that should be clear from the proposal already: With messages, it should not be propagated at all, just like with any other cross-service calls. Context-scoped attributes are meant as a "service-local" concept.

jmacd commented 2 years ago

@Oberon00 I appreciate your summary of the current debate. I worry that the word "Scope" creates a lot of confusion. I thought it might help to inject the word "Dynamic" to your proposal here. Whereas we have recently introduced a static-form of Scope attribute in the OTLP protocol, that kind of scope attribute is "Static" by comparison. Students of programming languages will recognize these two forms of scoping as Lexicographic vs. Dynamic variable scoping.

I support the creation of a new Context-based API for Dynamic Scope attributes, which are those that roughly speaking: (1) associate with all telemetry, (2) do not propagate. The API will look and feel exactly like the baggage API, but it will not be the same.

Oberon00 commented 2 years ago

Maybe it would be simpler to use "Context attributes" and just drop the "scoped"? EDIT: No, I guess that is already taken. "Telemetry attributes from Context"?

The instrumentation-scope attributes are also dynamically scoped.

But from the current discussion points, I did not have the impression so far that "scope" caused confusion?

The idea was initially coined as "Scoped resource attributes" internally, going from that and the use case from the motivation, "per-request resources" might be a nice (though slightly incorrect) moniker.

jsuereth commented 1 year ago

So, my $.02 here:

I'd like to see an interaction w/ baggage where you can opt in specific attributes to participate in telemetry. E.g. I can configure "source.id" or "sessiond.id" as included baggage attributes and then all telemetry collected in teh SDK automatically includes these attributes if they are on baggage.
A second mechanism for in-process scope-based attributes. Specifically, a location where I can place attributes that do NOT get exposed downstream in baggage but do participate in context propagation. This could use the same filtering mechanism as baggage for inclusion.

I'm overall supportive of the general change BUT I would like to see this interact better with baggage even if you see this needs different behavior than what OOTB Baggage gives you.

TL;DR: Let's make baggage better and I also see a need for in-process context scoped attributes.

Oberon00 commented 1 year ago

The baggage specification is currently both Stable and in feature freeze. I think if we want to add this to the baggage spec, the TC first needs to lift the feature freeze there. https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md The freeze was added for 1.0 and might be just a leftover? https://github.com/open-telemetry/opentelemetry-specification/pull/1121 EDIT: It seems that tracing is also in feature-freeze and despite that, the scope attributes were added. So it seems this is not really related here but rather a general issue that should be addressed.

tedsuo commented 1 year ago

@Oberon00 we are cleaning up stale OTEP PRs. If there is no further action at this time, we will close this PR in one week. Feel free to open it again when it is time to pick it back up.

Oberon00 commented 1 year ago

I would still like to see this OTEP merged eventually. But what would be needed here is agreement on what direction it should take regarding Baggage and configurability and how much is required for an initial version of the OTEP.

Oberon00 commented 1 year ago

@bogdandrutu I think you didn't reply on this PR since your "request changes" review https://github.com/open-telemetry/oteps/pull/207#pullrequestreview-1055913542 to which I did answer. If you think it is still applicable nevertheless, please say so. No matter if this PR is closed or not, it would be useful to know for any reopening/next version.

yurishkuro commented 1 year ago

My alternative proposal is to implement processors in the collection pipeline (like span processor, i.e. different per signal), which take a simple configuration of two arrays containing baggage keys and context keys, and enrich the respective signal with the attributes extracted from Baggage and/or Context.

no changes to public APIs, does not require any new roadmaps
opt-in solution, does not allow some run-away instrumentation to screw up the rest of the telemetry
a low-level, composable building block that can be simplified in the future with higher-level facade if it proves necessary and a generality can be derived from multiple use cases

From reading earlier comments I do not see what is not being solved with this approach. The only glitch is that Context allows an attribute to be an arbitrary object, but it doesn't feel like a major roadblock, buyer beware, the processor will try to convert the non-primitive objects to a string.

Oberon00 commented 1 year ago

that can be simplified in the future with higher-level facade

Isn't my proposal exactly that? You suggest to specify a particular implementation so that low-level context APIs can become the public API, while my OTEP specifies a higher-level API (the implementation would most likely be exactly what you suggest anyway though).

About it being opt-in, I think this point I can still add to the OTEP, as I metioned in https://github.com/open-telemetry/oteps/pull/207#discussion_r1119890696:

I suggested an extremely minimal "configuration" I could add in https://github.com/open-telemetry/oteps/pull/207#issuecomment-1242175278, namely a simple per-signal on-off switch that the user would set e.g. when creating the MeterProvider to indicate whether they want context attributes for that signal.

It seems this runaway instrumentation scenario is your greatest concern here. Would that be resolved if we added that minimal configurability? I would rather have it opt-out though, a runaway instrumentation can always mess up your traces anyway in different ways (we could consider opt-out for traces and opt-in for other signals).

yurishkuro commented 1 year ago

Isn't my proposal exactly that?

Yes, you propose to skip the low-level building blocks and jump straight to high level convenience API, which may never be needed. And there is a huge difference between making changes to the OTEL API vs. OTEL SDK. API changes impose a new contract on all implementations - what if we find that we don't like that contract later? My proposal can be implemented right now with a contrib library -- no spec, nor API, nor even SDK changes.

There is also the fundamental difference of opt-in vs. opt-out. I don't like giving users a gun they can shoot themselves with. The opt-out solution is even worse - they can shoot someone else. A minor upgrade of some 3rd party library that suddenly decides to set these attributes can triple someone's Datadog bill, break existing dashboards, etc. My proposal is no-op until the actual end user decides to enable this functionality.

Oberon00 commented 1 year ago

My proposal can be implemented right now with a contrib library -- no spec, nor API, nor even SDK changes.

Is this also possible for metrics & logs or only traces? How would instrumentations interact with that library?

Note that we have this part in the Context API spec which could be problematic for your proposal (https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/context#create-a-key):

The key name exists for debugging purposes and does not uniquely identify the key.

Consequently, there is no API to get an attribute by name, nor to enumerate all context attributes. So the library would need to somehow receive the actual key objects from the instrumentations. Could be done, with two libraries: An API one which just offers a key registration API (key object + configurable name) and a SDK one, which then would offer the actual functionality and access the registered key list.

And after all, I'd still like to have some kind of cross-technology common description of what the API should do, so I'd still like to have this in the spec, as that is the only place for such things. I would be fine with creating a completely new component beside SDK and API for this API extension, although I wonder if this is maybe too much ceremony

Regarding:

The opt-out solution is even worse - they can shoot someone else.

I think I don't understand this. If they disabled context attributes for tracing, whom would they "shoot"?

yurishkuro commented 1 year ago

I think I don't understand this. If they disabled context attributes for tracing, whom would they "shoot"?

What I mean is an instrumentation library author will be in position to affect all telemetry of an end-user's application. I don't think this is a reasonable ownership model. Today a library author can only affect their library's instrumentation, but the end user can turn it off via named tracers / meters.

Oberon00 commented 1 year ago

the end user can turn it off via named tracers

Although there was that intention originally behind the tracer's scope name parameter, such a mechanism was never specified or implemented. I suppose, users would turn a instrumentation off at a different level, by disabling it in their instrumenting agent or not calling the initialization function for example. This would then also help against libraries polluting the context.

austinlparker commented 6 months ago

Curious if this could get a once-over to make sure it's still current. Been a lotta changes over the past few years. Generically, I think it's a good proposal (iirc, @jmacd @tedsuo and I whiteboarded out a similar scoping proposal back in 2019 that didn't make it in) and would solve a lot of problems for folks.

pyohannes commented 6 months ago

I also came across use cases where this would be very useful, one in particular was the ability to associate outgoing with incoming requests. This could be achieved if each incoming request sets a well defined context attribute.

I wonder if there's still interest from the original author @Oberon00 in this.

Oberon00 commented 6 months ago

I still think the feature is useful, but we kind of hit a stalemate here in the reviews. If anybody wants to meet the demands of reviewers, for example regarding baggage integration and configurability, feel free to submit an extended version of this OTEP, although IIRC it may not be possible as some requests are mutually exclusive, i.e. you probably cannot find a design that fulfills all requests at the same time. At the moment, I don't plan to update this.

trask commented 6 months ago

for those familiar with Java logging MDC, this can be viewed as a generalization of logging MDC to spans, metrics, and logs

open-telemetry / oteps

Propose Context-scoped attributes. #207