open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
3.08k stars 2.37k forks source link

[processor/transform] Set attribute values using connection context #33288

Open ptodev opened 5 months ago

ptodev commented 5 months ago

Component(s)

processor/transform

Is your feature request related to a problem? Please describe.

The attribute and resource processors have a from_context config argument which can be used to set attribute values using connection context:

  # Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # FromContext specifies the context value to use to populate the attribute value. 
  # If the key is prefixed with `metadata.`, the values are searched
  # in the receiver's transport protocol additional information like gRPC Metadata or HTTP Headers. 
  # If the key is prefixed with `auth.`, the values are searched
  # in the authentication information set by the server authenticator. 
  # Refer to the server authenticator's documentation part of your pipeline for more information about which attributes are available.
  # If the key doesn't exist, no action is performed.
  # If the key has multiple values the values will be joined with `;` separator.
  from_context: <other key>

It would be nice to be able to do the same thing with the transform processor.

Describe the solution you'd like

The solution should work with all OTTL contexts, so probably we need to add a new OTTL function?

Describe alternatives you've considered

No response

Additional context

This feature will help reduce the need for attributes and resource processors.

Also, #18643 suggests that the transform processor may replace the attributes and resource processors. Having this feature would be a prerequisite for that.

github-actions[bot] commented 5 months ago

Pinging code owners:

github-actions[bot] commented 5 months ago

Pinging code owners for pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 3 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

odubajDT commented 3 months ago

Hey @TylerHelmuth I would like to look at this issue

TylerHelmuth commented 3 months ago

@odubajDT @evan-bradley I'd like to talk about this more. We don't have syntax or a specific transformcontext to represent how to extract something from the connection context. I'd like to see some proposals on how we'd implement this before making a PR.

evan-bradley commented 3 months ago

Could we have that discussion on a draft PR? I would prefer to look at something concrete when considering possible designs. I think the high-level design is fairly straightforward (get a value from the context through a path/converter), and the implementation is what's under consideration.

evan-bradley commented 3 months ago

As an initial proposal, we'll want a ctx/context path that is required to be keyed by the path parser and calls (context.Context).Value with the key, and returns the return value (an any-typed value or nil).

As for the transform context, the most conventional way to do this would be to add context.Context as the first parameter to each NewTransformContext function, but this is something we can explore in a concrete implementation. Once the transform context has the context.Context reference, it's just a matter of retrieving it through a path.

If the implementation shows flaws, we can probably pull this off using a converter that relies on an interface implemented by the transform context, but I'd rather save this as a plan B.

TylerHelmuth commented 3 months ago

We keep breaking NewTransformContext functions as we need access to more data. Should we add an Option pattern?

evan-bradley commented 3 months ago

I think for now it's conceptually simpler for all data to be required. I think we will always have a context.Context object that can be used and can avoid the situation where certain context paths don't work in certain components.

In regards to breaking NewTransformContext, as non-ergonomic as it is from a function signature perspective, if we want everything to be required I'd rather break this at compilation time. The alternative is we implement a config struct or options pattern that we have to validate at runtime for containing the correct data, which would place the burden on users instead of component authors if something is required but not present.

TylerHelmuth commented 3 months ago

Ok lets try with just breaking it.

Are there any security concerns here? It seems likely you could do something like merge_maps(attributes, ctx, "upsert"), and expose keys.

evan-bradley commented 3 months ago

I dug into it and there's no way to iterate over context.Context without reflection. The path parser will need to require that you key the ctx path when using it. I think that should protect us from situations like the one you described, since if you ask for a key, it's your responsibility to know what's in it.

The one tricky thing is that any complex key/value types will be basically impossible to do without custom OTTL extensions. I'm not sure how we properly address this situation:

ctx := context.WithValue(ctx, StructKey{key: "key"}, CustomStructData{data: contextData})

Keying this would be impossible with OTTL, and even if you use e.g. a string key here instead, using the data is impossible unless you can e.g. convert the type to a map. I think this will just have to be a known caveat.

TylerHelmuth commented 3 months ago

Requiring an index should be enforceable by the ottl context path function. I agree we should only support string keys.

evan-bradley commented 3 months ago

I agree we should only support string keys.

We don't have to enforce this, (context.Context).Value takes an any type as a key and we can just pass a user-supplied value directly and return whatever Value returns (nil or the value). I was just pointing out that non-primitive typed keys won't work out-of-the-box.

Thinking on it, users could add Converters to provide support for complex key types, e.g. ctx[KeyFunc("key")] where KeyFunc returns whatever type they use to key context values. So I think we should be good to go here.

TylerHelmuth commented 3 months ago

In that case, I think we should be extremely strict about NOT adding any KeyFunc converters to pkg/ottlfuncs.

odubajDT commented 2 months ago

One additional question here, if I understand the implementation of attributes processor correctly, there is the possibility to extract context data only from the following context keys:

- "metadata."
- "auth."
- "client.address"

What in this case means connection context? Do we want to duplicate the exact same functionality or something else as well? Thanks in advance!

evan-bradley commented 2 months ago

What in this case means connection context?

The Collector passes a context.Context object down the pipeline alongside a payload, and usually receivers will include information about the network connection inside of this object.

Do we want to duplicate the exact same functionality or something else as well?

I think the fact that we have the ability to spread operations over multiple statements means that we don't have to special-case this kind of addressing. For example, we could have the following: ctx["auth"]["api-token"] instead of needing to write auth.api-token like it would be in the attributes processor.

github-actions[bot] commented 1 week ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.