open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Please make it easy to do simple string interpolation using the OTTL library to facilitate developing other components #34700

Open michaelsafyan opened 3 months ago

michaelsafyan commented 3 months ago

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

I'm working on a component in which it is necessary to interpolate signal information into a string to construct a URI representing a destination for upload. This is the Blob Attribute Uploader Connector https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33737 .

What I'm trying to do is something like:

```
   # In the configuration:
   uri_pattern: gs://${env.GCS_BUCKET}/${trace_id}/${span_id}/${span.attributes["the.upload.name"]}/${len(resource.attributes["foo"])}/bar
```

I think OTTL already has much of this logic within its notion of "OTTL contexts", but it is somewhat difficult to reuse for a few reasons:

  1. The functions are contained in the transformprocessor rather than in the pkg/ottl
  2. The path parser is a hidden component
  3. The existing parser library does much higher level stuff that assumes solely the transform use case, including parsing of conditions and statements

It's also a bit hard to work with the templating here (it's not obvious where K represents the signal type or where K represents the result type that the function is intended to produce).

In my in-progress work on blobattributeuploadconnector, I've implemented the interpolateSpanEvent and interpolateSpan functions with a bit of hackery in OTTL:

But I suspect someone more familiar with OTTL can implement this more cleanly.

Describe the solution you'd like

Given the necessary inputs to construct an OTTL TransformContext, it would be ideal if it were simple to interpolate a string using that context.

It would be convenient if the code could look like:

    interpolator, err := ottl.NewInterpolator(transformContext, ottl.WithOsEnvInterpolation())
    // ...

    interpolatedString, err := interpolator.InterpolateString(inputString)

However, I'm open to other interfaces (as long as it is relatively simple to do).
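To make the request concrete, here is a minimal sketch of the `${...}` substitution step in isolation, using only the Go standard library. It does not use OTTL at all: `Resolver`, `Interpolate`, and `mapResolver` are hypothetical names chosen for illustration, and the assumption is that a real implementation would back the resolver with OTTL path evaluation against the TransformContext rather than a map.

```go
package main

import (
	"fmt"
	"regexp"
)

// Resolver maps the text inside ${...} to a string value.
// Hypothetical type; not part of pkg/ottl.
type Resolver func(expr string) (string, error)

// placeholder matches ${...} where the body contains no closing brace.
var placeholder = regexp.MustCompile(`\$\{([^}]*)\}`)

// Interpolate replaces every ${expr} in pattern with resolve(expr).
// It returns the first resolution error, if any.
func Interpolate(pattern string, resolve Resolver) (string, error) {
	var firstErr error
	out := placeholder.ReplaceAllStringFunc(pattern, func(m string) string {
		expr := m[2 : len(m)-1] // strip the leading "${" and trailing "}"
		v, err := resolve(expr)
		if err != nil {
			if firstErr == nil {
				firstErr = fmt.Errorf("resolving %q: %w", expr, err)
			}
			return m
		}
		return v
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

// mapResolver is a stand-in for real signal/environment lookup.
func mapResolver(values map[string]string) Resolver {
	return func(expr string) (string, error) {
		v, ok := values[expr]
		if !ok {
			return "", fmt.Errorf("unknown expression")
		}
		return v, nil
	}
}

func main() {
	r := mapResolver(map[string]string{
		"env.GCS_BUCKET": "my-bucket",
		"trace_id":       "abc",
		"span_id":        "def",
	})
	uri, err := Interpolate("gs://${env.GCS_BUCKET}/${trace_id}/${span_id}/bar", r)
	fmt.Println(uri, err)
}
```

Keeping the resolver as a plain function separates the string scanning from the lookup, so environment lookup and signal lookup could be supplied independently.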

Describe alternatives you've considered

Initially, I tried to build something like this from scratch and quickly realized that I was likely reinventing the wheel. OTTL looked promising as a way to share ideas around how to refer to the properties of the input signal.

I think extending/modifying the existing OTTL library is the correct approach. The alternative would be for me to attempt to do it myself, but I am likely to get this wrong.

Additional context

No response

github-actions[bot] commented 3 months ago

Pinging code owners:

TylerHelmuth commented 3 months ago

@michaelsafyan as with https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34695 I am missing context here.

Does the Blob Attribute Uploader Connector have a requirement that users should be able to specify a regex that will be used to extract fields from a string? Do those fields need to be used to set telemetry fields, such as span id, or will they be used for something else within the component?

> The functions are contained in the transformprocessor rather than in the pkg/ottl

All general OTTL functions live in pkg/ottl/ottlfuncs, although the transform processor does maintain some of its own metrics transformation functions.

> The path parser is a hidden component

How an OTTL context interprets a path is up to that specific context. It is not exposed because contexts are not made to be extendable, they are made to represent a very specific OTLP telemetry structure, based closely on pdata/the otlp proto.

> The existing parser library does much higher level stuff that assumes solely the transform use case, including parsing of conditions and statements

I am not sure what you mean here. I agree that the public API for parsing "statements" is not as friendly as the filter/select use case wrapped in internal/filter/filterottl, but I'm not sure what other use cases besides transform and select that the parser should be considering.

michaelsafyan commented 3 months ago

> Does the Blob Attribute Uploader Connector have a requirement that users should be able to specify a regex that will be used to extract fields from a string?

No. To clarify, the logic of the component is like this:

    component = ParseConfig(config)
    for each signal:
        for each attribute in signal:
            if component.ShouldMatch(attribute):
                destination_uri = component.UriFor(signal, attribute)
                ScheduleUploadingInBackground(signal, attribute, destination_uri)
                ReplaceAttributeWithReference(signal, attribute, destination_uri)

In terms of the above pseudo-code, I think that the most logical way to represent component.UriFor(...) is as a pattern string containing variables that refer to properties of the OS environment or properties of the signal.
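For illustration, the loop above could be sketched in Go roughly as follows. Everything here is a hypothetical stand-in rather than the connector's actual code: `Signal`, `Upload`, `Component`, the prefix-based `ShouldMatch`, and the two placeholder names `${signal_id}` and `${attribute}` are assumptions, and the background upload is reduced to collecting a list of pending uploads.

```go
package main

import (
	"fmt"
	"strings"
)

// Signal is a hypothetical stand-in for a span, span event, or log record.
type Signal struct {
	ID         string
	Attributes map[string]string
}

// Upload describes one pending upload (hypothetical).
type Upload struct {
	URI     string
	Content string
}

// Component holds the parsed configuration (hypothetical fields).
type Component struct {
	Prefix     string // attribute keys with this prefix are uploaded
	URIPattern string // may contain ${signal_id} and ${attribute}
}

// ShouldMatch reports whether the attribute should be uploaded.
func (c Component) ShouldMatch(key string) bool {
	return strings.HasPrefix(key, c.Prefix)
}

// UriFor fills the pattern with properties of the signal and attribute.
func (c Component) UriFor(s Signal, key string) string {
	r := strings.NewReplacer("${signal_id}", s.ID, "${attribute}", key)
	return r.Replace(c.URIPattern)
}

// Process collects uploads for matching attributes and replaces each
// matching attribute value with a reference to its destination URI.
func (c Component) Process(s Signal) []Upload {
	var uploads []Upload
	for key, value := range s.Attributes {
		if !c.ShouldMatch(key) {
			continue
		}
		uri := c.UriFor(s, key)
		uploads = append(uploads, Upload{URI: uri, Content: value})
		s.Attributes[key] = uri // overwriting an existing key while ranging is safe
	}
	return uploads
}

func main() {
	c := Component{Prefix: "blob.", URIPattern: "gs://bucket/${signal_id}/${attribute}"}
	s := Signal{ID: "span-1", Attributes: map[string]string{
		"blob.payload": "large content",
		"http.method":  "GET",
	}}
	fmt.Println(c.Process(s), s.Attributes)
}
```

The only part that depends on OTTL in the real component would be `UriFor`, which is where path resolution against the signal is needed.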

> All general OTTL functions live in pkg/ottl/ottlfuncs

Yes, except that building the list of registered functions exists in transformprocessor; for example, here:

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/internal/traces/functions.go

... it would appear that adding IsRootSpan is done internally to this processor. It would be ideal if logic to construct per-signal function registries were shared, so that everything processing "span" would inherit "IsRootSpan".

> How an OTTL context interprets a path is up to that specific context. It is not exposed because contexts are not made to be extendable, they are made to represent a very specific OTLP telemetry structure, based closely on pdata/the otlp proto.

I'm not suggesting that we be able to extend this logic outside the specific context. I am suggesting that this path resolution logic be consumable without higher-level abstractions above it. The ability to resolve a path to a value is, itself, a useful piece of logic that should be accessible without additional logic around it.

> I'm not sure what other use cases besides transform and select that the parser should be considering.

Read-only resolution of path expressions to values is what I'm getting at.

Beyond this, being able to generate strings that interpolate these variables would be handy.

And, beyond that, doing the above with some sort of defaulting mechanism (e.g. to handle attributes that are unset) would be handy.
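As a rough sketch of what such a defaulting mechanism might look like, the example below borrows the shell-style `${name:-fallback}` form. That syntax is purely an assumption for illustration, not something OTTL defines, and `Lookup` and `InterpolateWithDefaults` are hypothetical names. The lookup is read-only and reports whether a value is set.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Lookup resolves a path to a value and reports whether it is set.
// Hypothetical type; a real version would evaluate an OTTL path.
type Lookup func(path string) (string, bool)

// placeholderRe matches ${...} where the body contains no closing brace.
var placeholderRe = regexp.MustCompile(`\$\{([^}]*)\}`)

// InterpolateWithDefaults replaces ${path} and ${path:-fallback}.
// An unset path yields its fallback, or the empty string if none is given.
func InterpolateWithDefaults(pattern string, lookup Lookup) string {
	return placeholderRe.ReplaceAllStringFunc(pattern, func(m string) string {
		body := m[2 : len(m)-1] // strip the leading "${" and trailing "}"
		path, fallback, hasDefault := strings.Cut(body, ":-")
		if v, ok := lookup(path); ok {
			return v
		}
		if hasDefault {
			return fallback
		}
		return ""
	})
}

func main() {
	attrs := map[string]string{"trace_id": "abc"}
	lookup := func(path string) (string, bool) {
		v, ok := attrs[path]
		return v, ok
	}
	fmt.Println(InterpolateWithDefaults("gs://bucket/${trace_id}/${name:-unnamed}", lookup))
}
```

Whether an unset value without a default should become an empty string or an error is a design choice; the sketch picks the empty string only to stay short.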

michaelsafyan commented 2 months ago

Curious to hear any thoughts/updates?

michaelsafyan commented 1 month ago

I've managed to implement this here:

https://github.com/michaelsafyan/open-telemetry.opentelemetry-collector-contrib/blob/blob_writer_span_processor/pkg/ottl/parser.go

... as part of the end-to-end prototyping I was doing for https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33737 .

See:

I will be sending a PR soon to upstream just this piece of the prototype.