open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[pkg/ottl] Support for data retrieval OTTL expressions #35621

Open lahsivjar opened 2 months ago

lahsivjar commented 2 months ago

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

Components should be able to retrieve data from the underlying telemetry using OTTL expressions. As of now, pkg/ottl provides methods for parsing and evaluating statements and conditions, but there seems to be no support for parsing and evaluating paths or converters that define data retrieval expressions. For example:

Allowing retrieval of values using OTTL would help build more generic components. One example use case would be generating metrics from all types of telemetry signals - imagine being able to generate histograms from logs based on attributes as well as spans based on span duration using the same connector. OTTL expressions can be used in this case to extract the data and construct metrics.

Describe the solution you'd like

pkg/ottl to support parsing and evaluating OTTL expressions for data retrieval and conversion.

Describe alternatives you've considered

As a temporary hack, a `get` OTTL editor can be used to parse data retrieval expressions as OTTL statements (ref).
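
A minimal sketch of what this kind of hack amounts to, assuming the component rewrites the user-supplied expression into statement text before handing it to a statement parser. The helper name `wrapAsStatement` and the editor name `get` are illustrative assumptions, not part of pkg/ottl, and this is not the code behind the ref link:

```go
package main

import (
	"fmt"
	"strings"
)

// wrapAsStatement turns a bare data retrieval expression into statement text
// by wrapping it in a call to a hypothetical "get" editor. A component using
// this trick would also have to register an editor under that name so the
// resulting text parses as an OTTL statement.
func wrapAsStatement(expr string) string {
	return fmt.Sprintf("get(%s)", strings.TrimSpace(expr))
}

func main() {
	// The user only writes the expression; the component adds the editor call.
	fmt.Println(wrapAsStatement(`attributes["http.status_code"]`))
}
```

The drawback is that the editor is invisible to users and the statement text is modified behind their back, which is what a dedicated parsing API would avoid.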

Additional context

No response

github-actions[bot] commented 2 months ago

Pinging code owners:

TylerHelmuth commented 1 month ago

no editor function that allows to retrieve data based on statements.

I am confused, as retrieving data is the job of an OTTL path, such as name or attributes. Why is an editor necessary?

lahsivjar commented 1 month ago

@TylerHelmuth you are right, it should be the job of an OTTL path. In addition, the solution I proposed is not a good one. The problem I am trying to solve is allowing a component to accept data retrieval logic as OTTL expressions, which requires the ability to parse and evaluate those expressions. Let me rephrase the issue.

lahsivjar commented 1 month ago

@TylerHelmuth Apologies for the half-baked issue earlier. I have now updated the issue and its description. Let me know if it makes sense.

TylerHelmuth commented 1 month ago

data retrieval expressions

I don't yet see why OTTL is needed for this. A component has access to the data and can manipulate it as needed; it does not need OTTL to provide access. If you want users to be able to express how the data should be manipulated, that sounds like the transformprocessor.

lahsivjar commented 1 month ago

If you want users to be able to express how the data should be manipulated that sounds like the transformprocessor.

The objective is not exactly data manipulation by specific components but rather giving components the ability to extract data from the raw signals using user-defined configuration. While transformprocessor is quite good, I am talking about configuration rather than individual components. I think there is a gap here that prevents components from fully utilizing OTTL as user-defined configuration for extracting data. Connectors would probably benefit the most from this feature as they work across signal types.

For a more tangible example, I'm developing a connector that generates metrics from various signal types according to user-defined configurations. These configs consist of OTTL expressions that specify how to extract values from the original signals to form the new metrics, essentially parsing the OTTL values. While it may be possible to cobble together a pipeline that accomplishes something similar, I believe support for parsing and evaluating OTTL values would allow far more flexible components and probably a better-optimized pipeline.

djaglowski commented 1 month ago

A component has access to the data and can manipulate it as needed - it does not need OTTL to provide it access.

Setting data manipulation aside for the moment, the field names and access patterns established by OTTL are effectively the user-facing standard way to refer to values within our telemetry. It makes sense to me that we would want other components to use the same names and expect the same results. The alternative is that component authors invent a variety of ways to refer to and retrieve values, which leads to fragmentation and inconsistency.

michaelsafyan commented 1 month ago

Pointed to this from @evan-bradley as potentially related to:

evan-bradley commented 1 month ago

@TylerHelmuth I agree with what @djaglowski says. We already have this with the filter processor and routing connector, where users can use OTTL conditions to implicitly call a single Editor:

- filter() where IsMatch(...)
- route() where IsString(...)

I think we should also try to enable this case generally for Converters/paths that don't result in a boolean value. Similar to above, this might look like:

- my_custom_component_logic(Format("%x", span_id))

I think there's value in not requiring components to create an undocumented Editor and modify the OTTL statement text to include it, like we had with the filter processor and routing connector prior to providing a Condition parser.

TylerHelmuth commented 1 month ago

I talked with @evan-bradley about this and I think I now understand what is being asked. The goal is to be able to evaluate an OTTL expression, such as start_time - end_time, and get the returned value, similar to our existing Condition feature, which uses booleanExpression behind the scenes.

This feels doable and will require a new API similar to Condition. It will likely use value from the grammar as the thing being evaluated, but I am not entirely sure.

If someone wants to take this on, the solution will likely follow the pattern Condition uses.