Open TylerHelmuth opened 1 year ago
Pinging code owners for processor/metricsgeneration: @Aneurysm9. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Pinging code owners for processor/logstransform: @djaglowski @dehaansa.
Pinging code owners for processor/attributes: @boostchicken.
Pinging code owners for processor/metricstransform: @dmitryax.
Pinging code owners for processor/resource: @dmitryax.
Pinging code owners for processor/span: @boostchicken.
Pinging code owners for processor/redaction: @leonsp-ai @dmitryax @mx-psi.
Pinging code owners for receiver/hostmetrics: @dmitryax.
Over the last year we've seen increased interest and activity with the redaction processor. We've also pointed to it directly when answering questions about how the collector handles keeping data safe. It is clearer now that having a targeted processor specifically for the use case of redacting data is useful from a governance and practicality standpoint. For this reason, the redaction processor meets criterion 4 and should therefore not be replaced by the transformprocessor. I've removed it from the list.
Instead, the redactionprocessor could utilize OTTL if it ever needs more complex conditionals for when to redact.
Component(s)
pkg/ottl
Describe the issue you're reporting
Background
The OpenTelemetry Transformation Language (OTTL) is a language for transforming OpenTelemetry data. Its primary use case is the transformprocessor, but it can be used in any component.
Although OTTL's primary goal is to facilitate transforming telemetry, its Condition logic is also useful in isolation. Since OTTL Conditions have access to functions and telemetry fields as well, they provide an all-encompassing solution for making decisions based on telemetry field values.
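To make "useful in isolation" a bit more concrete, here is a minimal sketch of how a full OTTL statement separates into a function call and a condition. The OTTL strings are only illustrative, and split_statement is a hypothetical helper written for this example; it is not part of pkg/ottl or any existing tooling.

```python
from typing import Optional, Tuple

# Illustrative OTTL statement: an editor call followed by an optional "where" condition.
STATEMENT = 'set(attributes["env"], "prod") where resource.attributes["service.name"] == "checkout"'


def split_statement(statement: str) -> Tuple[str, Optional[str]]:
    """Split an OTTL-style statement into (function call, condition).

    Hypothetical helper for illustration only. It is deliberately naive:
    it assumes the text " where " never appears inside a string argument.
    """
    action, separator, condition = statement.partition(" where ")
    return action, (condition if separator else None)


action, condition = split_statement(STATEMENT)
print("action:   ", action)
print("condition:", condition)
```

The condition half is a boolean expression over telemetry fields that can be evaluated without ever running the function call, which is the part a component could reuse on its own.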
Since its inception, OTTL has started being used in more components in Contrib. As of writing this issue it is used in transformprocessor, routingprocessor, and filterprocessor. Currently both transformprocessor and routingprocessor take advantage of full OTTL statements, whereas filterprocessor only utilizes conditions.
When discussing the roadmap for OTTL in Contrib, there are 2 main focuses: what components can be replaced with the transformprocessor and how can OTTL conditions be used to standardize internal/filter and the components that use those packages.
OTTL and the Transform Processor
The transform processor with OTTL provides an open opportunity for most stateless transformations of data. There is opportunity for, and already a lot of, overlap with other components. How should the Contrib repository handle these overlaps?
I propose utilizing the transform processor to reduce the number of components we need in Contrib, standardizing how data is transformed in the Collector. If users want to transform their data in the collector then I propose the transformprocessor as the "one-stop-shop" for those needs. That said, there are some guardrails I think we should follow:
1. transformprocessor is to transform telemetry.
2. transformprocessor is stateless; it should not handle any stateful transformations.
3. transformprocessor should not rely on any external source for its transformations. This means it should not need to call out to any APIs or databases.
4. transformprocessor is not a replacement for hyper-targeted processors (unless they want it to be). Processors like filterprocessor, vendor-specific processors, and samplers feel like bad candidates for the transformprocessor. (This guardrail is kinda "feely" and definitely needs to be discussed.)
With those guardrails in place, I see these components as candidates for replacement by the transformprocessor. (list is in alphabetical order only)
- attributesprocessor
- logstransformprocessor
- metricsgenerationprocessor
- metricstransformprocessor - this processor is currently stateless and therefore meets our guardrails. Whether or not it should be stateless is a separate debate.
- resourceprocessor
- spanprocessor
Before any of these processors could be replaced, work needs to be done to ensure the transformprocessor has complete feature parity. I believe a declarative syntax option will also be necessary.
OTTL as a Generic Condition Solution
A major strength of OTTL is that it has conditions built into its grammar. These conditions have access to every field in the OTLP proto for each signal, which means that users have no restrictions on what they choose to use in their conditions. By using Converter functions, users are able to create complex conditions.
Thanks to https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/16413 OTTL has the capabilities of all other filter options in internal/filter. I propose we standardize conditions in Contrib on OTTL, updating all components that use internal/filter to use OTTL.
By unifying on OTTL we'll have a solution that has access to all fields on all signals. Users no longer need to worry about whether or not a field for their signal is available to use and maintainers no longer need to worry about adding more fields to filter on in the future (OTLP changes excluded). Due to OTTL's functions, adding more features to enable complex conditions is simpler as the functions encapsulate the logic and can be added without modifying the underlying libraries or configuration. On top of its field access and functions, OTTL's grammar also provides more robust conditions, allowing users to use inequalities, nil, and arithmetic.
For maintainers, it will allow us to reduce the amount of code we need to maintain.
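The kinds of conditions described above (an inequality, a nil check, arithmetic, and a function call) can be mimicked in a small Python analogy. This is not OTTL and does not use pkg/ottl: the dictionary-based span, the predicate names, and matches_any are all stand-ins invented for this sketch, and the OTTL-style text in the comments is illustrative.

```python
import re
from typing import Any, Callable, Dict, List

Span = Dict[str, Any]              # simplified stand-in for a span
Condition = Callable[[Span], bool]


def status_at_least_500(span: Span) -> bool:
    # OTTL-style: attributes["http.response.status_code"] >= 500
    code = span["attributes"].get("http.response.status_code")
    return code is not None and code >= 500


def missing_user_id(span: Span) -> bool:
    # OTTL-style: attributes["user.id"] == nil
    return span["attributes"].get("user.id") is None


def slower_than_one_second(span: Span) -> bool:
    # OTTL-style: end_time_unix_nano - start_time_unix_nano > 1000000000
    return span["end_time_unix_nano"] - span["start_time_unix_nano"] > 1_000_000_000


def is_health_check(span: Span) -> bool:
    # OTTL-style: IsMatch(name, "^GET /health")
    return re.search(r"^GET /health", span["name"]) is not None


def matches_any(span: Span, conditions: List[Condition]) -> bool:
    """Return True if at least one condition holds for the span."""
    return any(condition(span) for condition in conditions)


span = {
    "name": "GET /health",
    "attributes": {"user.id": "u1"},
    "start_time_unix_nano": 0,
    "end_time_unix_nano": 5,
}
print(matches_any(span, [is_health_check]))
```

The point of the analogy is that every predicate reads from the same record shape, so a component that accepts a list of conditions does not need its own per-field configuration.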
Components that use internal/filter (that aren't listed in the above replacement candidate list): (list is in alphabetical order only)
- cumulativetodeltaprocessor
- filterprocessor (https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/18642)
- hostmetricsreceiver
Before non-OTTL packages in internal/filter could be replaced we should consider a Condition-specific parser and a reusable configuration for defining conditions.
Related beta issue for OTTL: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/28892
Related beta issue for the transform processor: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/28644