open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Pipeline unit test facility #9709

Open jmcarp opened 6 months ago

jmcarp commented 6 months ago

Is your feature request related to a problem? Please describe.

As far as I can tell, the only way to verify the behavior of an otelcol pipeline is to run the collector, wait for data to pass through the pipeline, and manually verify that telemetry is showing up in the export destinations. This isn't ideal for more complicated pipelines, since we effectively have to manually test every output on every change.

Describe the solution you'd like

It would be great to see some kind of unit test functionality that exercises some or all of an otelcol pipeline with synthetic data and makes assertions about the data we expect to receive at different points in the pipeline. Other telemetry agents have this capability. For example, vector has a unit test feature that allows users to make assertions about individual transforms or pipelines, and fluent-bit has an "expect" filter that's meant for unit-testing pipelines as well.

Maybe a mock receiver that sends synthetic data and a mock exporter that checks assertions on the data it receives would work here. Do those exist already? I couldn't find them.
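
To make the idea concrete, here is a minimal sketch of the mock receiver / mock exporter pattern, written in Python only for brevity. Everything in it is hypothetical: `MockReceiver`, `MockExporter`, `run_pipeline`, and the two example processors are illustrative names, not part of the collector, and records are modeled as plain dicts rather than the collector's own data model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical model: a log record is a plain dict, and a processor is a
# function that returns a (possibly modified) record, or None to drop it.
Record = dict
Processor = Callable[[Record], Optional[Record]]


@dataclass
class MockReceiver:
    """Hypothetical receiver that emits a fixed list of synthetic records."""
    records: List[Record]

    def emit(self) -> List[Record]:
        # Copy each record so processors cannot mutate the test fixtures.
        return [dict(r) for r in self.records]


@dataclass
class MockExporter:
    """Hypothetical exporter that captures whatever leaves the pipeline."""
    received: List[Record] = field(default_factory=list)

    def consume(self, record: Record) -> None:
        self.received.append(record)


def run_pipeline(receiver: MockReceiver,
                 processors: List[Processor],
                 exporter: MockExporter) -> None:
    """Push every synthetic record through the processors, in order."""
    for record in receiver.emit():
        for process in processors:
            record = process(record)
            if record is None:  # a processor dropped the record
                break
        else:
            # Reached only when no processor dropped the record.
            exporter.consume(record)


# Two stand-in processors for the pipeline under test.
def drop_debug(record: Record) -> Optional[Record]:
    return None if record.get("severity") == "DEBUG" else record


def add_env(record: Record) -> Optional[Record]:
    return {**record, "env": "test"}


receiver = MockReceiver([
    {"body": "started", "severity": "INFO"},
    {"body": "noise", "severity": "DEBUG"},
])
exporter = MockExporter()
run_pipeline(receiver, [drop_debug, add_env], exporter)

# The assertion a pipeline unit test would make on the exporter side.
assert exporter.received == [
    {"body": "started", "severity": "INFO", "env": "test"},
]
```

In a real facility the processors would come from the collector config under test rather than from hand-written functions; the sketch only shows the shape of "synthetic input in, assertions on captured output".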

Describe alternatives you've considered

It's possible to test otelcol pipelines manually, which works but is slow and error-prone. Users could also write end-to-end tests of their system, asserting that logs arrive in a test environment. This is probably a good idea, but it is also high-effort; it would still be useful to have some facility for quick pipeline unit tests.

TylerHelmuth commented 6 months ago

@jmcarp are you asking for this feature as a user of the collector binary or as a user of the collector libraries? Does testbed fit your needs at all?

jmcarp commented 6 months ago

@TylerHelmuth the former: I want to write a collector config file and make assertions about its behavior, with either synthetic or real inputs. Is the testbed meant for this use case? From the docs, it seems more focused on developing the collector itself, but maybe I can repurpose it for my use case. Can it load a collector config file, receive data, and make assertions about the outputs of a pipeline?