aws-observability / aws-otel-collector

AWS Distro for OpenTelemetry Collector (see ADOT Roadmap at https://github.com/orgs/aws-observability/projects/4)
https://aws-otel.github.io/

Can't filter out noisy events from Lambda due to outdated OTel collector version (v0.39.0, last released 3 years ago) #2750

Open kmcquade opened 3 months ago

kmcquade commented 3 months ago

Describe the question

TLDR: Hi, I'm trying to filter out noisy spans using the filterprocessor + ottl in the collector config. It appears that I'm not able to do that because the collector version used in that Lambda layer is v0.39.0, which was last released 3 years ago. Filterprocessor requires at least 0.66.0, according to these Honeycomb docs. In terms of my ask - it would be fantastic if AWS could release a version of the AWS Lambda Layer that would support filterprocessor.

I created a demo to make the issue as easy as possible to reproduce: https://github.com/kmcquade/otel-lambda-failure-example. I hope this helps. Just clone it and run these commands:

git clone https://github.com/kmcquade/otel-lambda-failure-example.git
cd otel-lambda-failure-example
make deploy
make invoke
# See the README there for what to look for in the failed invocation

# Delete the stack
make delete

Use case

I am trying to filter out noisy spans using the filterprocessor + ottl on the collector side. We enabled Boto instrumentation in OpenTelemetry, but our flame graph is getting littered with spans for AWS API requests. 99% of the time, that is too much information, but sometimes it is helpful to include a span about an AWS API call. For example, if a call to S3 takes longer than a certain number of seconds, or if downtime for an AWS API affected a call.
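
To make the rule described above concrete, here is a rough Python sketch of the keep/drop decision. It only illustrates the logic; the real filtering would happen in the collector, not in application code. The `Span` class, the `should_drop` helper, the 2-second threshold, and the `error` attribute check are all assumptions made for illustration and are not part of OpenTelemetry or of the config in this issue.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration only.
NOISY_SERVICES = ("DynamoDB", "STS", "S3")
SLOW_THRESHOLD_SECONDS = 2.0


@dataclass
class Span:
    """Minimal stand-in for a trace span (not an OpenTelemetry type)."""
    name: str
    duration_seconds: float
    attributes: dict = field(default_factory=dict)


def should_drop(span: Span) -> bool:
    """Drop AWS SDK spans unless they are slow or carry an error."""
    is_aws_call = any(service in span.name for service in NOISY_SERVICES)
    if not is_aws_call:
        return False  # never drop spans from our own code
    has_error = "error" in span.attributes
    is_slow = span.duration_seconds > SLOW_THRESHOLD_SECONDS
    return not (has_error or is_slow)


spans = [
    Span("S3.GetObject", 0.05),                      # fast and healthy: noise
    Span("S3.GetObject", 3.2),                       # slow: worth keeping
    Span("DynamoDB.Query", 0.01, {"error": True}),   # failed: worth keeping
    Span("handler", 0.4),                            # application span
]
kept = [s for s in spans if not should_drop(s)]
```

In this sketch only the fast, healthy `S3.GetObject` span is dropped; the slow call, the failed call, and the application span are kept.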

Here's an example of noisy spans that I want to be able to filter out:

[image: example of noisy spans]

I tried modifying my Lambda Otel collector config to use this:

processors:
  filter:
    traces:
      spanevent:
        - 'exists(attributes["error"]) == true and IsMatch(name, ".*DynamoDB.*") == true'
        - 'exists(attributes["error"]) == true and IsMatch(name, ".*STS.*") == true'
        - 'exists(attributes["error"]) == true and IsMatch(name, ".*S3.*") == true'

But I get an Extension.Crash in the Lambda function with a message like this:

"failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:\n\n* error decoding 'processors': unknown type: \"filter\" for id: \"filter\" (valid values: [])"

Steps to reproduce if your question is related to an action

See this GitHub repository that I created to demonstrate this issue: https://github.com/kmcquade/otel-lambda-failure-example

What did you expect to see?

Environment

Describe any aspect of your environment.

See the example repository mentioned above.

If this is related to a deployment of the ADOT Collector please provide your Collector config file.

See the example repository mentioned above.

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

kmcquade commented 1 month ago

Commenting so the GitHub bot doesn't close it 😅