aws-observability / aws-otel-collector

AWS Distro for OpenTelemetry Collector (see ADOT Roadmap at https://github.com/orgs/aws-observability/projects/4)
https://aws-otel.github.io/

AWS X-Ray sampling rules are being ignored #2698

Closed StiviiK closed 3 months ago

StiviiK commented 6 months ago

Describe the bug

Hi, I have deployed the aws-otel-collector as a sidecar for my regular Java container. The Java container uses automatic instrumentation via the javaagent. I want to disable sampling for my health check endpoints, so I configured that in the AWS console under CloudWatch -> Settings -> Traces -> X-Ray sampling rules. (At first I tried limiting the rule to a specific path, /actuator/*, but as this did nothing I tried limiting sampling completely.)

[image]

But traces are still coming in as normal in X-Ray. Also, as the trend in the screenshot shows, these rules seem to do nothing at all.

According to the AWS documentation, the otel-collector supports the console sampling configuration: "You can use the AWS X-Ray console to configure sampling rules for your services. The X-Ray SDK and AWS services that support active tracing with sampling configuration use sampling rules to determine which requests to record."

Steps to reproduce

I am running ECS tasks on an EC2 instance (EC2 is probably not that relevant). Here is a snippet of the most important configuration:

Task configuration:

container_definitions = jsonencode([
    {
      "healthCheck" : {
        "command" : ["/healthcheck"],
        "interval" : 5,
        "timeout" : 6,
        "retries" : 5,
        "startPeriod" : 1
      },
      "command" : [
        # https://aws-otel.github.io/docs/adot-collector-using-ecs#understanding-your-configuration
        "--config=/etc/ecs/ecs-cloudwatch-xray.yaml"
      ],
      "image" : "amazon/aws-otel-collector",
      "name" : "aws-otel-collector"
    },
    {
      "image" : "${var.aws_ecr_host}:${var.image_tag}",
      "links" : [
        "aws-otel-collector" # Can probably remove this when using awsvpc for network_mode
      ],
      [...]
    }
])

Application Dockerfile with auto-instrumentation:

# AWS Distro for OpenTelemetry
ADD --chown=nonroot:nonroot ${AWS_OTEL_SOURCE} /opt/aws-opentelemetry-agent.jar
ENV OTEL_RESOURCE_ATTRIBUTES="service.namespace=WorkflowPlatform,service.name=WFPBackend"
ENV OTEL_EXPORTER_OTLP_ENDPOINT="http://aws-otel-collector:4317"
ENV OTEL_TRACES_EXPORTER="otlp"
ENV OTEL_METRICS_EXPORTER="otlp"
ENV OTEL_TRACES_SAMPLER="xray"
# ENV OTEL_PROPAGATORS="tracecontext,baggage,xray"

# Run the application
ENTRYPOINT ["java", "-javaagent:/opt/aws-opentelemetry-agent.jar", "org.springframework.boot.loader.launch.JarLauncher"]

Environment

Collector configuration file: https://github.com/aws-observability/aws-otel-collector/blob/main/config/ecs/ecs-cloudwatch-xray.yaml

StiviiK commented 6 months ago

Update: I have enabled and configured the awsproxy extension, but the sampling rules still seem to be ignored. This is the documentation I followed: https://aws-otel.github.io/docs/getting-started/remote-sampling.
My config is basically the same as above, with the extension configuration added.
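
For readers following along, the setup that guide describes looks roughly like the fragment below. This is a minimal sketch and not the exact configuration used in this report: it assumes the proxy listens on port 2000 and that the fragment is merged into a copy of ecs-cloudwatch-xray.yaml, whose existing receivers, exporters, and pipelines stay unchanged.

```yaml
# Sketch only: values here are assumptions, not the reporter's actual config.
extensions:
  awsproxy:
    # Address the sampler in the application polls for sampling rules.
    # 0.0.0.0 makes the proxy reachable from the linked application container.
    endpoint: 0.0.0.0:2000

service:
  # health_check is assumed to be enabled in the base config already;
  # list whatever the base config enables, plus awsproxy.
  extensions: [health_check, awsproxy]
```

The Java agent also has to be told where to fetch the rules, typically through `OTEL_TRACES_SAMPLER_ARG` (for example `endpoint=http://aws-otel-collector:2000`, where the hostname is the container link name from the task definition above, which is an assumption about this particular setup).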

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] commented 3 months ago

This issue was closed because it has been marked as stale for 30 days with no activity.

StiviiK commented 3 months ago

This is still not resolved, but I am not sure how to tell the bot to reopen the issue. /reopen