open-telemetry / opentelemetry-js-contrib

OpenTelemetry instrumentation for JavaScript modules
https://opentelemetry.io
Apache License 2.0

[detectors-aws] incomplete containerId detected for AWS ECS Fargate #2455

Open Steffen911 opened 2 months ago

Steffen911 commented 2 months ago

What version of OpenTelemetry are you using?

    "@opentelemetry/api": "^1.9.0",
    "@opentelemetry/core": "^1.26.0",
    "@opentelemetry/exporter-trace-otlp-proto": "^0.53.0",
    "@opentelemetry/instrumentation": "^0.53.0",
    "@opentelemetry/instrumentation-aws-sdk": "^0.44.0",
    "@opentelemetry/instrumentation-http": "^0.53.0",
    "@opentelemetry/instrumentation-ioredis": "^0.43.0",
    "@opentelemetry/instrumentation-winston": "^0.40.0",
    "@opentelemetry/resource-detector-aws": "^1.6.1",
    "@opentelemetry/resource-detector-container": "^0.4.1",
    "@opentelemetry/resources": "^1.26.0",
    "@opentelemetry/sdk-node": "^0.53.0",
    "@opentelemetry/sdk-trace-base": "^1.26.0",
    "@opentelemetry/sdk-trace-node": "^1.26.0",
    "@opentelemetry/winston-transport": "^0.6.0",

What version of Node are you using?

20

What did you do?

I initialize tracing, similar to the setup below, as one of the first actions in my Node.js Express server. I bundle the whole application with Docker and deploy it to AWS ECS on Fargate.

import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { AwsInstrumentation } from "@opentelemetry/instrumentation-aws-sdk";
import {
  envDetector,
  processDetector,
  Resource,
} from "@opentelemetry/resources";
import { awsEcsDetectorSync } from "@opentelemetry/resource-detector-aws";
import { containerDetector } from "@opentelemetry/resource-detector-container";
import { env } from "@/src/env.mjs";

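// Configure the SDK with a static service name, an OTLP trace exporter,
// AWS SDK instrumentation, and the resource detectors under discussion.
const sdk = new NodeSDK({
  resource: new Resource({
    "service.name": env.OTEL_SERVICE_NAME,
  }),
  traceExporter: new OTLPTraceExporter({
    url: `${env.OTEL_EXPORTER_OTLP_ENDPOINT}/v1/traces`,
  }),
  instrumentations: [new AwsInstrumentation()],
  resourceDetectors: [
    envDetector,
    processDetector,
    awsEcsDetectorSync,
    containerDetector,
  ],
});

// Start the SDK so the detectors run and tracing begins.
sdk.start();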

What did you expect to see?

I would expect to see something like <taskId>-<containerId> as the container.id in my span attributes. This would fit the conventions that observability vendors like Datadog use. For a task named c23e5f76c09d438aa1824ca4058bdcab I'd expect to see something like c23e5f76c09d438aa1824ca4058bdcab-1234678 for a single container.

What did you see instead?

As part of https://github.com/aws/amazon-ecs-agent/issues/1119, AWS apparently changed its cgroup naming convention to something like /ecs/<taskId>/<taskId>-<containerId>. Given the 64-character limit, the detected value is usually cut off in the middle of the first taskId. For the example above, this would yield a container.id value like 438aa1824ca4058bdcab/c23e5f76c09d438aa1824ca4058bdcab-1234678.
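
The truncation described above can be reproduced with a small sketch. It assumes, as this issue implies, that the detector keeps only the trailing 64 characters of a cgroup line; the helper name `trailingChars` and the sample line are hypothetical and not taken from the detector's source.

```typescript
// Assumption: only the trailing 64 characters of a cgroup line are kept.
const CONTAINER_ID_LENGTH = 64;

// Hypothetical helper that mimics the assumed truncation.
function trailingChars(
  cgroupLine: string,
  length: number = CONTAINER_ID_LENGTH
): string {
  return cgroupLine.length > length ? cgroupLine.slice(-length) : cgroupLine;
}

// Made-up cgroup line in the /ecs/<taskId>/<taskId>-<containerId> layout.
const fargateLine =
  "11:devices:/ecs/c23e5f76c09d438aa1824ca4058bdcab/c23e5f76c09d438aa1824ca4058bdcab-1234678";

const truncated = trailingChars(fargateLine);
console.log(truncated.length); // prints 64
console.log(truncated.includes("/")); // prints true: part of the first taskId leaks in
```

Because the path is longer than 64 characters, the kept window starts inside the first taskId and spans the `/`, which is the shape of the value reported here.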

Is there some way to consistently receive only the part after the last / in the cgroup name, i.e. the last chunk? Happy to contribute if this seems desirable. It should probably follow a regex-based approach like in https://github.com/DataDog/dd-trace-js/pull/1176.
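
A regex-based extraction could look roughly like the sketch below. This is an illustration only, not the detector's code and not Datadog's implementation: the name `extractContainerId` and the pattern are assumptions, chosen to cover the two ID shapes mentioned in this issue (a 64-character hex ID, or a 32-character hex task ID followed by `-` and digits).

```typescript
// Assumed ID shapes (not taken from any library's source):
//   - 64 hex characters
//   - <32 hex task ID>-<digits>, the Fargate shape described in this issue
// An optional ".scope" suffix is tolerated but not captured.
const CONTAINER_ID_PATTERN = /([0-9a-f]{64}|[0-9a-f]{32}-\d+)(?:\.scope)?$/;

// Hypothetical helper: returns the ID at the end of one cgroup line,
// or undefined when the line does not end in a recognised shape.
function extractContainerId(cgroupLine: string): string | undefined {
  const match = CONTAINER_ID_PATTERN.exec(cgroupLine.trim());
  return match ? match[1] : undefined;
}

// Made-up cgroup line using the task ID from this issue.
const line =
  "11:devices:/ecs/c23e5f76c09d438aa1824ca4058bdcab/c23e5f76c09d438aa1824ca4058bdcab-1234678";
console.log(extractContainerId(line));
// prints "c23e5f76c09d438aa1824ca4058bdcab-1234678"
```

Anchoring the pattern at the end of the line means the leading /ecs/<taskId>/ part is never included, regardless of how long the full path is.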

Victorsesan commented 1 month ago

@pichlermarc Please assign this to me; I'd like to give it a try.

Annosha commented 1 month ago

@Victorsesan are you actively working on it?

Victorsesan commented 1 month ago

@Annosha Nope, it has not been assigned to anyone yet. @pichlermarc

pichlermarc commented 1 month ago

I'll assign @Victorsesan as they were first.

Victorsesan commented 1 month ago

@pichlermarc Thanks for the opportunity, I appreciate it. I have gone through the Datadog trace issue but couldn't get my head around the regex-based approach. Pardon my ignorance, I'm a complete novice at this, but I still want to fix it. I'm not 100% sure whether the incomplete containerId comes from the setup above or from one of the AWS detector files in the repo, and if so, which one. If it comes from the setup above, I have made a few changes that might help fix the issue: I have created a custom resource detector that processes the cgroup name and extracts the last segment. For the regex, I'm not sure whether it matches the approach Datadog uses, but I have used a pattern like [^/]+$ to match everything after the last / and extract it. @Steffen911 Any pointers on how to implement this so that it serves your need would be very much appreciated. TIA
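
For reference, the `[^/]+$` idea from the comment above can be sketched as follows. This is a rough illustration under stated assumptions, not the code from that custom detector: `lastCgroupSegment` is a hypothetical name, and the sample content is made up to match the path layout described in this issue.

```typescript
// Pattern from the comment above: everything after the last "/".
const LAST_SEGMENT = /[^/]+$/;

// Hypothetical helper: scan cgroup file content line by line and return
// the last path segment of the first line that has one.
function lastCgroupSegment(cgroupContent: string): string | undefined {
  for (const rawLine of cgroupContent.split("\n")) {
    const line = rawLine.trim();
    // Skip lines without a path separator, such as empty lines.
    if (!line.includes("/")) continue;
    const match = LAST_SEGMENT.exec(line);
    if (match) return match[0];
  }
  return undefined;
}

// Made-up sample in the /ecs/<taskId>/<taskId>-<containerId> layout.
const sample = [
  "11:devices:/ecs/c23e5f76c09d438aa1824ca4058bdcab/c23e5f76c09d438aa1824ca4058bdcab-1234678",
  "10:memory:/ecs/c23e5f76c09d438aa1824ca4058bdcab/c23e5f76c09d438aa1824ca4058bdcab-1234678",
].join("\n");

console.log(lastCgroupSegment(sample));
// prints "c23e5f76c09d438aa1824ca4058bdcab-1234678"
```

Note that `[^/]+$` accepts any trailing path segment, so on a host that is not a container it would still return something (for example `init.scope`). A stricter pattern that checks the shape of the ID may be preferable if the result is reported as container.id.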