open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0
2.75k stars 808 forks source link

@opentelemetry/instrumentation-http - tracing of incoming/outgoing requests stopped working #5001

Closed peterabbott closed 1 month ago

peterabbott commented 1 month ago

What happened?

Steps to Reproduce

When enabled, incoming and outgoing requests are automatically traced without having to manually create an active span.

Expected Result

an active span is available when making an http request

Actual Result

no active span available

Additional Details

The follow on from this, where we spotted, the winston instrumentation stopped adding the trace_id to the log messages because there was no active span. Have confirmed downgrading to 0.52 restores expected behaviour. Playing with the new disable incoming/output properties in 0.53 do nothing to change behaviour.

Related, but not specific to this ticket which is specific to Remix

The spans we receive from the system all start at downstream from out API (http) services, for example Lambda and ECS. Rolling back to 0.52 the API endpoints show up.

OpenTelemetry Setup Code

const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http');
const {
  ConsoleSpanExporter,
  NodeTracerProvider,
  SimpleSpanProcessor,
} = require('@opentelemetry/sdk-trace-node');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');

const provider = new NodeTracerProvider();

provider.addSpanProcessor(new SimpleSpanProcessor(new ConsoleSpanExporter()));
provider.register();

registerInstrumentations({
  instrumentations: [new HttpInstrumentation()],
});

package.json

// relevant bits from package.json
{
  "engines": {
    "node": "20.17.0"
  },
  "dependencies": {
...
    "@opentelemetry/api": "1.7.0",
    "@opentelemetry/exporter-trace-otlp-grpc": "0.53.0",
    "@opentelemetry/id-generator-aws-xray": "1.2.2",
    "@opentelemetry/instrumentation": "0.53.0",
    "@opentelemetry/instrumentation-aws-lambda": "0.44.0",
    "@opentelemetry/instrumentation-aws-sdk": "0.44.0",
    "@opentelemetry/instrumentation-dns": "0.39.0",
    "@opentelemetry/instrumentation-generic-pool": "0.39.0",
    "@opentelemetry/instrumentation-graphql": "0.43.0",
    "@opentelemetry/instrumentation-grpc": "0.53.0",
    "@opentelemetry/instrumentation-http": "0.52.0",
    "@opentelemetry/instrumentation-ioredis": "0.43.0",
    "@opentelemetry/instrumentation-koa": "0.43.0",
    "@opentelemetry/instrumentation-net": "0.39.0",
    "@opentelemetry/instrumentation-pg": "0.44.0",
    "@opentelemetry/instrumentation-winston": "0.40.0",
    "@opentelemetry/otlp-exporter-base": "0.53.0",
    "@opentelemetry/propagator-aws-xray": "1.26.0",
    "@opentelemetry/resource-detector-aws": "1.6.1",
    "@opentelemetry/resources": "1.26.0",
    "@opentelemetry/sdk-metrics": "1.26.0",
    "@opentelemetry/sdk-trace-base": "1.26.0",
    "@opentelemetry/sdk-trace-node": "1.26.0",
    "@opentelemetry/semantic-conventions": "1.27.0",
....
    "opentelemetry-instrumentation-ws": "0.5.0",
....
    "winston": "3.14.2"
}

Relevant log output

with 0.52 we get active spans, with winston logs 
{ "trace_id": "12345", "span_id": "4567", .... }

With 0.53, no active span is created via the http-instrumentation so get nothing in the logs.
pichlermarc commented 1 month ago

@peterabbott thanks for reaching out. I've tried to reproduce this but was not able to get the same behavior you're describing.

My attempt uses the node http module to run a very simple server and winston for logging. When sending a request to that server, and logging in the handler, the span_id and trace_id get attached properly. (see https://github.com/pichlermarc/repro-5001)

I think I may be misunderstanding where you'd expect the context to be active so that it can be attached to the winston logs. Could you point me in the right direction?

peterabbott commented 1 month ago

@pichlermarc sorry, missed your question.

From what I can tell when using the latest version it is not hitting here, so either the function is not getting wrapped or is baling out here. In bumping the version we don't change anything else, just the version. I'll have a go at putting together a test to demonstrate.

peterabbott commented 1 month ago

@pichlermarc with that demo project I was not able to recreate with that demo project you created. There must be something else going on then. I have seen in some cases issues with the ordering of the instrumentation registration. Maybe something in that. I'll close for now, do some more debugging on my full app to see what is going on. If I recreate then reopen if needed