open-telemetry / opentelemetry-js-contrib

OpenTelemetry instrumentation for JavaScript modules
https://opentelemetry.io
Apache License 2.0
697 stars 513 forks source link

[instrumentation-winston] May duplicate transport (and logs) without notice #2283

Closed falsyvalues closed 3 months ago

falsyvalues commented 4 months ago

What version of OpenTelemetry are you using?

    "@opentelemetry/api": "^1.8.0",
    "@opentelemetry/api-logs": "^0.51.1",
    "@opentelemetry/auto-instrumentations-node": "^0.46.0",
    "@opentelemetry/winston-transport": "^0.3.0",

What version of Node are you using?

node --version
v18.20.3

What did you do?

I found an issue while testing instrumentation-winston which may lead to duplicated logs (via duplicated transports). To have logger working with logs exporter we may be tempted to follow [winston-transport/README.md#usage] (https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/packages/winston-transport/README.md#usage) guide, extended with exporter configuration (posted also in #2090) e.g.

import * as api from "@opentelemetry/api-logs";
import { OTLPLogExporter } from "@opentelemetry/exporter-logs-otlp-http";
import {
  detectResourcesSync,
  envDetectorSync,
  hostDetectorSync,
  processDetectorSync,
} from "@opentelemetry/resources";
import {
  BatchLogRecordProcessor,
  LoggerProvider
} from "@opentelemetry/sdk-logs";
import { OpenTelemetryTransportV3 } from "@opentelemetry/winston-transport";

import * as winston from "winston";

const logExporter = new OTLPLogExporter();
const loggerProvider = new LoggerProvider({
  resource: detectResourcesSync({
    detectors: [envDetectorSync, processDetectorSync, hostDetectorSync],
  }),
});

loggerProvider.addLogRecordProcessor(
  new BatchLogRecordProcessor(logExporter)
);
api.logs.setGlobalLoggerProvider(loggerProvider);

export const logger = winston.createLogger({
  level: "info",
  transports: [
    new winston.transports.Console(),
    new OpenTelemetryTransportV3(),
  ],
});

logger.info('Hello world');

And that will already cause a problem combined with instrumentation-winston because transport aren't deduplicated and instrumentation.ts#L220 adds it without deduplication even if transport is already there. Which in the end leads to duplicated logs in collector / APM.

What did you expect to see?

I expect that I won't have duplicated logs at the end.

What did you see instead?

I observed duplicated logs because instrumentation added another OpenTelemetryTransportV3 to list of transports in winston instance.

Additional context

Instrumentation code that adds transport:

const openTelemetryTransport = new OpenTelemetryTransportV3(
  transportOptions
);
if (originalTransports && !Array.isArray(originalTransports)) {
  newTransports = [originalTransports];
}
newTransports.push(openTelemetryTransport);
args[0].transports = newTransports;

instrumentation.ts#L220

BradPlayerZero commented 4 months ago

Not totally sure the root cause here, but I've found removing the new OpenTelemetryTransportV3() from the createLogger seems to fix the duplicates.

trentm commented 4 months ago

I wonder if the Bunyan instrumentation effectively will do the same.

trentm commented 4 months ago

All, I wonder if part of the issue here (and on #2090) is confusion over the intent of how to get log-sending working with Winston. The intent is that one can do either of the following:

  1. Use WinstonInstrumentation from @opentelemetry/instrumentation-winston, which is intended to automatically add a OpenTelemetryTransportV3 instance to created Winston loggers; or
  2. Manually add a OpenTelemetryTransportV3 instance (from @opentelemetry/winston-transport) to Winston loggers in the createLogger call.

In the original example here, both are being done, hence the duplicate transports.

Two options for avoiding this:

A. Improve the instrumentation-winston documentation to make this clear.

Here is the equivalent documentation discussion for instrumentation-bunyan. Is that clearer for Bunyan users? My guess is the confusion comes from the @opentelemetry/winston-transport README. There is a sentence in the Usage section saying one doesn't need to use this directly if using instrumentation-winston. Should that be in bold or somehow much more prominent?

This comment predicted some possible confusion.

B. Consider having instrumentation-winston (by default, or optionally) skip adding OpenTelemetryTransportV3 if it looks like there is already one there.

I would initially be hesitant to do that. It gets into the realm of DWIM'ing that can lead to subtle surprises. instrumentation-bunyan currently does not do this skip-on-possible-duplicate handling. The coming instrumentation-pino support for log sending won't be able to do this.

Note that using isinstance OpenTelemetryTransportV3 isn't necessarily reliable, because there could be separate installations of @opentelemetry/winston-transport in the install tree.

falsyvalues commented 4 months ago

Thanks @trentm

All, I wonder if part of the issue here (and on #2090) is confusion over the intent of how to get log-sending working with Winston. The intent is that one can do either of the following:

  1. Use WinstonInstrumentation from @opentelemetry/instrumentation-winston, which is intended to automatically add a OpenTelemetryTransportV3 instance to created Winston loggers; or
  2. Manually add a OpenTelemetryTransportV3 instance (from @opentelemetry/winston-transport) to Winston loggers in the createLogger call.

In the original example here, both are being done, hence the duplicate transports.

Will it be more confusing when I will add to that if we won't pass transports to winston, we will get an error log:

[winston] Attempt to write logs with no transports, which can increase memory usage:

Two options for avoiding this:

A. Improve the instrumentation-winston documentation to make this clear.

Here is the equivalent documentation discussion for instrumentation-bunyan. Is that clearer for Bunyan users? My guess is the confusion comes from the @opentelemetry/winston-transport README. There is a sentence in the Usage section saying one doesn't need to use this directly if using instrumentation-winston. Should that be in bold or somehow much more prominent?

It seems to me that the structure of the description does not help in this case:

If using @opentelemetry/instrumenation-winston package there is no need to instantiate the transport as the instrumentation will take care of that. For example:

And then, there is an example where winston-transport is being used. Worth to adjust wording, formatting and maybe example - add comment in the code?

This comment predicted some possible confusion.

B. Consider having instrumentation-winston (by default, or optionally) skip adding OpenTelemetryTransportV3 if it looks like there is already one there.

I would initially be hesitant to do that. It gets into the realm of DWIM'ing that can lead to subtle surprises. instrumentation-bunyan currently does not do this skip-on-possible-duplicate handling. The coming instrumentation-pino support for log sending won't be able to do this.

If that won't be the case then maybe we should have more examples e.g. manual instrumentation that pretty much replicates @opentelemetry/auto-instrumentations-node behaviour with more granular control (this is how I understand SDK goal). Not having to repeat stuff like resource collecting etc.

const loggerProvider = new LoggerProvider({
  resource: detectResourcesSync({
    detectors: [envDetectorSync, processDetectorSync, hostDetectorSync],
  }),
});

Note that using isinstance OpenTelemetryTransportV3 isn't necessarily reliable, because there could be separate installations of @opentelemetry/winston-transport in the install tree.

Thats one things but realms CJS/ESM is another problem but solvable I think.