winstonjs / winston

A logger for just about everything.
http://github.com/winstonjs/winston
MIT License
22.64k stars 1.8k forks source link

[Bug]: Formatter runs on log call, even if log is not correct level #2441

Open cerpins opened 5 months ago

cerpins commented 5 months ago

🔎 Search Terms

Slow formatter

The problem

My application has formatters, which are slow. When I run my application with level=info to get rid of debug logs in production, I notice that the formatter is ran for those debug logs, causing a considerable performance hit - even though level is info.

Please let me know if there is a way to work around this, or am I just misunderstanding winston and formatters. Thanks!

Here is my profiler: image

Here is my minimal-ish reproducible:

const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.json(),
    winston.format((input) => {
      let lagger = 0;
      for (let i = 0; i < 1000000000; i++) {
        lagger += 1;
      }
      return input;
    })()
  ),
  defaultMeta: { service: 'user-service' },
  transports: [
    new winston.transports.Console({
      format: winston.format.simple(),
    })
  ],
});

setInterval(() => {
  // I still see formatter for this log in profiler taking up some seconds
  logger.debug('Hello world!');
}, 1000);

What version of Winston presents the issue?

v3.13.0

What version of Node are you using?

v18.18.0

If this worked in a previous version of Winston, which was it?

No response

Minimum Working Example

Posted above

Additional information

No response

DanKaplanSES commented 2 months ago

I can confirm this. Here is my code (typescript):

import { parse } from 'stacktrace-parser';
import { format } from 'winston';
import { basename } from 'node:path';

export const logPrefix = format((info, opts) => {
  console.log(`here!!`);
  const stack = new Error().stack;
  const stackFrames = parse(stack);
  const logFrameIndex =
    stackFrames.findIndex((stackFrame) => {
      return stackFrame.file.endsWith(`create-logger.js`);
    }) + 1;

  if (logFrameIndex === 0) {
    info.message = `unknown: ${info.message}`;
  } else {
    const logFrame = stackFrames[logFrameIndex];
    let prefix = new Date().toISOString();
    prefix += ' ' + basename(logFrame.file);
    prefix += `#${logFrame.lineNumber}`;
    info.message = `${prefix}: ${info.message}`;
  }

  return info;
});

This is how I use the formatter:

import { createLogger, Logger, format, transports } from 'winston';
import { logPrefix } from './log-prefix';

export type Category =
  | 'DEFAULT'
  | 'another category'
  | 'even another category';

const logLevels = ['error', 'warn', 'info', 'debug'] as const;
export type LogLevel = (typeof logLevels)[number];

const loggerPool = new Map<LogLevel, Logger>([
  ['debug', createLogger(buildLoggerOptions('debug'))],
  ['info', createLogger(buildLoggerOptions('info'))],
  ['warn', createLogger(buildLoggerOptions('warn'))],
  ['error', createLogger(buildLoggerOptions('error'))],
]);

function buildLoggerOptions(configuredLevel: LogLevel) {
  return {
    level: configuredLevel,
    format: format.combine(logPrefix(), format.simple()),
    transports: new transports.Console(),
    exceptionHandlers: new transports.Console(),
    rejectionHandlers: new transports.Console(),
    exitOnError: false,
  };
}

export const getLogger = (categoryName: Category = 'DEFAULT'): Logger => {
  if (getEnvLevelValue('DEFAULT') === undefined) {
    throw new Error(
      `Logger Configuration Error: Environment Variable '${getEnvLevelKey(
        'DEFAULT'
      )}' does not exist.`
    );
  }

  const configuredLevel = getEnvLevelValue(categoryName);
  if (configuredLevel === undefined) {
    getLogger('DEFAULT').warn(
      `Environment Variable '${getEnvLevelKey(
        categoryName
      )}' does not exist. Defaulting to 'debug'.`
    );
  }

  return loggerPool.get(configuredLevel ?? 'debug');
};

const getEnvLevelKey = (categoryName: Category): string => {
  return `logger.${categoryName}.level`;
};

const getEnvLevelValue = (categoryName: Category): LogLevel | undefined => {
  const value = process.env[getEnvLevelKey(categoryName)];

  if (value === undefined) {
    return undefined;
  } else if (includes(logLevels, value)) {
    return value;
  }
  throw new Error(
    `Invalid value for getEnvLevelKey(categoryName): '${getEnvLevelKey(
      categoryName
    )}' Expected one of ${logLevels}`
  );
};

function includes<S extends string>(
  haystack: readonly S[],
  needle: string
): needle is S {
  const _haystack: readonly string[] = haystack;
  return _haystack.includes(needle);
}

That console.log(`here!!`); runs 10s/100s of times without a single winston log printing. I'm not sure if this is a bug. It could be:

  1. The two of us are making a similar mistake and there's nothing wrong with winston. In this case, I think some documentation may be lacking.
  2. This is by design, in which case, it seems like low hanging fruit to increase winston's performance.
  3. This is a bug. I tested this in winston 3.12.0 and winston 3.13.0, so it's not brand-new.

By the way, the logform docs hint at a workaround: if my formatter returned falsy before the expensive computation, the performance issue would be fixed in my formatter. I just don't know how to know when I should be returning falsy because the formatter API doesn't provide the information I need: the log level of the logger.

I want to point out that this would be a workaround and not a fix: the formatters I did not create would still be called whether they needed to or not.

DanKaplanSES commented 2 months ago

As a workaround, I figured out how to pass the current logger level into the formatter:

import { parse } from 'stacktrace-parser';
import { format, config } from 'winston'; // changed
import { TransformFunction } from 'logform';
import { basename } from 'node:path';

const levelToNumber = config.npm.levels; // new

const transform: TransformFunction = (info, opts) => {
  if (opts?.loggerLevel !== undefined) { // new
    if (levelToNumber[opts?.loggerLevel] < levelToNumber[info.level]) { // new
      return false; // new
    } // new
  } // new

  const stack = new Error().stack;
  const stackFrames = parse(stack);
  const logFrameIndex =
    stackFrames.findIndex((stackFrame) => {
      return stackFrame.file.endsWith(`create-logger.js`);
    }) + 1;

  if (logFrameIndex === 0) {
    info.message = `unknown: ${info.message}`;
  } else {
    const logFrame = stackFrames[logFrameIndex];
    let prefix = new Date().toISOString();
    prefix += ' ' + basename(logFrame.file);
    prefix += `#${logFrame.lineNumber}`;
    info.message = `${prefix}: ${info.message}`;
  }

  return info;
};

export const logPrefix = format(transform);

This is how I use it:

import { createLogger, Logger, format, transports } from 'winston';
import { logPrefix } from './log-prefix';

export type Category =
  | 'DEFAULT'
  | 'another category'
  | 'even another category';

const logLevels = ['error', 'warn', 'info', 'debug'] as const;
export type LogLevel = (typeof logLevels)[number];

const loggerPool = new Map<LogLevel, Logger>([
    ['debug', createLogger(buildLoggerOptions('debug'))], // changed
    ['info', createLogger(buildLoggerOptions('info'))], // changed
    ['warn', createLogger(buildLoggerOptions('warn'))], // changed
    ['error', createLogger(buildLoggerOptions('error'))], // changed
]);

function buildLoggerOptions(configuredLevel: LogLevel) {
  return {
    level: configuredLevel,
    format: format.combine(logPrefix({ loggerLevel: configuredLevel }), format.simple()),  // changed
    transports: new transports.Console(),
    exceptionHandlers: new transports.Console(),
    rejectionHandlers: new transports.Console(),
    exitOnError: false,
  };
}

export const getLogger = (categoryName: Category = 'DEFAULT'): Logger => {
  if (getEnvLevelValue('DEFAULT') === undefined) {
    throw new Error(
      `Logger Configuration Error: Environment Variable '${getEnvLevelKey(
        'DEFAULT'
      )}' does not exist.`
    );
  }

  const configuredLevel = getEnvLevelValue(categoryName);
  if (configuredLevel === undefined) {
    getLogger('DEFAULT').warn(
      `Environment Variable '${getEnvLevelKey(
        categoryName
      )}' does not exist. Defaulting to 'debug'.`
    );
  }

  return loggerPool.get(configuredLevel ?? 'debug');
};

const getEnvLevelKey = (categoryName: Category): string => {
  return `logger.${categoryName}.level`;
};

const getEnvLevelValue = (categoryName: Category): LogLevel | undefined => {
  const value = process.env[getEnvLevelKey(categoryName)];

  if (value === undefined) {
    return undefined;
  } else if (includes(logLevels, value)) {
    return value;
  }
  throw new Error(
    `Invalid value for getEnvLevelKey(categoryName): '${getEnvLevelKey(
      categoryName
    )}' Expected one of ${logLevels}`
  );
};

function includes<S extends string>(
  haystack: readonly S[],
  needle: string
): needle is S {
  const _haystack: readonly string[] = haystack;
  return _haystack.includes(needle);
}
DanKaplanSES commented 1 month ago

Sorry for the ping, @DABH , but this affects the performance of every winston user, so I'd like to get it triaged sooner than later. Thanks!

DABH commented 1 month ago

Howdy! This seems like a great thing to investigate. I believe your reports that formatters are getting called even if the item is ultimately not being logged, which is obviously a glaring perf issue that would be wonderful to fix. Ideally we could fix this at the winston not logform level so that we don't have to pass an extra object into every formatter call (which would make the abstraction kind of leaky as well).

https://github.com/winstonjs/winston/blob/17192754733c40d261d76b690d84d4d2608c05f1/lib/winston/logger.js#L308-L313 makes it sound like the intent was indeed to only call format when an item is actually being logged, but I guess that's not how things worked out (or I'm reading too much into that).

Filtering for level happens at the winston-transport level https://github.com/winstonjs/winston-transport/blob/master/modern.js#L80 not the winston level though.

Hmm but so does transformation/formatting https://github.com/winstonjs/winston-transport/blob/master/modern.js#L91.

But seems like that level filtering doesn't happen when you call log on the logger vs. writing directly to a transport https://github.com/winstonjs/winston/blob/17192754733c40d261d76b690d84d4d2608c05f1/lib/winston/logger.js#L207 - log just writes to each of the transports and then the transports each do their own thing.

[whoops left this draft unfinished for a few days]

I think my conclusion so far is that transports filter early based on log level, but the logger does not. If you configure a log level(s) on your transport rather than the logger, do things work better for you? We could add some similar short-circuiting code the logger itself, because I think the logger's level is supposed to override whatever levels the transports set?? But I'm trying to think about why that code would not be in there already, it's possible there's some edge case I'm missing around child loggers, if transports should have higher precedence than loggers, or some bad performance hit, or? (or maybe this is just some glaring oversight in the logger class?)