winstonjs / winston-daily-rotate-file

A transport for winston which logs to a rotating file each day.
MIT License
889 stars 151 forks source link

On rotation the new file's owner is not always the same. Sometimes it is the user, sometimes it is root #374

Open JasonGloverNZ opened 1 year ago

JasonGloverNZ commented 1 year ago

I'm running into a frustrating problem that was a little hard to track down. I have a background process that was trouble-free until a few weeks ago and then suddenly it developed the following error...

EACCES: permission denied, open '../logs/textract-prod/default-2023-06-07.log'

Once this error occurs the process enters a failed state. It is controlled by pm2, which will restart it, but as soon as it tries to log to the file (which it does on startup) it encounters the permission problem and it crashes, and gets restarted again.

I found that I could temporarily fix the problem by stopping the process and renaming the logging directory. When I restart the process it creates a new log file which has the correct permissions. I should mention at this point that I am running the process in user space, and the log directory is in the user's home directory.

User: bigbird Log directory: /home/bigbird/logs/textract-prod/ OS: Ubuntu 18.04.6 LTS

Looking at the logs I noticed a pattern: the file permission problem always starts at midnight UTC. Which led me to the log file rotator.

I confirmed this theory by checking the file permissions of a set of files:

-rw-r--r-- 1 bigbird bigbird    12782 Jun  1 00:00 default-2023-05-31.log.gz
-rw-r--r-- 1 root   root     170817 Jun  2 00:00 default-2023-06-01.log.gz
-rw-r--r-- 1 bigbird bigbird     5983 Jun  3 00:00 default-2023-06-02.log.gz
-rw-r--r-- 1 root   root       8034 Jun  4 00:00 default-2023-06-03.log.gz
-rw-r--r-- 1 bigbird bigbird     3349 Jun  5 00:00 default-2023-06-04.log.gz
-rw-r--r-- 1 root   root     144375 Jun  6 00:00 default-2023-06-05.log.gz
-rw-r--r-- 1 root   root       3332 Jun  7 00:00 default-2023-06-06.log.gz
-rw-r--r-- 1 root   root     359175 Jun  7 20:52 default-2023-06-07.log
-rw-r--r-- 1 bigbird bigbird       20 Jun  2 00:00 exceptions-2023-05-31.log.gz
-rw-r--r-- 1 bigbird bigbird 23026509 Jun  2 23:59 exceptions-2023-06-02.log
-rw-r--r-- 1 bigbird bigbird       20 Jun  4 00:00 exceptions-2023-06-03.log.gz
-rw-r--r-- 1 bigbird bigbird   761161 Jun  5 00:00 exceptions-2023-06-04.log.gz
-rw-r--r-- 1 bigbird bigbird      505 Jun  6 00:00 exceptions-2023-06-05.log.gz
-rw-r--r-- 1 bigbird bigbird 23029489 Jun  6 23:59 exceptions-2023-06-06.log
-rw-r--r-- 1 bigbird bigbird       10 Jun  7 00:00 exceptions-2023-06-06.log.gz
-rw-r--r-- 1 bigbird bigbird 18453511 Jun  7 19:13 exceptions-2023-06-07.log

So straight away I see the rotator has a random chance of creating the new log file with the current user as the owner, but sometimes root is the owner.

What the ____ ?!

The only "odd" thing that we do in this process is we spawn some python processes. We capture the stdout and stderr from those processes in the parent process and write that to the log. I'm wondering if that has something to do with why we sometimes get a root-owner log file.

const child_process = require("child_process");
const defaultLogger = require("./logger");

const pythonRunner = {
  spawnPython3Process : async(pythonScript, processCounter, ...args) => {
    try{
      args.unshift(pythonScript); // Add the script to the beginning of the array.
      defaultLogger.info(`parallel processes = ${++processCounter.count}`);
      const pyProcess = child_process.spawn("python3", args);
      pyProcess.stdout.on("data", function (data) {
        const output = data.toString();
        if (output.startsWith('INFO:'))
          defaultLogger.info(`python > ${output.substring(5).trim()}`);
        else if (output.startsWith('ERROR:'))
          defaultLogger.error(`python > ${output.substring(6).trim()}`);
        else if (output.startsWith('WARN:'))
          defaultLogger.error(`python > ${output.substring(5).trim()}`);
      });
      pyProcess.stderr.on("data", function (data) {
        defaultLogger.error(`python > ${data.toString()}`);
      });
      pyProcess.on('close', (code) => {
        defaultLogger.info(`parallel processes = ${--processCounter.count}`);
      });
    }
    catch(error)
    {
      defaultLogger.error(`Fatal error spawning python3 script => ${pythonScript}`);
    }
  }
};

module.exports = pythonRunner;

The reason I don't think this is the problem:

  1. We are doing the logging back in the main process, which is running in user space.
  2. child_process.spawn should be creating processes in the same user space as the parent process.

I also note that we don't get this problem on other environments, OSes and architectures.

JasonGloverNZ commented 1 year ago

UPDATE: I had a theory that maybe the spawned processes were running under a different user space and that was causing the rotated files to be created with root as the owner.

I tested that by emitting the uid and gid both in the parent process, but also in the stdout and stderr event handlers.

Same values.

domingguss commented 11 months ago

following.

We also have this issue, though we are using systemctl to restart the node application.

Sep 28 19:37:53 raspberry systemd[1]: Started Frank.
Sep 28 19:37:54 raspberry start.sh[3653]: Added RotateFileTransport
Sep 28 19:37:54 raspberry start.sh[3653]: Error: EACCES: permission denied, open 'logs/frank-2023-09-28-19.log'
Sep 28 19:37:55 raspberry systemd[1]: frank.service: Main process exited, code=exited, status=1/FAILURE
Sep 28 19:37:55 raspberry systemd[1]: frank.service: Failed with result 'exit-code'.
Sep 28 19:37:55 raspberry systemd[1]: frank.service: Service RestartSec=100ms expired, scheduling restart.
Sep 28 19:37:55 raspberry systemd[1]: frank.service: Scheduled restart job, restart counter is at 5.
Sep 28 19:37:55 raspberry systemd[1]: Stopped Frank.
Sep 28 19:37:55 raspberry systemd[1]: frank.service: Start request repeated too quickly.
Sep 28 19:37:55 raspberry systemd[1]: frank.service: Failed with result 'exit-code'.
Sep 28 19:37:55 raspberry systemd[1]: Failed to start Frank.
Sep 28 19:38:11 raspberry systemd[1]: Started Frank.
-rw-r--r-- 1 pi   pi    30K Sep 28 16:00 frank-2023-09-28-15.log.gz
-rw-r--r-- 1 pi   pi    17K Sep 28 17:00 frank-2023-09-28-16.log.gz
-rw-r--r-- 1 pi   pi    39K Sep 28 18:00 frank-2023-09-28-17.log.gz
-rw-r--r-- 1 root root  17K Sep 28 19:00 frank-2023-09-28-18.log.gz
-rw-r--r-- 1 root root 179K Sep 28 19:34 frank-2023-09-28-19.log
-rw-r--r-- 1 pi   pi   7.2K Sep 29 00:00 frank-2023-09-28.log.gz
-rw-r--r-- 1 pi   pi   1.9M Sep 29 09:52 frank-2023-09-29.log

We also have a python script doing some stuff

import { spawn } from 'child_process';
[..]

  pythonPrint = (type) => {
    logger.debug(PRINTER.PYTHON_PRINT, { type });

    const spawnProcess = spawn('python3', ['./print.py', type]);

    spawnProcess.stdout.on('data', (msg) => {
      logger.debug(PRINTER.PYTHON_PRINT, { type, msg: msg.toString() });
    });
    spawnProcess.stderr.on('data', (msg) => {
      logger.error(PRINTER.PYTHON_PRINT, { type, msg: msg.toString() });
    });
    spawnProcess.on('error', (msg) => {
      logger.error(PRINTER.PYTHON_PRINT, { type, msg: msg.toString() });
    });
  };
domingguss commented 11 months ago

workaround is set all log files (in folder ./logs) to current user (pi), so

sudo chown pi:pi ./logs -R