Delgan / loguru

Python logging made (stupidly) simple
MIT License
19.93k stars 698 forks source link

The problem of log rotation in a multiprocess environment. #1216

Open kivdev opened 1 month ago

kivdev commented 1 month ago

I encountered an issue with log rotation when using Uvicorn and Taskiq. I decided to investigate the problem and discovered incorrect logic in the library (in my opinion).

OS: Linux (Debian) Loguru version: >=0.7.2

Problem: Each worker shifts the "user.loguru_crtime" file attribute to its own, which affects previously running workers since their rotation time becomes earlier than that of the last launched worker. As a result, each worker performs its own rotation, which is incorrect behavior.

How I solved the problem:

  1. Removed all declarations of the "user.loguru_crtime" parameter.
  2. Set the "user.loguru_crtime" parameter only when a new file is created.
  3. Checked the rotation threshold to ensure it is up-to-date.

This solution helped synchronize the workers' log rotation, which resolved the issue.

P.s. I used the "enqueue" and "watch" parameters, but they didn't solve the problem until I made my fix.

Diff:

diff --git a/loguru/_file_sink.py b/loguru/_file_sink.py
index 4b09b06..303d48d 100644
--- a/loguru/_file_sink.py
+++ b/loguru/_file_sink.py
@@ -114,13 +114,13 @@ class Rotation:
         def __call__(self, message, file):
             record_time = message.record["time"]

+            filepath = os.path.realpath(file.name)
+            creation_time = get_ctime(filepath)
+            start_time = datetime.datetime.fromtimestamp(
+                creation_time, tz=record_time.tzinfo
+            )
+
             if self._limit is None:
-                filepath = os.path.realpath(file.name)
-                creation_time = get_ctime(filepath)
-                set_ctime(filepath, creation_time)
-                start_time = datetime.datetime.fromtimestamp(
-                    creation_time, tz=datetime.timezone.utc
-                )

                 time_init = self._time_init

@@ -147,6 +147,10 @@ class Rotation:
             if self._limit.tzinfo is None:
                 record_time = record_time.replace(tzinfo=None)

+            start_time = start_time.replace(tzinfo=None)
+            if self._limit <= start_time:
+                self._limit = self._step_forward(self._limit)
+
             if record_time >= self._limit:
                 while self._limit <= record_time:
                     self._limit = self._step_forward(self._limit)
@@ -223,7 +227,16 @@ class FileSink:
         os.makedirs(dirname, exist_ok=True)

     def _create_file(self, path):
+        set_time = False
+
+        if not os.path.exists(path):
+            set_time = True
+
         self._file = open(path, **self._kwargs)
+
+        if set_time:
+            set_ctime(path, datetime.datetime.now().timestamp())
+
         self._file_path = path

         if self._watch:
@@ -291,7 +304,6 @@ class FileSink:

         if is_rotating:
             self._create_file(new_path)
-            set_ctime(new_path, datetime.datetime.now().timestamp())

     @staticmethod
     def _make_glob_patterns(path):
Delgan commented 1 month ago

Hi @kivdev, thanks for investigating the issue and proposing a fix.

The underlying problem, as you understood, is that each worker runs its own instance of the log file handler. Assuming they're all configured to write log to the same file, e.g. if logger.add("file.log") appears at the root level of the program, then it will inevitably cause conflicts.

Modifying how the file creation-time is read and written is unfortunately not enough. It may make the problem less visible, but there's always a chance that something will go wrong.

Uvicorn workers must be synchronized. They must share the same handler with enqueue=True. However, when child workers are started by Uvicorn, they do not inherit the parent handler. Instead, they create a new one that happens to target an already in-use log file.

See this guide about how it should be done correctly: Compatibility with multiprocessing using enqueue argument.

Unfortunately, as far as I know, Uvicorn does not provide any API that is compatible with the idiom of "inherited" objects. We can't pass the logger from the main process to the child ones.

I'm planning to implement some kind of helper (that will use socket communication between the processes instead of inheritance). However in the meantime, the best thing is to use separate file for separate workers.