Delgan / loguru

BrokenProcessPool encountered when passing logging configuration to spawn subprocess on Linux #1160

Closed: kujenga closed this issue 1 week ago

kujenga commented 1 week ago

I am attempting to get loguru working with spawned subprocesses and have been running into a concurrent.futures.process.BrokenProcessPool crash, despite following the advice in issues such as https://github.com/Delgan/loguru/issues/818 and the approach illustrated in this PR: https://github.com/Delgan/loguru/pull/1069/files

This crash only happens on Linux; the issue is not reproducible on macOS.

I am hoping for some guidance on what might be wrong here, and on whether this is something that could be fixed within loguru or something I can change on my end.

This is a program that reproduces the issue:

import multiprocessing
import sys
from concurrent.futures import ProcessPoolExecutor, as_completed

from loguru import logger

def _processpool_initializer(parent_logger):
    # Share the parent process's logging core with this worker, as suggested
    # in the referenced issue and PR.
    logger._core = parent_logger._core  # type: ignore

    print("before failure", flush=True)
    logger.info("BOOM")  # the worker dies here on Linux
    print("after failure", flush=True)

def upper(v):
    return str(v).upper()

def main():
    logger.remove()
    # enqueue=True makes the sink multiprocessing-safe by routing records
    # through an internal queue.
    logger.add(sys.stdout, enqueue=True)

    executor = ProcessPoolExecutor(
        max_workers=1,
        initializer=_processpool_initializer,
        initargs=(logger,),
        mp_context=multiprocessing.get_context("spawn"),
    )

    futures = {}
    for input in ["a", "b", "c"]:
        futures[executor.submit(upper, input)] = input

    for future in as_completed(futures):
        input = futures[future]
        r = future.result()
        print(f"Result: {r}", flush=True)

if __name__ == "__main__":
    main()

This is the output of the above program:

before failure
Traceback (most recent call last):
  File "/workspaces/plus/python/pdf-anonymizer/scripts/tmp/processpool-loguru.py", line 42, in <module>
    main()
  File "/workspaces/plus/python/pdf-anonymizer/scripts/tmp/processpool-loguru.py", line 37, in main
    r = future.result()
        ^^^^^^^^^^^^^^^
  File "/home/vscode/.rye/py/cpython@3.11.3/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.rye/py/cpython@3.11.3/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

I am on version:

loguru==0.7.2

With a bit of debugging into loguru, I was able to determine that the subprocess seems to be crashing on this line: https://github.com/Delgan/loguru/blob/6af1cafaa258c5337cafd249f0f81730cba3e8f0/loguru/_handler.py#L204

Delgan commented 1 week ago

Hi @kujenga.

Thanks for the reproducible example. Curiously, the program crashes for a different reason on my computer; it might be due to a difference in the Python version used (3.12.3 in my case):

RuntimeError: A SemLock created in a fork context is being shared with a process in a spawn context. This is not supported. Please use the same context to create multiprocessing objects and Process.

The above error indicates that there is an incompatibility between the multiprocessing "start method" used internally by Loguru and the one used by the ProcessPoolExecutor. This is expected: by default, Loguru uses the platform's default start method, which is "fork" on Linux, while your ProcessPoolExecutor is configured with the "spawn" one.
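
To make the mismatch concrete, here is a minimal check of the two start methods involved (a sketch; "fork" is the Linux default on the Python versions discussed in this thread):

import multiprocessing

# The platform default, used by Loguru when no explicit context is given:
# "fork" on Linux under Python 3.11/3.12.
print(multiprocessing.get_start_method())

# The context the ProcessPoolExecutor in the example is configured with.
print(multiprocessing.get_context("spawn").get_start_method())  # "spawn"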

You can try to set up your Loguru sink with the desired "spawn" context. It fixed the error on my computer:

def main():
    # Create a single "spawn" context and use it for both the sink
    # and the executor.
    mp_context = multiprocessing.get_context("spawn")
    logger.remove()
    logger.add(sys.stdout, enqueue=True, context=mp_context)

    executor = ProcessPoolExecutor(
        max_workers=1,
        initializer=_processpool_initializer,
        initargs=(logger,),
        mp_context=mp_context,
    )
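
The key change is that a single "spawn" context is created up front and shared: the sink's internal queue and the executor's worker processes are then created from the same start method, so the queue's synchronization primitives can safely cross the process boundary.
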
kujenga commented 1 week ago

Thank you for the quick response!

Got it. I had been in an environment with Python 3.11; what you got is a much more informative error message than the one I had posted. I was able to reproduce the same error message in a python:3.12-slim Docker container running on macOS.

From re-reading https://loguru.readthedocs.io/en/stable/resources/recipes.html#compatibility-with-multiprocessing-using-enqueue-argument today, it does appear that you address this exact issue there and provide a solution.

Given the error message I originally saw, I had been thinking it might be beneficial for loguru to capture the current multiprocessing context/start method and verify that it hasn't changed, so as to provide a friendlier error message than the relatively mysterious concurrent.futures.process.BrokenProcessPool; it's unexpected for a logger to crash a program. But if newer Python versions already produce a better error message, and this is seemingly the fault of the stdlib, perhaps that isn't necessary.
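
For illustration, a hypothetical sketch of that idea (StartMethodGuard and its methods are invented here and are not part of loguru's API):

import multiprocessing

class StartMethodGuard:
    def __init__(self):
        # Capture the start method in effect when the sink is added.
        self.expected = multiprocessing.get_start_method()

    def check(self):
        # Hypothetically called before the sink's queue is shared with a
        # child process: fail fast with a clear message instead of an
        # opaque BrokenProcessPool.
        actual = multiprocessing.get_start_method()
        if actual != self.expected:
            raise RuntimeError(
                f"multiprocessing start method changed from "
                f"{self.expected!r} to {actual!r} after the sink was added"
            )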

It could also be beneficial to link to the above recipe from this section: https://loguru.readthedocs.io/en/stable/overview.html#asynchronous-thread-safe-multiprocess-safe

I'll go ahead and close this out since it seems things are basically "working as documented". Thanks again!

Delgan commented 1 week ago

> Given the error message I originally saw, I had been thinking it might be beneficial for loguru to capture the current multiprocessing context/start method and verify that it hasn't changed

I don't think that's possible anyway. The exception is raised when the ProcessPoolExecutor() is created; at that moment, the logger is not involved, and there is no way for Loguru to know the start method passed as the mp_context argument.