python / cpython

The Python programming language
https://www.python.org

InterpreterPoolExecutor workers do not inherit modifications made to sys.path before starting. #126714

Open TkTech opened 3 days ago

TkTech commented 3 days ago

Bug report

Bug description:

All existing executors reflect modifications to sys.path in the child thread or process that gets started. The new InterpreterPoolExecutor does not copy this behavior, which leads to unexpected import errors, for example when running sub-interpreters inside pytest: the test directory is added to sys.path in the parent interpreter but is lost in the sub-interpreter.
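A minimal probe for the behavior described above might look like the following. It is a sketch, not a confirmed reproduction: the helper names `marker_on_path` and `worker_sees_marker` and the marker directory are made up for illustration, and it is exercised here only with ThreadPoolExecutor so that it also runs on Python versions without InterpreterPoolExecutor.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def marker_on_path(marker: str) -> bool:
    """Runs inside a worker: report whether `marker` is on the worker's sys.path."""
    import sys  # imported locally so the function does not depend on module globals
    return marker in sys.path


def worker_sees_marker(executor_cls, marker: str) -> bool:
    """Add `marker` to the parent's sys.path, then ask one worker whether it sees it."""
    sys.path.append(marker)
    try:
        with executor_cls(max_workers=1) as pool:
            return pool.submit(marker_on_path, marker).result()
    finally:
        # Leave the parent's sys.path as we found it.
        sys.path.remove(marker)


# "/hypothetical/tests/dir" is a placeholder path, not a real directory.
print(worker_sees_marker(ThreadPoolExecutor, "/hypothetical/tests/dir"))
```

On Python 3.14, passing `InterpreterPoolExecutor` as `executor_cls` would, if this report is accurate, be expected to return False, because the marker is added only to the parent interpreter's sys.path.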

This is easily worked around by using an initializer:

import sys
from concurrent.futures import InterpreterPoolExecutor


def on_initialize_worker(parent_sys_path: list[str]):
    """
    This function is called in each worker before it begins running jobs.
    It can be used to perform any necessary setup, such as loading NLTK
    datasets or calling ``django.setup()``.

    By default, it replaces the worker's ``sys.path`` with the parent's.
    """
    # Unlike all other executors, the InterpreterPoolExecutor does not
    # automatically inherit the parent interpreter's sys.path. This is a
    # workaround to ensure that the worker has the same sys.path as the
    # parent, or tests will fail.
    sys.path = parent_sys_path


# Created after the initializer is defined so the name exists at this point.
# `queue` is an object from the surrounding application code (not shown).
pool = InterpreterPoolExecutor(
    max_workers=queue.concurrency,
    initializer=on_initialize_worker,
    initargs=(sys.path,),
)

This behavior makes sense for sub-interpreters themselves, but probably not for the InterpreterPoolExecutor.

cc. @ericsnowcurrently

CPython versions tested on:

3.13, 3.14

Operating systems tested on:

Linux

ZeroIntensity commented 3 days ago

I'm guessing modifications to sys.path just aren't copied to the next interpreter. (I'm not sure we copy anything in the sysdict over to a subinterpreter at all.)
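One way to check this guess would be to compare the parent's sys.path with what a worker reports. The sketch below assumes nothing about the interpreter internals; `get_worker_sys_path` and `sys_path_diff` are hypothetical helper names, and the example call uses ThreadPoolExecutor only so that it runs on any recent Python.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def get_worker_sys_path() -> list:
    """Runs inside a worker: return a copy of the worker's sys.path."""
    import sys  # imported locally so the function does not depend on module globals
    return list(sys.path)


def sys_path_diff(executor_cls):
    """Return (entries only in the parent, entries only in the worker)."""
    parent = list(sys.path)
    with executor_cls(max_workers=1) as pool:
        worker = pool.submit(get_worker_sys_path).result()
    only_in_parent = [p for p in parent if p not in worker]
    only_in_worker = [p for p in worker if p not in parent]
    return only_in_parent, only_in_worker


# Threads share the parent's interpreter, so no difference is expected here.
print(sys_path_diff(ThreadPoolExecutor))
```

If the guess is right, running the same function with `InterpreterPoolExecutor` on Python 3.14 should list any entries added at runtime (such as a pytest test directory) under "only in the parent".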