python / cpython

The Python programming language
https://www.python.org
Other
63.22k stars 30.27k forks source link

concurrent.futures.ProcessPoolExecutor does not work in venv on Windows #79978

Closed f7a29477-11b6-4add-a132-918438c57010 closed 5 years ago

f7a29477-11b6-4add-a132-918438c57010 commented 5 years ago
BPO 35797
Nosy @pfmoore, @pitrou, @tjguk, @ned-deily, @zware, @eryksun, @zooba, @chrullrich, @applio, @miss-islington
PRs
  • python/cpython#11676
  • python/cpython#11676
  • python/cpython#11676
  • python/cpython#11679
  • python/cpython#11679
  • python/cpython#11679
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/zooba' closed_at = created_at = labels = ['type-bug', '3.8', 'release-blocker', '3.7', 'library', 'OS-windows'] title = 'concurrent.futures.ProcessPoolExecutor does not work in venv on Windows' updated_at = user = 'https://github.com/chrullrich' ``` bugs.python.org fields: ```python activity = actor = 'steve.dower' assignee = 'steve.dower' closed = True closed_date = closer = 'steve.dower' components = ['Library (Lib)', 'Windows'] creation = creator = 'chrullrich' dependencies = [] files = [] hgrepos = [] issue_num = 35797 keywords = ['patch', 'patch', 'patch', '3.7regression'] message_count = 17.0 messages = ['334142', '334151', '334154', '334159', '334162', '334163', '334164', '334167', '334170', '334171', '334172', '334187', '334240', '334244', '334373', '334375', '334377'] nosy_count = 10.0 nosy_names = ['paul.moore', 'pitrou', 'tim.golden', 'ned.deily', 'zach.ware', 'eryksun', 'steve.dower', 'chrullrich', 'davin', 'miss-islington'] pr_nums = ['11676', '11676', '11676', '11679', '11679', '11679'] priority = 'release blocker' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue35797' versions = ['Python 3.7', 'Python 3.8'] ```

    f7a29477-11b6-4add-a132-918438c57010 commented 5 years ago

    Using concurrent.futures.ProcessPoolExecutor on Windows fails immediately with a lot of exceptions of the "access denied", "file not found", and "invalid handle" varieties. Running the script that creates the ProcessPoolExecutor from the main system-wide installation works correctly.

    Due to Windows' infamous lack of fork(), ProcessPoolExecutor launches its worker processes by setting up an inheritable handle to a pipe and passing the handle on the command line.

    In a venv situation, it appears that the venv's python.exe internally launches the parent environment's python.exe and passes on its command line, but not its handle table. This sub-subprocess therefore does not have the original handle, and may have a different handle at the same index.

    Output of the ProcessPoolExecutor example program from the docs when run with the main installation:

    C:\Daten>py cft.py 112272535095293 is prime: True 112582705942171 is prime: True 112272535095293 is prime: True 115280095190773 is prime: True 115797848077099 is prime: True 1099726899285419 is prime: False

    Output when run using a venv:

    C:\Daten>pyv\v37\Scripts\python.exe cft.py
    Process SpawnProcess-4:
    Traceback (most recent call last):
      File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
        self.run()
      File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 99, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Program Files\Python37\lib\concurrent\futures\process.py", line 226, in _process_worker
        call_item = call_queue.get(block=True)
      File "C:\Program Files\Python37\lib\multiprocessing\queues.py", line 93, in get
        with self._rlock:
      File "C:\Program Files\Python37\lib\multiprocessing\synchronize.py", line 95, in __enter__
        return self._semlock.__enter__()
    PermissionError: [WinError 5] Access is denied
    Process SpawnProcess-5:
    Traceback (most recent call last):
      File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
        self.run()
      File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 99, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Program Files\Python37\lib\concurrent\futures\process.py", line 226, in _process_worker
        call_item = call_queue.get(block=True)
      File "C:\Program Files\Python37\lib\multiprocessing\queues.py", line 93, in get
        with self._rlock:
      File "C:\Program Files\Python37\lib\multiprocessing\synchronize.py", line 95, in __enter__
        return self._semlock.__enter__()
    PermissionError: [WinError 5] Access is denied
    Traceback (most recent call last):
      File "cft.py", line 28, in <module>
        main()
      File "cft.py", line 24, in main
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
      File "C:\Program Files\Python37\lib\concurrent\futures\process.py", line 476, in _chain_from_iterable_of_lists
        for element in iterable:
      File "C:\Program Files\Python37\lib\concurrent\futures\_base.py", line 586, in result_iterator
        yield fs.pop().result()
      File "C:\Program Files\Python37\lib\concurrent\futures\_base.py", line 432, in result
        return self.__get_result()
      File "C:\Program Files\Python37\lib\concurrent\futures\_base.py", line 384, in __get_result
        raise self._exception
    concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    zooba commented 5 years ago

    Good catch! I'm surprised we don't have any tests for this, but I guess we don't really create any virtual environments in our test suite. A shame nobody hit it during RC.

    I don't actually know the best fix for this. The venv python.exe script is the same as the py.exe launcher, which means neither is passing on inherited handles. Any ideas?

    pitrou commented 5 years ago

    Hmm... Out of curiosity, why is the venv's python.exe not simply a symlink to the original python.exe? What is the point of going through py.exe here?

    Also, is this a regression? Otherwise I don't think it has to be a release blocker.

    zooba commented 5 years ago

    You can't create symlinks on Windows without additional user permissions, and the old method of copying most of the binaries was slow and made DLL hell worse, as well as simply not working with the Windows Store model that does not let random executables load DLLs from within the app.

    Because of the last one, I backported the fix (which is an independent executable that launches the correct one with additional context to know it's a venv) that was mostly there for 3.8 already. This also fixes the issue where upgrading Python would force you to recreate every virtual environment to avoid mismatched binary versions (which is also why it wasn't an incompatible change - the update was already incompatible).

    So all in all, an overall positive change. It's unfortunate that this problem slipped through testing, and I assume it's going to exist for many cases using handle inheritance. When I get a chance I'll be looking up ways to simply flag the new Python process it creates as "inherit everything", but there may be other data structures that don't transfer automatically (such as file descriptors, which the OS knows nothing about and can't do automatically).

    eryksun commented 5 years ago

    The launchers have inheritance enabled (for the duplicated standard handles, which should be made inheritable via SetHandleInformation instead, but that's a separate issue). However, multiprocessing doesn't use inheritable handles. Spawning processes with inheritance would be a problem in a general-purpose library. Instead it depends on either the parent manually duplicating the handle to the child or vice versa (via the child opening the parent PID that's passed to it). The problem for a virtual environment that's using the launcher is the former, where the parent duplicates to the child (e.g. for the SemLock semaphore). We need a way to allow multiprocessing to spawn the real python.exe instead of the launcher executable that's set as sys.executable.

    zooba commented 5 years ago

    We need a way to allow multiprocessing to spawn the real python.exe instead of the launcher executable that's set as sys.executable.

    Got to a computer and had just reached the same conclusion.

    Given the environment is inherited, it's easy to do:

    >> import _winapi >> import multiprocessing.spawn >> multiprocessing.spawn.set_executable(_winapi.GetModuleFileName(0))

    This may interfere with others who currently override sys.executable when hosting Python, since they'll get their override by default. But changing sys.executable to _not be the venv one will break anyone who doesn't correctly inherit environment variables (specifically, \_PYVENV_LAUNCHER__, but this is deliberately not documented ;) ).

    Perhaps we just change the multiprocessing default on Windows when sys.base_prefix != sys.prefix?

    f7a29477-11b6-4add-a132-918438c57010 commented 5 years ago

    I have two ideas, but no idea if either is remotely feasible:

    1. Replicate the launcher's selection logic in multiprocessing and avoid the intermediate step via sys.executable. Perhaps put it into pythonXX.dll or export it from the launcher executable so the C implementation can be shared by both users.

    2. Handle the command line argument(s) that control how multiprocessing pulls the handle from the parent process in the launcher and perform the operation so the sub-subprocess can in turn get it from the launcher.

    I had a third one, but it was so crazy I must have preemptively forgotten it.

    Finally, I do not think this is a regression in 3.7 (although I have not tested it with anything earlier), but unless the implementation of multiprocessing has changed radically in 3.7 I cannot see how it could have (not) worked any differently before.

    On the other hand, venv is essentially the "official" way of installing Python applications these days, so the issue should have a high priority for fixing anyway.

    zooba commented 5 years ago

    The first idea makes sense, but because of how we've already architected things (and the direction we're trying to rearchitect things) it isn't really that feasible.

    The second idea could be good. It isn't that hard to make globally named handles that can be reopened in the child process, and that avoids the need for a coherent inheritance chain of processes. Maybe it could break other scenarios that do rely on inheritance though? (But aren't those already broken? All this is *just* outside the edge of my experience, so I'd have to try them all out to be sure.)

    It's a regression in 3.7.2, which is when the venv script changed. As I said, updating from 3.7.1 to 3.7.2 was going to change the venv "script" anyway (which was just the main Python executable and its binaries), so it should have been no more breaking than that. But it was, so I consider it a regression (in venv, to be clear, not in multiprocessing).

    pitrou commented 5 years ago

    Hmm, if multiprocessing exhibits the problem, I wouldn't be surprised if some third-party libraries also do. I think sys.executable should really point to the proper executable.

    But changing sys.executable to _not be the venv one will break anyone who doesn't correctly inherit environment variables (specifically, \_PYVENV_LAUNCHER__, but this is deliberately not documented ;) ).

    Well, it will also break if other environment variables are required, e.g. PYTHONPATH. Usually you want to inherit environment variables (including such fundamental stuff such as HOME), because you don't know which ones are actually required for a given workload.

    So I think "breaking if environment variables are not inherited" is a less severe failure mode than this issue is.

    zooba commented 5 years ago

    So I think "breaking if environment variables are not inherited" is a less severe failure mode than this issue is.

    ISTR that having sys.executable point to a path outside of sys.prefix breaks the site module in some way, so that would move the fix over there. But again, going from memory - I'd have to try it (and doubtful that I'm getting to it today).

    eryksun commented 5 years ago

    multiprocessing could be redesigned to make the child process responsible for duplicating all handles from its logical parent (by PID passed on the command line -- regardless of whatever its real parent is). That avoids the problem of the parent mistakenly duplicating handles to a launcher child process that has no idea that the parent has done this and certainly doesn't have the architecture in place to deal with it.

    That said, inserting the launcher process as a middle man for every worker process is wasteful. Creating processes without fork is expensive. Also, relying on the "__PYVENV_LAUNCHER__" environment variable for this is problematic. Environment variables are inherited by default, so it leaks to a completely unrelated python.exe process. For example, if we run subprocess.call(r'C:\Program Files\Python37\python.exe') from a virtual environment, currently we end up with an overridden sys.executable. I would prefer a -X command-line option, such as "VIRTUAL_ENV". multiprocessing could propagate this option to worker processes.

    zooba commented 5 years ago

    I would prefer a -X command-line option, such as "VIRTUAL_ENV". multiprocessing could propagate this option to worker processes.

    Right, so would I, but it needs to be propagated basically everywhere (unless "everywhere" all defers through multiprocessing, which would be great). I chose to go with the same solution as used on macOS for the same problem, though it had to be implemented in a different part of the code. Plenty of other environment variables will cause problems if they propagate to a totally different python.exe, so perhaps we should document __PYVENV_LAUNCHER__ as one to clear in this case (along with PYTHONHOME and PYTHONPATH)?

    For future enhancement, there's also the stalled PEP-582 that would "fix" venv's reliance on shell/environment variables by having a different way to enable a separate package directory (FWIW, it stalled because the PEP doesn't fix *all* packaging problems, but only some).

    But as it stands, people rely on there being something they can launch in their venv directory, and that thing has to know (at least for a given terminal session) how to propagate the correct search paths to child processes. The current solution is the simplest one that ensures the most compatibility for the least amount of risk, even though that was not zero risk, as we've seen. And it also has to remain for 3.7 now, since venvs are no longer automatically broken when updating to 3.7.3.

    Changing sys.executable to the final executable and trusting/relying/hoping that everyone who invokes it also allows environment variables to propagate is probably the best solution (including whatever site.py fixes are required, if any).

    eryksun commented 5 years ago

    The current solution is the simplest one that ensures the most compatibility for the least amount of risk, even though that was not zero risk, as we've seen.

    How about making the py[w].exe launcher unset __PYVENV_LAUNCHER__ whenever it runs a registered version explicitly?

    We could also modify the embedded-script (entry-point) launchers, which would be simpler if we had them in core instead of distlib. They could be updated to look for pyvenv.cfg. If found, it would override the shebang and set the __PYVENV_LAUNCHER environment variable. This would eliminate the interjected process in a 3.7.2 virtual environment, e.g. pip.exe -> python.exe (venv launcher) -> python.exe (installed) would become pip.exe -> python.exe (installed). More importantly, it would unset __PYVENV_LAUNCHER in case it's not a virtual environment script.

    This way running pip.exe for an installed Python won't end up installing into a virtual environment, as can happen now in 3.7.2. For example:

        >>> print(os.environ['__PYVENV_LAUNCHER__'])
        C:\Temp\env37\Scripts\python.exe
    
        >>> import requests
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ModuleNotFoundError: No module named 'requests'
    >>> pip = 'Scripts\\pip.exe'
    >>> out = subprocess.check_output('{} install requests'.format(pip))
        >>> import requests
        >>> requests.__version__
        '2.21.0'

    Note that "Scripts\\pip.exe" in a CreateProcess command line gets resolved first relative to the application directory, before trying the current directory, system directories, and PATH directories.

    zooba commented 5 years ago

    How about making the py[w].exe launcher unset __PYVENV_LAUNCHER__ whenever it runs a registered version explicitly?

    I think this is a good idea, though it can be unset unconditionally. If we're in a virtual environment, then it will run the intermediate process which will set the variable again (this is unavoidable unless we include specialized knowledge in the launcher to find the base python.exe from the pyvenv.cfg, which I'd rather not duplicate in so many places, when you consider that the script launcher will need it too).

    As for updating the script launchers, on one hand I'd love to have them, but on the other it's more for us to maintain and there's definitely been benefit from it being backported for us. As long as they run the correct python.exe, I think it's fine.

    I think that CreateProcess oddity where it searches the current process's directory first is going to catch people out regardless (I'm surprised it hasn't come up before), but I guess most people play it safe and use full paths.

    zooba commented 5 years ago

    New changeset 4e02f8f8b4baab63f927cfd87b401200ba2969e9 by Steve Dower in branch 'master': bpo-35797: Fix default executable used by the multiprocessing module (GH-11676) https://github.com/python/cpython/commit/4e02f8f8b4baab63f927cfd87b401200ba2969e9

    zooba commented 5 years ago

    I realised while doing the fix that changing sys.executable to point to the "real" python.exe would break scenarios that involve generating scripts. All of those have been relying on sys.executable launching the venv, which would break if we changed it there.

    Instead, we now just load the direct executable name in multiprocessing as I posted earlier when we detect that __PYVENV_LAUNCHER__ was set on Windows. Having a "sys.base_executable" property might have been valuable here, but as we don't need it for all platforms I just used the _winapi module.

    miss-islington commented 5 years ago

    New changeset 6a9c0fca3f2f93681468b51929472f4433753f25 by Miss Islington (bot) in branch '3.7': bpo-35797: Fix default executable used by the multiprocessing module (GH-11676) https://github.com/python/cpython/commit/6a9c0fca3f2f93681468b51929472f4433753f25