soft-matter / trackpy

Python particle tracking toolkit
http://soft-matter.github.io/trackpy

Parallel processing of frames with batch stalls on Windows #610

Closed Jdogzz closed 1 year ago

Jdogzz commented 4 years ago

I've been having an issue with the parallel-processing feature of the batch command on Windows 10. My script is as follows (extremely simple):

import trackpy as tp
import pims
frames = pims.open('D:/images/*.png')
f = tp.batch(frames, 41, processes='auto',engine='numba')

This results in a series of error messages that look like this:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\testrun.py", line 4, in <module>
    f = tp.batch(frames, 41, processes='auto',engine='numba')
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\site-packages\trackpy\feature.py", line 552, in batch
    pool, map_func = _get_pool(processes)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\site-packages\trackpy\utils.py", line 488, in _get_pool
    pool = Pool(processes=processes)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\pool.py", line 176, in __init__
    self._repopulate_pool()
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    w.start()
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\myusername\Anaconda3\envs\testtrackpy\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

By contrast, when running the same script on a Mac, with only the path altered to something like '/Users/myuser/images/*.png', the frames are processed in parallel as expected.

On each operating system, I tried this in a fresh conda environment (Python 3.7), with git, pip, pims, and imageio installed through conda (along with their dependencies), and with the current GitHub copy of trackpy installed through pip. I'm not seeing any cautionary notes about Windows-specific parallel processing in the trackpy documentation, but if there are, please point me to the relevant page.

I'm attaching my test dataset in case it is relevant: a series of PNG images of a white circle on a black background. images.tar.gz

nkeim commented 4 years ago

Thanks for this helpful report. What version of trackpy are you using? And are you running it from the command line?

In the development version on GitHub, parallel batch is tested on Windows, so this bug may be a little complicated.

Jdogzz commented 4 years ago

Both systems report trackpy version 0.4.2+29.g1f720ff. In each case I run the script above from the command line. (On Windows, I previously tried running it in Spyder, but no errors are printed to the terminal; it just hangs. From a command prompt launched through Anaconda Navigator, I can see the error above printed repeatedly.)

nkeim commented 4 years ago

Thanks. From the documentation, it looks like freeze_support() is only needed when your code is set up to run as a standalone .exe file (i.e. not within python.exe as usual). So it's unclear why you would get that error if you're just running from the command line.

Nonetheless, would you mind trying to add

if __name__ == '__main__':
    freeze_support()

to the top level of your script, right before the code that invokes batch?

Jdogzz commented 4 years ago

Simply copy-pasting that in before the batch line (with the appropriate import) didn't work, but rewriting the script with a main function led to the expected behavior, both in Spyder and directly from the command line:

import trackpy as tp
import pims
from multiprocessing import freeze_support

def main():
    frames = pims.open('D:/images/*.png')
    f = tp.batch(frames, 41, processes='auto', engine='numba')

if __name__ == '__main__':
    freeze_support()
    main()

As you mentioned, it is quite strange that the interpreter needed this, since the documentation says it should not be necessary, and it is of course more complex than the four lines needed on the Mac.

nkeim commented 4 years ago

Great! Thanks for providing a working example. I'd like to keep this issue open until we can add a mention of this to the batch docstring. (Or feel free to contribute text!)

rbnvrw commented 4 years ago

It is very strange indeed. Especially since we get no errors while testing on Travis. From the documentation (https://docs.python.org/3/library/multiprocessing.html#multiprocessing.freeze_support) it seems like it should not be necessary:

Calling freeze_support() has no effect when invoked on any operating system other than Windows. In addition, if the module is being run normally by the Python interpreter on Windows (the program has not been frozen), then freeze_support() has no effect.

According to this question (https://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing) the problem is this:

On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.

So I think freeze_support() is not needed, but the guard is. @Jdogzz, could you test your script without freeze_support() to confirm whether this is the issue? Then we can add the proper advice to the docstring. Thanks!
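A minimal sketch of that Windows-safe idiom, using only the standard library (the locate_stub function is a made-up stand-in for per-frame feature finding, not trackpy code):

```python
from multiprocessing import Pool

def locate_stub(frame_number):
    # Stand-in for per-frame feature finding; tp.batch() farms frames
    # out to worker processes in much the same way.
    return frame_number * 2

def main():
    # On Windows, multiprocessing spawns children by re-importing this
    # module, so any code that starts workers must sit behind the
    # __main__ guard, or each child will try to start workers of its own.
    with Pool(processes=2) as pool:
        return pool.map(locate_stub, range(10))

if __name__ == '__main__':
    print(main())
```

Module-level code outside the guard (imports, function definitions) runs in every child, which is harmless; only the worker-spawning call needs to be guarded.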

Jdogzz commented 4 years ago

I can confirm that the script still runs after removing only the freeze_support line (tried on the Windows machine, both from the command line and within Spyder).

nkeim commented 4 years ago

That makes sense and I imagine many people run trackpy from a simple script. So now I'm worried about us having merged #606 to use multiprocessing by default — on Windows it could break a lot of existing code.

One way to resolve this is for trackpy to catch the RuntimeError when it invokes multiprocessing and, if the platform is Windows, fall back to a single process and issue a warning. That seems like a sensible compromise.

nkeim commented 4 years ago

Also, thanks @rbnvrw for the detective work! I'm surprised that detail is not in the standard library docs.

b-grimaud commented 3 years ago

Is it possible to disable multiprocessing in such a case? I'm encountering the same error, and unfortunately the if __name__ == '__main__' guard does not seem to help. I call tp.batch() inside a function, and placing the guard either above the line that invokes the function or within the function itself just skips ahead and does not execute anything.

EDIT: I encountered this problem with trackpy 0.5.0; I found a workaround by reverting to 0.4.2.

nkeim commented 3 years ago

Have you tried the processes=1 argument?

Any more information you can give, especially about your Python, Windows, and trackpy versions, would be helpful.

b-grimaud commented 3 years ago

Have you tried the processes=1 argument?

This does solve the problem; I was looking at the 0.4.2 documentation and did not notice that argument. Thank you very much! For reference, I am using Windows 10, Python 3.7.4, and (now) trackpy 0.5.0.

b-grimaud commented 3 years ago

Small update: placing an if __name__ == '__main__' guard at the very top of my script does work with processes='auto' on trackpy 0.5.0 and Windows 10. However, doing so prints a dozen 0.0 right as tp.batch is called, even though I have a tp.quiet([True]) statement right above. I have tried a few different .tif files, and so far I consistently get those twelve 0.0 prints whenever I use processes='auto'.

nkeim commented 3 years ago

Interesting! Do you have an Intel CPU with 6 cores?

b-grimaud commented 3 years ago

I do ! I use an Intel i7-10810U.

nkeim commented 3 years ago

OK. Let's see if we can isolate this to trackpy. Can you comment out batch() in your script and insert something like

from multiprocessing import Pool
with Pool() as pool:
    pool.map(round, list(range(100)))

If starting up the multiprocessing pool gets you the same unwanted output, then we at least know that the feature-finding code isn't responsible. The next step would be a process of elimination: check whether removing import trackpy, or some other module, stops the unwanted output.

b-grimaud commented 3 years ago

I ran that code a few times, and I get anywhere between one and three 0.0s, seemingly at random.

nkeim commented 3 years ago

🤪 That was not expected. Maybe I don't understand a detail of Pool (which is what batch() calls for multiprocessing).

In any case, I'm betting that one of your import statements — maybe trackpy but probably something else — is the cause.

b-grimaud commented 3 years ago

Could homemade modules be at fault here? I don't import much otherwise, at least in that specific part of the script: only trackpy, numpy, os, and imageio.

nkeim commented 3 years ago

That seems likely. It could still be an unknown problem with trackpy but we haven't had any other reports.

b-grimaud commented 3 years ago

I've tried running the script with only os, imageio and trackpy itself, and I still get the same output.

nkeim commented 3 years ago

Awesome!

Since you were able to see (a version of) the problem without calling trackpy.batch() at all, does that mean you can reproduce it with no import imageio and no data loaded? It would be great to get a minimal example that reproduces the behavior on other computers.

b-grimaud commented 3 years ago

I did call tp.batch while running the script with minimal imports. I also tried calling tp.batch with no data and no import imageio, and it failed, as expected, because there was no data to process.

I then investigated further: I loaded a file with imageio, dumped it into a numpy save file (.npy), and ran tp.batch from a script with import numpy instead of import imageio, and that did the trick! I no longer get any 0.0, just the regular trackpy.feature.batch output. It seems imageio is the issue, then; I might finally take the time to switch to pims.
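A minimal sketch of that workaround (the file name and array shape are made up for illustration; the imageio step is shown only as comments so the snippet itself needs nothing beyond numpy):

```python
import numpy as np

# One-off conversion script (the only place imageio is imported):
#     import imageio
#     stack = np.asarray(imageio.mimread('movie.tif'))
#     np.save('movie.npy', stack)
# Faked here with a synthetic 12-frame stack so the example is
# self-contained.
stack = np.zeros((12, 64, 64), dtype=np.uint8)
np.save('movie.npy', stack)

# Analysis script: load frames with numpy only, so the modules that
# Windows worker processes re-import no longer include imageio.
frames = np.load('movie.npy')
# f = tp.batch(frames, 41, processes='auto')  # as before
```

Splitting conversion and analysis into two scripts keeps the problematic import entirely out of the module that the spawned workers re-execute.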

nkeim commented 3 years ago

That's great! I'm afraid imageio is an optional dependency of pims, so you might still have to remove it or (ideally) switch to a different version.