torproject / stem

Python controller library for Tor
https://stem.torproject.org/
GNU Lesser General Public License v3.0
279 stars 76 forks source link

Fix multiprocessing with `spawn` & top level imports #152

Closed bryango closed 9 months ago

bryango commented 9 months ago

This fixes test_fibonacci_multiprocessing on MacOS (darwin), which was failing with:

_pickle.PicklingError: Can't pickle <function rebuild_connection at ...>: import of module 'multiprocessing.connection' failed

This is due to the fact that multiprocessing on darwin defaults with the start method of spawn, instead of fork in other POSIX environments.

See: https://docs.python.org/3/library/multiprocessing.html

atagar commented 9 months ago

Thanks. This looks great to me! Couple questions...

+ORIGINAL_PATH = list(sys.path)
+sys.path.append(EXAMPLE_DIR)

We previously toggled sys.path in the setUp and tearDown methods so global state only changed while our tests run.

Unless I'm missing something we should do sys.path = ORIGINAL_PATH after the following import, and revert setUp to not reference ORIGINAL_PATH.

+from multiprocessing import connection  # noqa: E402, F401
+if stem.util.system.is_windows():
+  from multiprocessing import popen_spawn_win32  # noqa: F401
+else:
+  from multiprocessing import popen_spawn_posix  # noqa: F401

What is the purpose of this part? I'm not spotting the reason for these popen imports.

Thanks!

bryango commented 9 months ago

We previously toggled sys.path in the setUp and tearDown methods so global state only changed while our tests run.

Unless I'm missing something we should do sys.path = ORIGINAL_PATH after the following import, and revert setUp to not reference ORIGINAL_PATH.

Yes, you are right! Sorry I missed this... Force-pushed and fixed.

+from multiprocessing import connection  # noqa: E402, F401
+if stem.util.system.is_windows():
+  from multiprocessing import popen_spawn_win32  # noqa: F401
+else:
+  from multiprocessing import popen_spawn_posix  # noqa: F401

This is to prevent PicklingErrors of import of module 'multiprocessing.{connection,popen_spawn_win32,popen_spawn_posix}' failed. I am not sure if this is the right way to do it though.

Note that if docs/_static/example/fibonacci_multiprocessing.py is directly called with python, these imports are not necessary. The issue seems to be that we are calling it from the test module, with fibonacci_multiprocessing.main(), such that the spawned process fails to locate the required modules multiprocessing.{connection,popen_spawn_win32,popen_spawn_posix} automatically.

bryango commented 9 months ago

Note that if docs/_static/example/fibonacci_multiprocessing.py is directly called with python, these imports are not necessary.

I have pushed a new commit which directly calls fibonacci_multiprocessing.py with subprocess.run. This should be a much cleaner fix.

atagar commented 9 months ago

Looks great to me! Thanks, pushed.