Open ogrisel opened 1 year ago
Actually, I read the error message and indeed my laptop was under heavy memory pressure when I ran this test. Once I shut down a VM that was running concurrently on this laptop to debug a Windows-specific problem, the activity monitor reported that I was no longer under memory pressure and, indeed, I can no longer reproduce the failure.
So everything is fine.
Actually I can still reproduce this failure quite often even when the memory pressure on my system is low...
I noticed that the ordering of the tests run by pytest when running this command is not deterministic. Sometimes, when this test is among the first to run, I do not get the failure. So there is definitely a side effect from one of the other tests in this module that causes the problem.
Also, I can reproduce with tags/3.3.0, so it's not related to one of the recently merged PRs.
Ok I narrowed it down to the dependency between 2 tests only:
pytest -v tests/test_process_executor_loky.py -x -k "TestsProcessPoolLokyShutdown and (test_processes_crash_handling_after_executor_gc or test_shutdown_and_kill_workers)"
When test_shutdown_and_kill_workers is run first, test_processes_crash_handling_after_executor_gc crashes, but not otherwise.
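Since the failure depends on which of the two tests runs first, one way to make it reproducible is to pin the order during collection. Below is a minimal sketch of a temporary conftest.py, assuming the standard pytest_collection_modifyitems hook; RUN_FIRST and order_key are names made up for this illustration and are not part of loky's test suite.

```python
# conftest.py (temporary debugging sketch, not part of the loky test suite)

# Hypothetical constant: the test we want pytest to run before the others.
RUN_FIRST = "test_shutdown_and_kill_workers"


def order_key(name):
    # False sorts before True, so the pinned test ends up at the front.
    return name != RUN_FIRST


def pytest_collection_modifyitems(session, config, items):
    # pytest expects the list to be reordered in place.
    # list.sort is stable, so all other tests keep their relative order.
    items.sort(key=lambda item: order_key(item.name))
```

With this file in place, the same -k selection as above should always run test_shutdown_and_kill_workers first. Inverting the key gives the order that does not crash, for comparison.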
Interestingly enough, I cannot reproduce with Python 3.10 because the ordering of those tests becomes deterministic and always in the order that works...
It seems that the ExecutorManagerThread of the executor of the previous test is still running the kill_workers method when the subsequent process starts.
Looking at the logs, and after adding more logging, it seems that the executor manager thread of the main process in the previous test has completed before the subsequent test starts running. Furthermore, a leftover thread would not have explained why we get a SIGSEGV (segfault) in the subsequent GC test.
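For anyone who wants to double-check the "thread has completed" observation without reading logs, here is a small sketch using only the standard threading module. It assumes the manager thread's name contains "ExecutorManagerThread", which I have not verified against the loky source; live_threads_named is a helper name made up for this example.

```python
import threading


def live_threads_named(fragment):
    """Return the names of live threads whose name contains `fragment`."""
    return [
        t.name
        for t in threading.enumerate()
        if t.is_alive() and fragment in t.name
    ]


# Assumption: the manager thread name contains "ExecutorManagerThread".
# Calling this between the two tests (e.g. from a fixture teardown)
# should print an empty list if the previous executor is fully shut down.
print(live_threads_named("ExecutorManagerThread"))
```

This only inspects threads of the current (main) process, so it says nothing about the state of the worker processes.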
Here is a failure I can trigger regularly on macOS M1 with Python 3.11.0.
Note that while the problem always happens in test_processes_crash_handling_after_executor_gc, this failure does not happen if I run this test in isolation, so it must depend on some side effects of the previous tests run in TestsProcessPoolLokyShutdown.