Open ddelange opened 1 year ago
That sounds like a good idea, feel free to submit a PR and link it back to this issue.

I think this is a pretty low-level change in need of someone who's deep into the source code, especially with regard to start methods:

- `concurrent.futures` has only implemented it for the `spawn` start method
- `multiprocessing.pool` seems to not have this limitation (but is not an executor)
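For reference, the `multiprocessing.pool` behavior mentioned above can be sketched as follows. This is an illustrative snippet, not loky code; the helper names are mine, and the `fork` start method is chosen explicitly so the example stays self-contained on POSIX:

```python
import multiprocessing
import os

def worker_pid(_):
    # Executed inside a pool worker; reports that worker's process id.
    return os.getpid()

def distinct_worker_pids(n_tasks=4):
    # maxtasksperchild=2: each worker process is retired and replaced
    # after completing 2 tasks, so a single-process pool running 4
    # tasks must cycle through at least 2 distinct worker processes.
    # The "fork" start method (POSIX only) avoids re-importing the
    # main module in the children.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=1, maxtasksperchild=2) as pool:
        pids = pool.map(worker_pid, range(n_tasks), chunksize=1)
    return len(set(pids))

if __name__ == "__main__":
    print(distinct_worker_pids())
```

Because `multiprocessing.pool.Pool` recycles workers regardless of start method, it has no `spawn`-only restriction, but it is not a `concurrent.futures`-style executor.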
Hi 👋
Analogous to `concurrent.futures.ProcessPoolExecutor`'s `max_tasks_per_child` (added in CPython 3.11) and `multiprocessing.pool.Pool`'s `maxtasksperchild` (added in CPython 3.2) keyword arguments, it would be great to be able to control after how many completed tasks a loky subprocess is flushed and replaced with a fresh subprocess.

Our dask workers are currently consistently facing:
`loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.`

This is most likely caused by upstream memory leaks in `lxml`: the same loky pool subprocesses run for 5+ hours and gradually hit our 60GiB memory limit. Periodically flushing the workers (`spawn` start method) would most likely fix these errors.

Many thanks!
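As a side note on the stdlib precedent: the CPython 3.11 `max_tasks_per_child` keyword can be exercised as below. This is a sketch, not loky code; the runner helper and the temp-file indirection are mine, used so the `spawn` start method can safely re-import the worker module, and the whole thing is skipped on interpreters older than 3.11:

```python
import os
import subprocess
import sys
import tempfile

# Self-contained script: max_tasks_per_child (CPython 3.11+) retires a
# ProcessPoolExecutor worker after it completes the given number of
# tasks; the feature requires the "spawn" (or "forkserver") start method.
SNIPPET = """\
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor

def worker_pid(_):
    return os.getpid()

if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=1, max_tasks_per_child=2,
                             mp_context=ctx) as ex:
        # 4 tasks, one worker recycled every 2 tasks.
        pids = list(ex.map(worker_pid, range(4)))
    print(len(set(pids)))
"""

def distinct_executor_pids():
    if sys.version_info < (3, 11):
        return None  # max_tasks_per_child is not available
    # Run in a fresh interpreter from a real file so "spawn" can
    # re-import the child's main module.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(SNIPPET)
        path = f.name
    try:
        out = subprocess.run([sys.executable, path],
                             capture_output=True, text=True, check=True)
    finally:
        os.unlink(path)
    return int(out.stdout.strip())

if __name__ == "__main__":
    print(distinct_executor_pids())
```

A loky equivalent would presumably recycle workers the same way, bounding the memory growth of a long-lived pool even when the task code leaks.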