Open jacobtomlinson opened 1 month ago
I found a workaround for this behaviour using a signal to defer the shutdown. This example does not hang.
```diff
+import os
+import signal
 import sys

 from distributed import Client, Scheduler
 from distributed.utils import LoopRunner


 async def main():
     async with Scheduler() as scheduler:
         async with Client(scheduler.address, asynchronous=True) as client:
             await client.shutdown()
     print("Done, exiting")
-    sys.exit()  # Hangs at this line; comment it out and the program exits as expected
+    os.kill(os.getpid(), signal.SIGINT)  # Shut down using a signal instead


+signal.signal(signal.SIGINT, lambda *_: sys.exit())  # Exit gracefully on signal
 loop_runner = LoopRunner(loop=None, asynchronous=False)
 loop_runner.run_sync(main)
```
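The signal trick itself can be exercised without distributed at all. This is a stdlib-only sketch of the same idea (POSIX signal semantics assumed; the handler and exit code here are illustrative, not part of the workaround above):

```python
import os
import signal
import sys

# Install a handler that converts SIGINT into an ordinary SystemExit,
# mirroring the workaround's signal.signal(...) line.
signal.signal(signal.SIGINT, lambda *_: sys.exit(0))

try:
    # Deliver SIGINT to our own process, as os.kill does in the workaround.
    os.kill(os.getpid(), signal.SIGINT)
except SystemExit as exc:
    print(f"clean exit requested, code={exc.code}")
```

Because the handler runs in the main thread, the resulting `SystemExit` unwinds the interpreter normally instead of being raised inside the event-loop thread.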
Describe the issue:

When using the `Scheduler` or `Worker` class to start cluster components, if the program is exited with `sys.exit()` (as is done in `dask-mpi`) the Python process hangs, likely due to a background thread holding the process open.

Minimal Complete Verifiable Example:
I tried to strip things down as far as possible while still reproducing the issue, but I note this doesn't happen when using `asyncio.run(main())` instead of the `LoopRunner` that is commonly used in distributed.

Anything else we need to know?:
Environment:

- Dask version: 2024.4.2
- Python version: 3.11.9