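torch.compile, triggered here https://github.com/crowsonkb/k-diffusion/blob/f4a74f1ec906cb62916f58288ec73ef0330ba446/k_diffusion/models/image_transformer_v1.py#L89-L92, has a very bad side effect of triggering torch.multiprocessing, since it executes on cpu.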
as a result, torch starts one child process per cpu core (on my system that's 32 child python processes).
best case: a clean exit from the parent app using ctrl+c is no longer possible, as KeyboardInterrupt triggers a massive traceback (over 200 lines).
worst case: the child processes do not exit and become defunct. in that case they also do not release gpu resources, so a hard reboot is needed. yes, the torch.multiprocessing docs actually warn that this is a possible scenario.
traceback looks like:
Process ForkProcess-2:
Process ForkProcess-3:
Process ForkProcess-8:
Process ForkProcess-6:
Process ForkProcess-1:
...
KeyboardInterrupt
  File "/usr/lib/python3.11/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
simply setting the K_DIFFUSION_USE_COMPILE=0 env variable disables compile, and the issue is gone.
but the default behavior is more than suspect - i suggest revisiting this.
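for anyone who cannot set the variable in the shell, here is a minimal sketch of the same workaround done from python. it assumes the flag is read when k_diffusion is imported, so the variable has to be set first; that ordering is my assumption, not something confirmed in this thread.

```python
import os

# Assumption: k_diffusion reads this flag at import time, so it must be
# set before the package is imported anywhere in the process.
os.environ["K_DIFFUSION_USE_COMPILE"] = "0"

# Import only after the variable is set. Left commented out so the snippet
# runs even where k_diffusion is not installed.
# import k_diffusion
```

the shell equivalent is to prefix the launch command with K_DIFFUSION_USE_COMPILE=0.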
I have fixed this in my next development branch by deferring the compiles until something actually tries to use the compiled kernel and will backport it soon. :)
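To illustrate the general idea of deferring a compile until first use, here is a rough sketch. It is not the code from the development branch; `lazy_compile` and `compile_fn` are hypothetical names, and `compile_fn` stands in for whatever does the compiling (for example `torch.compile`, which takes a callable and returns a callable).

```python
import functools


def lazy_compile(compile_fn):
    """Decorator factory (hypothetical helper): wrap a function so that
    compile_fn(function) only runs the first time the function is called."""

    def decorator(function):
        compiled = None  # filled in on first call

        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            nonlocal compiled
            if compiled is None:
                # The compile cost (and any side effects, such as worker
                # processes being started) is paid here, not at import time.
                compiled = compile_fn(function)
            return compiled(*args, **kwargs)

        return wrapper

    return decorator


# Stand-in for a real compiler so the sketch runs without torch installed.
def recording_compile(function):
    recording_compile.calls.append(function.__name__)
    return function


recording_compile.calls = []


@lazy_compile(recording_compile)
def scale(x, factor=2):
    return x * factor
```

With this shape, merely importing the module that defines `scale` triggers no compile; the first call to `scale(...)` does. The sketch is not thread-safe: two threads making the first call at the same time could each run `compile_fn`.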