multi-gpu running error

OPTML-Group / Unlearn-Saliency

[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation" by Chongyu Fan*, Jiancheng Liu*, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu

MIT License

Thank you for this great work.

I ran into a problem. When I generate the original model on one GPU, it runs fine. However, when I run it on two GPUs, I get the following error. Do you know what is going wrong?

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
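Since the message itself suggests CUDA_LAUNCH_BLOCKING=1, here is a minimal sketch of how the run could be repeated with that variable set, so that the reported stack trace should land closer to the call that actually fails. The helper names `debug_env` and `run_debug` are hypothetical, the GPU list "0,1" is an assumption for a two-GPU machine, and the command in the `__main__` block is only a placeholder, not the repository's training command.

```python
import os
import subprocess
import sys


def debug_env(visible_gpus="0,1", base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Synchronous kernel launches: errors are reported at the launching call
    # instead of at some later, unrelated API call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    # Assumption: the two GPUs in question are devices 0 and 1.
    env["CUDA_VISIBLE_DEVICES"] = visible_gpus
    return env


def run_debug(cmd, visible_gpus="0,1"):
    """Run cmd (a list of arguments) in the debugging environment; return its exit code."""
    return subprocess.run(cmd, env=debug_env(visible_gpus), check=False).returncode


if __name__ == "__main__":
    # Placeholder command: substitute the actual training command here.
    placeholder = [
        sys.executable,
        "-c",
        "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])",
    ]
    run_debug(placeholder)
```

Running with blocking launches is slower, so it is probably only worth doing for the one run needed to locate the failing call.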

The problem seems to be caused by this line in models/diffusion.py:

hs = [self.conv_in(x)]

multi-gpu running error #13