OPTML-Group / Unlearn-Saliency

[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation" by Chongyu Fan*, Jiancheng Liu*, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu
https://www.optml-group.com/posts/salun_iclr24
MIT License

multi-gpu running error #13

Closed. FightingFighting closed this issue 4 months ago

FightingFighting commented 4 months ago

Thank you for this great work.

I ran into a problem. When I generate the original model on 1 GPU, it runs fine; however, when I run it on two GPUs, I get the following error. Do you know what is going wrong?

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

The problem seems to be caused by

hs = [self.conv_in(x)]

in models/diffusion.py
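
Because the error is reported asynchronously, the line above may not be the real culprit. The message itself suggests CUDA_LAUNCH_BLOCKING=1; a minimal sketch of setting it from Python follows. It assumes the variable is set before CUDA is initialized (safest: before importing torch), and the helper name `enable_sync_cuda_errors` is hypothetical, not part of this repository.

```python
import os


def enable_sync_cuda_errors(env=os.environ):
    """Ask CUDA to run kernel launches synchronously.

    With synchronous launches, the Python stack trace should point at the
    call that actually failed. This must run before CUDA is initialized
    (assumption: nothing has touched the GPU yet in this process).
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


enable_sync_cuda_errors()
# import torch  # import torch only after the variable has been set
```

Synchronous launches slow execution down, so this is probably best kept to debugging runs.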

a-F1 commented 4 months ago

Thank you for your interest in and support of our work! Unfortunately, our code does not currently support multiple GPUs.

Our Stable Diffusion model uses the CompVis format. I recommend checking https://github.com/CompVis/stable-diffusion for multi-GPU solutions.
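
Since the single-GPU run works, one possible workaround on a multi-GPU machine is to expose only one device to the process. The sketch below assumes the script would otherwise pick up every visible GPU; `pin_to_single_gpu` is a hypothetical helper, not part of this repository.

```python
import os


def pin_to_single_gpu(gpu_index=0, env=os.environ):
    """Expose only one physical GPU to this process.

    Must run before CUDA is initialized. Inside the process, the chosen
    GPU then appears as device 0.
    """
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


pin_to_single_gpu(0)
# import torch  # import torch only after the variable has been set
```

Setting the same variable in the shell before launching the script should have the same effect.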