limuloo / PyDIff

[IJCAI 2023 ORAL] "Pyramid Diffusion Models For Low-light Image Enhancement" (Official Implementation)

Multiple GPUs Training Problem #14

jinzi98 closed this issue 8 months ago

jinzi98 commented 10 months ago

Hi, Zhou. I have the same GPU setup as you (2 × 32 GB in one machine), so I want to train with two GPUs.

Before that, I tried inference with the command you gave (CUDA_VISIBLE_DEVICES=0 python pydiff/train.py -opt options/infer.yaml), and it ran successfully.

So I just changed the command to CUDA_VISIBLE_DEVICES=0,1 python pydiff/train.py -opt options/train_v1.yaml, but it failed with this error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat1 in method wrapper_addmm). How can I solve this error? Thanks!

limuloo commented 8 months ago

I have released the training code and tested it on multiple GPUs. Please try it again.
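
For anyone who hits the same RuntimeError, a quick check of where the model's weights and the input batch actually live can help narrow it down. The snippet below is only a generic PyTorch sketch, not code from this repository: `devices_in_use` is a hypothetical helper, and the small `nn.Linear` stands in for the real network. A linear layer with a bias is typically computed through `addmm`, where `mat1` is the input, so the message above probably means an input tensor sat on a different GPU than the layer's parameters.

```python
import torch
import torch.nn as nn


def devices_in_use(module, *tensors):
    """Collect every device that holds the module's parameters, buffers,
    or any of the extra tensors passed in (hypothetical helper)."""
    found = {p.device for p in module.parameters()}
    found |= {b.device for b in module.buffers()}
    found |= {t.device for t in tensors}
    return found


# Pick one target device; fall back to CPU when no GPU is visible.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-in for the real network and a real input batch (assumption).
model = nn.Linear(4, 2).to(device)
x = torch.randn(3, 4).to(device)

# More than one entry here would reproduce the "same device" error.
used = devices_in_use(model, x)
assert len(used) == 1, f"tensors are spread over several devices: {used}"

y = model(x)  # safe: weights, bias, and input share a device
```

If the assertion fails in a real training script, the printed set shows which devices are involved, which is usually enough to find the tensor that was created without being moved to the model's device.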