GPU training - Githubissues

wenyang001 commented 5 months ago

How many GPUs did you train your model? Is your model supported for multi-GPUs training as I saw the bash size of bsds was set to only 2?

GuHuangAI commented 5 months ago

@wenyang001

Yes, our model supports multi-GPU training. You can run accelerate config to set the configuration of multi-GPU training.
The results in our paper are trained with a single 3080Ti GPU whose memory is only 12GB. In this way, the batch size can only set to 2. To use larger batch size with the limited GPU memory, we utilize the gradient accumulate trick. We set the gradient_accumulate_every to 8 in the config file, which means that the model's forward function is called 8 times before the parameters are updated. Thus, the real batch size is 2*8. Hope can help you.

wenyang001 commented 5 months ago

Thanks for your reply.

GuHuangAI / DiffusionEdge