ChongjianGE opened 4 years ago
Did you modify any of the code? Which GPUs are you using (e.g. V100*8)? How long did the 1000-epoch training take?
V100*8 GPUs are used in my experiments. Training for 1000 epochs with a batch size of 24 takes about 2 days.
Hi @ChongjianGE, we trained the model with 2* V100 GPUs; I think using more GPUs with a larger batch size may cause a difference. Due to the instability of GAN training, you can also retry a few times to avoid a bad run.
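One common heuristic when moving from 2 to 8 GPUs is the linear learning-rate scaling rule: scale the learning rate in proportion to the global batch size. This is a hypothetical sketch, not something confirmed by this repository; the function name and the example values are illustrative only.

```python
def scale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling rule: scale the learning rate in proportion
    to the change in global batch size (illustrative helper, not
    part of this repository)."""
    return base_lr * new_batch_size / base_batch_size

# Example: going from a global batch size of 24 on 2 GPUs to 96 on 8 GPUs
# (batch sizes here are assumptions for illustration).
new_lr = scale_lr(0.0002, 24, 96)
print(new_lr)
```

Whether this rule helps for a GAN is not guaranteed; generator and discriminator learning rates may need to be tuned separately.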
@ChongjianGE Hi~ Do you find proper hyper-parameters to train the network on 8GPUs?
Hi @DragonZzzz, I didn't try training with 8 GPUs any further.
@ChongjianGE Have you reproduced the results by training? I'm training the network on 2 GPUs, but the training speed is very low.
> @ChongjianGE Have you reproduced the results by training? I'm training the network on 2 GPUs, but the training speed is very low.

Hi, how long did training take with 2 GPUs?
Hi @menyifang, thanks for the great work. Recently, I trained the model on 8 GPUs. Unfortunately, the results seem worse than with 2 GPUs. The performance after 1000 epochs is shown below. I wonder if you have any suggestions or explanations for the performance gap between 8 GPUs and 2 GPUs?