ChongjianGE opened 4 years ago
Did you modify any of the code? Which GPUs are you using (e.g. V100*8)? How long did the 1000-epoch training take?
V100*8 GPUs are used in my experiments. Training for 1000 epochs with a batch size of 24 takes about 2 days.
Hi @ChongjianGE, we trained the model with 2* V100 GPUs; I think using more GPUs with a larger batch size may cause a difference. Due to the instability of GAN training, you can also retry a few times to avoid a bad run.
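One common heuristic when moving from 2 to 8 GPUs is the linear learning-rate scaling rule: scale the learning rate in proportion to the global batch size. This is a hypothetical sketch, not something confirmed by this repository; the function name and the example values are illustrative only.

```python
def scale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling rule: scale the learning rate in proportion
    to the change in global batch size (illustrative helper, not
    part of this repository)."""
    return base_lr * new_batch_size / base_batch_size

# Example: going from a global batch size of 24 on 2 GPUs to 96 on 8 GPUs
# (batch sizes here are assumptions for illustration).
new_lr = scale_lr(0.0002, 24, 96)
print(new_lr)
```

Whether this rule helps for a GAN is not guaranteed; generator and discriminator learning rates may need to be tuned separately.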
@ChongjianGE Hi~ Do you find proper hyper-parameters to train the network on 8GPUs?
Hi @DragonZzzz, I didn't try training with 8 GPUs any further.
@ChongjianGE Have you reproduced the results by training? I'm training the network on 2 GPUs, but the training speed is very low.
> @ChongjianGE Have you reproduced the results by training? I'm training the network on 2 GPUs, but the training speed is very low.

Hi, how long did training take with 2 GPUs?
Hi @menyifang, thanks for the great work. Recently, I trained the model on 8 GPUs. Unfortunately, the results seem worse than with 2 GPUs. The performance after 1000 epochs is shown below. I wonder if you have any suggestions or explanations for the performance gap between 8 GPUs and 2 GPUs?