Dear authors,

Thank you for sharing the code. In the paper you mention that Cal-GAN is trained on 4 GPUs, and in the repo you kindly provide multi-GPU training instructions. However, I ran into several issues, starting with the BatchNorm layer, which does not support DDP as-is. After replacing BatchNorm with SyncBatchNorm, I was able to launch the data-parallel scheme, but GPU utilization is very low (1-2%). So my question is:

Is it possible to launch your implementation with the Distributed Data Parallel (DDP) launcher, as described in your manual?
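For reference, here is a minimal sketch of the workaround I applied. The helper name `prepare_for_ddp` and the `local_rank` handling are mine, not from the Cal-GAN code — this is just to show what I changed:

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def prepare_for_ddp(model: nn.Module, local_rank: int) -> nn.Module:
    """Convert BatchNorm layers to SyncBatchNorm and wrap the model in DDP.

    Assumes torch.distributed.init_process_group() has already been
    called by the launcher (e.g. torchrun).
    """
    # Recursively replaces every nn.BatchNorm* module with nn.SyncBatchNorm,
    # so running statistics are synchronized across processes.
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
    model = model.cuda(local_rank)
    # find_unused_parameters=True is often needed for GANs, where only
    # part of the graph participates in a given generator/discriminator step.
    return DDP(model, device_ids=[local_rank], find_unused_parameters=True)
```

I launched this with `torchrun --nproc_per_node=4` (script name omitted), which is where I observed the 1-2% GPU utilization.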
Thank you!