Open JanineCHEN opened 4 years ago
What was the solution?
Hi, it was the residual batch that caused the problem. You can either set drop_last when constructing the dataloader, or increase the number of epochs to avoid using the last batch.
Well, neither of those solved this error on my side. I get this error when I set num_G_accumulations or num_D_accumulations to more than 2.
I used drop_last and it works. I am using 4 GPUs and a batch size of 52.
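For anyone landing here: drop_last is a standard argument of `torch.utils.data.DataLoader`. A minimal sketch of the fix described above (the dataset size of 1000 is hypothetical; the batch size of 52 is taken from the comment above):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset: 1000 samples with batch size 52, so
# 1000 % 52 = 12 leftover samples form a smaller residual batch.
dataset = TensorDataset(torch.zeros(1000, 3))

sizes = [len(batch[0]) for batch in DataLoader(dataset, batch_size=52)]
print(sizes[-1])  # 12 -- the residual batch that triggers the error

# drop_last=True discards that residual batch, so every batch
# the model sees has the full size of 52.
loader = DataLoader(dataset, batch_size=52, drop_last=True)
print(all(len(batch[0]) == 52 for batch in loader))  # True
```

The trade-off is that the leftover samples are skipped each epoch, which is usually negligible compared to the dataset size.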
Hey, I am a student trying to reproduce the training process with my own dataset. I got the following error right after the first epoch of training finished:
I execute the training using
sh scripts/launch_BigGAN_bs256x8.sh
with my own dataset; the dataset was converted to HDF5 format without any errors. This is the content of the launch_BigGAN_bs256x8.sh I used:

I am not sure whether this has something to do with the size of my dataset or the number of classes. If so, how should I adjust the parameters? Or does anyone have another idea why this issue arises and how to tackle it? Any help would be very much appreciated! Thanks a bunch in advance.