ajbrock / BigGAN-PyTorch

The author's officially unofficial PyTorch BigGAN implementation.
MIT License
2.84k stars · 470 forks

D loss is zero from the start #66

Closed entrpn closed 4 years ago

entrpn commented 4 years ago

I'm trying to train with my own dataset, but the training starts with a D loss of zero for both real and fake, for example:

2219/15313 ( 14.48%) (TE/ET1k: 159:38 / 56:17) ^Cr: 2218, G_loss : +0.754, D_loss_real : +0.000, D_loss_fake : +0.000

I'm running the following:

nohup python train.py --dataset I128_hdf5 --shuffle --num_workers 1 --batch_size 32 --num_G_accumulations 4 --num_D_accumulations 4 --num_D_steps 1 --G_lr 1e-4 --D_lr 1e-4 --D_B2 0.999 --G_B2 0.999 --G_attn 64 --D_attn 64 --G_nl inplace_relu --D_nl inplace_relu --SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 --G_ortho 0.0 --G_shared --G_init ortho --D_init ortho --hier --dim_z 512 --shared_dim 128 --G_eval_mode --G_ch 96 --D_ch 96 --num_epochs=5000 --ema --use_ema --ema_start 20000 --test_every 2000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 --use_multiepoch_sampler &

Any idea what I'm doing wrong here? Thanks!

yuesongtian commented 4 years ago

I hit the same problem. It is caused by the discriminator projection. Change the output of the discriminator from out = out + torch.sum(self.embed(y) * h, 1, keepdim=True) to out = out. This workaround alleviates the issue, but the convergence speed is not comparable to the original BigGAN.
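For context, the line being edited is the projection-discriminator output head (Miyato & Koyama's cGAN projection), where a class-conditional inner product is added to the unconditional logit. Below is a minimal, self-contained sketch of that head; the module and names (ProjectionHead, feat_dim) are illustrative, not the repo's actual class layout:

```python
import torch
import torch.nn as nn

class ProjectionHead(nn.Module):
    """Sketch of a projection-discriminator output head.
    Hypothetical minimal module, not BigGAN-PyTorch's actual Discriminator."""
    def __init__(self, n_classes, feat_dim):
        super().__init__()
        self.linear = nn.Linear(feat_dim, 1)            # unconditional logit
        self.embed = nn.Embedding(n_classes, feat_dim)  # per-class embedding

    def forward(self, h, y):
        # h: pooled discriminator features, shape (batch, feat_dim)
        # y: integer class labels, shape (batch,)
        out = self.linear(h)
        # Projection term -- the line the workaround above removes.
        out = out + torch.sum(self.embed(y) * h, 1, keepdim=True)
        return out

head = ProjectionHead(n_classes=10, feat_dim=16)
h = torch.randn(4, 16)
y = torch.randint(0, 10, (4,))
print(head(h, y).shape)  # torch.Size([4, 1])
```

Note that if every y is the same value (e.g. all 0, as diagnosed later in this thread), the projection term degenerates to a single fixed embedding and conditioning carries no class information.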

entrpn commented 4 years ago

Thank you, that worked. At least I'm getting some values in there now.

yuesongtian commented 4 years ago

Hi, the discriminator projection is not the root cause. Training is hampered by an incorrectly produced HDF5 file. Print the labels of the real images and you will find they are all 0. Re-produce the HDF5 file correctly and training will be normal. Best regards.
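A quick way to run this check is to read the labels out of the HDF5 file and count the unique classes. The helper below (label_stats is a hypothetical name) does the counting on a plain array; the commented snippet shows how it could be applied to the file, assuming the 'labels' dataset key that BigGAN-PyTorch's HDF5 conversion writes (adjust if your file differs):

```python
import numpy as np

def label_stats(labels):
    """Return (unique labels, counts, broken) for a label array.
    broken is True when only one class is present, e.g. all zeros."""
    uniq, counts = np.unique(np.asarray(labels), return_counts=True)
    broken = len(uniq) == 1
    return uniq, counts, broken

# To inspect an HDF5 dataset file (key name assumed; check your file):
#   import h5py
#   with h5py.File('ILSVRC128.hdf5', 'r') as f:
#       uniq, counts, broken = label_stats(f['labels'][...])

uniq, counts, broken = label_stats([0, 0, 0, 0])
print(broken)  # True
```

If broken comes back True on a multi-class dataset, regenerate the HDF5 file before blaming the model.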


entrpn commented 4 years ago

Thanks. I'll give it a shot.

Baran-phys commented 3 years ago

I did not understand the solution. Did removing the discriminator projection actually solve the underlying problem? Just seeing nonzero numbers is not a solution in itself; it may still hamper training. My D loss for both fake and real drops to 0 after the first 20 iterations. What is the full list of solutions? Thank you.
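As for why the log shows exactly 0.000: BigGAN-PyTorch trains D with the hinge loss, and both hinge terms clamp to exactly zero as soon as D's logits clear the margin (real logits >= 1, fake logits <= -1), i.e. as soon as D trivially separates real from fake. A pure-Python sketch of that loss (function name is illustrative):

```python
def d_hinge_losses(d_real, d_fake):
    """Hinge discriminator losses:
    loss_real = mean(max(0, 1 - D(x_real)))
    loss_fake = mean(max(0, 1 + D(G(z))))"""
    loss_real = sum(max(0.0, 1.0 - x) for x in d_real) / len(d_real)
    loss_fake = sum(max(0.0, 1.0 + x) for x in d_fake) / len(d_fake)
    return loss_real, loss_fake

# D confidently separates real (logits >= 1) from fake (logits <= -1):
# both terms clamp to exactly 0, matching the log line in this issue.
lr, lf = d_hinge_losses([2.5, 3.0], [-4.0, -1.2])
print(lr, lf)  # 0.0 0.0
```

So an exactly-zero D loss from early on usually means D has found a trivial shortcut (here, the degenerate labels in the broken HDF5 file), not that the loss computation is buggy.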