Concern about the validation of training strategy

Natsu6767 / DCGAN-PyTorch

PyTorch Implementation of DCGAN trained on the CelebA dataset.

103 stars, 32 forks

Concern about the validation of training strategy #2

Open lyf1212 opened 1 year ago

lyf1212 commented 1 year ago

Are you sure you are training your model correctly? In my view, the loss jitters too much. Could you spend a little time fixing it?

rafa-br34 commented 2 months ago

I have the same problem here. I wrote my code based on the PyTorch example and it still doesn't work. From my limited understanding, the discriminator loss should always be quite low but never 0 (otherwise the generator can't learn), and it should slowly increase as the generator improves. For example, here's the graph from the original PyTorch tutorial: [image: sphx_glr_dcgan_faces_tutorial_002]. Looks perfect, right? Welp, here's what I get: [image]. I don't know what to do.....
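For reference, here is a small sketch of the loss values involved. It assumes the binary cross-entropy setup used in the PyTorch DCGAN tutorial (discriminator loss = real term + fake term, generator loss = BCE of the fake output against the "real" label). The helper names `bce` and `dcgan_losses` are made up for illustration and do not come from this repo.

```python
import math

def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for a single probability and a single 0/1 label."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def dcgan_losses(d_real, d_fake):
    """Return (loss_d, loss_g) given the discriminator's outputs.

    d_real: D(x), the probability assigned to a real image being real.
    d_fake: D(G(z)), the probability assigned to a generated image being real.
    Assumes the non-saturating generator loss -log(D(G(z))).
    """
    loss_d = bce(d_real, 1.0) + bce(d_fake, 0.0)
    loss_g = bce(d_fake, 1.0)
    return loss_d, loss_g

# Discriminator cannot tell real from fake: both outputs are 0.5.
# loss_d is 2*ln(2) (about 1.386), loss_g is ln(2) (about 0.693).
print(dcgan_losses(0.5, 0.5))

# Discriminator is nearly perfect: loss_d is close to 0, loss_g is large.
print(dcgan_losses(0.99, 0.01))
```

Under these assumptions, a discriminator loss drifting up toward roughly 1.386 means the discriminator is being fooled more often, while a discriminator loss near 0 together with a large generator loss means the discriminator is winning.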

lyf1212 commented 1 month ago

As is well known, GAN training can be confusing and painful. If you still want to try, adjust the learning rate, optimizer hyperparameters, or other settings, and log what you need during training: not only loss values, but also image outputs. But in my opinion, DCGAN has already made remarkable progress, and it's meaningless to retrain it with limited computational resources. FYI.
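On the logging side, one option is to print a smoothed running average next to the raw loss, so the trend stays readable through the jitter. This is only a sketch: the `SmoothedLoss` class and the `beta` value are assumptions for illustration and are not part of this repo or of PyTorch.

```python
class SmoothedLoss:
    """Bias-corrected exponential moving average of a scalar loss."""

    def __init__(self, beta=0.98):
        self.beta = beta   # closer to 1.0 means heavier smoothing
        self.avg = 0.0     # uncorrected running average
        self.step = 0      # number of values seen so far

    def update(self, value):
        """Add one raw loss value and return the smoothed estimate."""
        self.step += 1
        self.avg = self.beta * self.avg + (1.0 - self.beta) * value
        # Correct the bias toward 0 caused by starting the average at 0.
        return self.avg / (1.0 - self.beta ** self.step)


# Usage with made-up loss values that jitter around 1.0:
smooth_d = SmoothedLoss(beta=0.9)
for i, raw in enumerate([1.4, 0.6, 1.3, 0.7, 1.2, 0.8]):
    print(f"iter {i}: raw={raw:.3f} smoothed={smooth_d.update(raw):.3f}")
```

In a real training loop you would call `update` with the scalar loss at each iteration (for a PyTorch tensor, `loss.item()`), and separately save a grid of generated images from a fixed noise batch every few hundred iterations to judge sample quality directly.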

rafa-br34 commented 1 month ago

I agree. The only reason I'm trying to train a DCGAN is that (in my opinion) it seems to be the best way to get hands-on experience with adversarial loss, since most modern architectures use this type of loss (high-fidelity VAEs, for instance). But I can't deny that I've been much more successful at training a barely functional SD model than any GAN so far lol.