odegeasslbc / FastGAN-pytorch

Official implementation of the paper "Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis" (ICLR 2021)
GNU General Public License v3.0

Eval mode in eval.py #4

Closed · atifemreyuksel closed 3 years ago

atifemreyuksel commented 3 years ago

Hi @odegeasslbc,

Thank you for your amazing work. I have a question about why you commented out the line that puts the model into eval mode. If I remember correctly, eval mode is also not used in test.py of the pix2pixHD code.

Is this a good practice for GAN models? If not, what is the specific reason for it? I'm really curious. Also, new random noise is generated for each iteration in eval.py, so the generated images can differ even though the model is in eval mode. Thank you for your answer :)
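
For reference, a minimal sketch of that last point (the placeholder generator and shapes are mine, not eval.py's actual code):

```python
import torch
import torch.nn as nn

netG = nn.Linear(64, 3 * 32 * 32)  # placeholder generator
netG.eval()  # eval() only changes layers like batchnorm/dropout

# Fresh noise is sampled every iteration, so the generated batches
# still differ from one another even with the model in eval mode.
with torch.no_grad():
    for _ in range(3):
        noise = torch.randn(4, 64)
        imgs = netG(noise)
```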

odegeasslbc commented 3 years ago

Hi

In my experiments, I find that eval mode for the generator only changes the behavior of the batchnorm layers (as you might already know): "eval" mode makes a batchnorm layer use the running mean and running std stored during training, while "train" mode uses the mean and std computed from the current batch. Therefore, "eval" mode causes two problems in my case:

  1. Since I use an exponential-moving-average (EMA) optimizer to train the generator, the code implementing this EMA process does not track the "running mean" and "running std" of the batchnorm layers (see https://github.com/odegeasslbc/FastGAN-pytorch/blob/2a2e8d9d13ac01111645502cc903b7bff0031dc4/train.py#L156-L157; a sketch follows after this list). Therefore, when using the generator after training, I can't use eval mode: the generator would produce meaningless images because the batchnorm layers would normalize with the wrong "running mean and std".
  2. Even if I stored a correct "running mean and std" for the batchnorm layers, I still hypothesize that, for a GAN, it is preferable to use the mean and std computed on the fly from the current batch (especially when the batch size at test time is the same as during training), because that is how the generator was trained. I have tested this hypothesis: given the correct "running mean and std", using ".train()" mode consistently gives a slightly better (lower) FID than ".eval()" mode for the generator.
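
For concreteness, here is a minimal sketch of point 1 (not the repo's exact code: the toy `netG`, channel sizes, and batch shape are placeholders). It shows that an EMA kept over `.parameters()` never covers batchnorm's `running_mean`/`running_var`, which are buffers, and why the EMA weights are loaded with the module kept in `.train()` mode:

```python
import torch
import torch.nn as nn

# Toy stand-in for FastGAN's generator: one conv + one batchnorm layer.
netG = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8))

# EMA kept over .parameters() only, mirroring the linked train.py lines.
# running_mean/running_var are *buffers*, not parameters, so this loop
# never produces an EMA version of them.
avg_param_G = [p.detach().clone() for p in netG.parameters()]
for p, avg_p in zip(netG.parameters(), avg_param_G):
    avg_p.mul_(0.999).add_(p.data, alpha=0.001)

# At test time, copy the EMA weights back but keep .train() mode, so each
# batchnorm layer normalizes with the statistics of the current batch
# instead of running estimates that were never tracked alongside the EMA.
for p, avg_p in zip(netG.parameters(), avg_param_G):
    p.data.copy_(avg_p)
netG.train()  # deliberately NOT netG.eval()
with torch.no_grad():
    fake = netG(torch.randn(8, 3, 32, 32))
```

The same `netG.train()` call also covers point 2: the test-time batch statistics then match what the generator saw during training.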
atifemreyuksel commented 3 years ago

Thank you for sharing your experiment results @odegeasslbc.