nupurkmr9 / vision-aided-gan

Ensembling Off-the-shelf Models for GAN Training (CVPR 2022 Oral)
https://www.cs.cmu.edu/~vision-aided-gan/
MIT License

Training on a 7k dataset encounters mode collapse and generator leakage #8

Open 49xxy opened 1 year ago

49xxy commented 1 year ago

Hi! Sorry to contact you so frequently! I'm very interested in your work, but I ran into some problems while trying to reproduce it. I trained on my own 7k-image dataset with the following command:

python train.py --outdir=training-runs --data=datasets/face7k.zip --aug=ada --warmup=5e5 --cfg=paper256_2fmap --gpus=2 --kimg=5000 --batch=16 --snap=25 --cv-loss=multilevel_sigmoid_s --augcv=ada --cv=input-clip-output-conv_multi_level --metrics=none

![Uploading image.png…]()

The probability of the adaptive discriminator augmentation increased rapidly during training, and the quality of the generated samples is currently very poor. Compared with StyleGAN2-ADA, the mode collapse and generator leakage are much more severe. I don't know what details I missed; I hope you can help me! Thank you again for answering my earlier questions!

nupurkmr9 commented 1 year ago

Hi, thanks for your interest in our work. Can you provide more details about the dataset? Is it similar to FFHQ, or is it a more artistic-style dataset? Also, how do the generated images look up to the end of the warmup stage? I am not able to see the uploaded image; it only shows "Uploading image.png".

I have always found a warmup of 5e5 to be sufficient, but if the fake images up to the warmup iteration are quite poor, it would be worth trying a longer warmup stage.
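For reference, a longer warmup only means raising that one flag on your original command, e.g. doubling it to 1e6 (the exact value is a judgment call that depends on the dataset):

python train.py --outdir=training-runs --data=datasets/face7k.zip --aug=ada --warmup=1e6 --cfg=paper256_2fmap --gpus=2 --kimg=5000 --batch=16 --snap=25 --cv-loss=multilevel_sigmoid_s --augcv=ada --cv=input-clip-output-conv_multi_level --metrics=none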

Depending on the dataset, the ADA augmentation strategy might not work well, and switching to DiffAugment might help.

Let me know if these suggestions do not address your issue.

Thanks

49xxy commented 1 year ago

Hello, from the beginning of training up to 1200 kimg the generation quality has hardly improved, and partway through the generator outputs started showing the effects of the data augmentation. [three training-snapshot images attached] Thanks!

nupurkmr9 commented 1 year ago

Hi, it looks like the augmentation probability of the original discriminator is above 1, which usually happens when the discriminator is overfitting. Can you post the logs for the discriminator loss?
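In case it helps, here is a minimal sketch for pulling those curves out of the run directory. It assumes this codebase keeps stylegan2-ada-pytorch's stats.jsonl logging with the Loss/D/loss and Progress/augment keys; the run-directory name below is a placeholder to replace with your own:

import json

# Placeholder path: replace with your actual run directory under training-runs/.
stats_path = 'training-runs/00000-face7k-paper256_2fmap/stats.jsonl'

kimg, d_loss, aug_p = [], [], []
with open(stats_path) as f:
    for line in f:
        entry = json.loads(line)
        # Each line is a JSON dict; scalar stats are stored as {"num": ..., "mean": ..., "std": ...}.
        kimg.append(entry['Progress/kimg']['mean'])
        d_loss.append(entry['Loss/D/loss']['mean'])
        aug_p.append(entry['Progress/augment']['mean'])

# Coarse summary of discriminator loss and ADA probability over training.
for k, d, p in zip(kimg[::10], d_loss[::10], aug_p[::10]):
    print(f'kimg={k:7.1f}  D/loss={d:6.3f}  augment p={p:5.3f}')

If the augment probability keeps climbing while D/loss drops toward zero, that is the usual signature of discriminator overfitting.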

Also, does switching to DiffAugment with the command below help in any way?

python train.py --outdir=training-runs --data=datasets/face7k.zip --diffaugment=color,translation,cutout --warmup=5e5 --cfg=paper256_2fmap --gpus=2 --kimg=5000 --batch=16 --snap=25 --cv-loss=multilevel_sigmoid_s --augcv=diffaugment-color,translation,cutout --cv=input-clip-output-conv_multi_level --metrics=none

Another thing that might be useful in this case is to resume from an FFHQ pre-trained model.
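For example, assuming the --resume option is inherited from stylegan2-ada-pytorch (it accepts a checkpoint .pkl path, and in that codebase also shortcut names such as ffhq256 for the 256-resolution FFHQ model), the command would look like:

python train.py --outdir=training-runs --data=datasets/face7k.zip --aug=ada --warmup=5e5 --cfg=paper256_2fmap --gpus=2 --kimg=5000 --batch=16 --snap=25 --cv-loss=multilevel_sigmoid_s --augcv=ada --cv=input-clip-output-conv_multi_level --metrics=none --resume=ffhq256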

Thanks.

49xxy commented 1 year ago

Hello!
1) This is some data from my previous experiment: [screenshot attached]
2) I am running into this problem when training with DiffAugment: [screenshot attached]
Thanks!

nupurkmr9 commented 1 year ago

Hi, sorry for the delayed response. There was a typo in the above command for training with DiffAugment. It should be:

python train.py --outdir=training-runs --data=datasets/face7k.zip --diffaugment=color,translation,cutout --warmup=5e5 --cfg=paper256_2fmap --gpus=2 --kimg=5000 --batch=16 --snap=25 --cv-loss=multilevel_sigmoid_s --augcv=diffaug-color,translation,cutout --cv=input-clip-output-conv_multi_level --metrics=none

Also, does finetuning from an FFHQ-trained model with vision-aided-loss help?