Status: Open. Opened by ghost 3 years ago.
I think this is a normal phase early in training, when the generator's initial progress is not yet reflected in the FID score.
I don't think the black patches associated with the FID increase are normal. Also, initially, FID scores go down (baseline FID for this is 375):
network-snapshot-000000 time 1m 29s fid5k-train 329.7226
network-snapshot-000001 time 1m 28s fid5k-train 407.1180
What do the black patches look like? It could also be that the discriminator does not learn well at the beginning of training, so you can try training only the discriminator for a few iterations.
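To make the "discriminator only for a few iterations" suggestion concrete, here is a minimal sketch of how the update schedule could be gated. It is not taken from any particular training script: `update_plan`, `run`, and the `d_only_steps` parameter are hypothetical names, and `train_d_step` / `train_g_step` stand in for whatever the real training loop does.

```python
def update_plan(step, d_only_steps=100):
    """Return (update_d, update_g) for a given training step.

    d_only_steps is a hypothetical hyperparameter: during the first
    d_only_steps iterations only the discriminator is updated.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    update_d = True                    # the discriminator trains at every step
    update_g = step >= d_only_steps    # the generator joins after the warm start
    return update_d, update_g


def run(total_steps, d_only_steps, train_d_step, train_g_step):
    """Loop skeleton; the two callbacks are placeholders for real update code."""
    for step in range(total_steps):
        update_d, update_g = update_plan(step, d_only_steps)
        if update_d:
            train_d_step(step)
        if update_g:
            train_g_step(step)
```

How many discriminator-only iterations to use is a guess that would need tuning. If the discriminator is already overfitting, giving it a head start could make things worse.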
Initially there are black patches, and afterwards there is what looks to me like early mode collapse:
Hard to say the exact reason... But I don't think the short spike will affect the final performance; it is understandable that training can be pretty random at the beginning.
My final FID scores are too high and my final images are too blurry on this dataset. On a different dataset I did not experience this issue at all and the images came out great. So I really do think I'm losing most of my filters.
What does your dataset look like? This seems to me like a more severe discriminator overfitting issue.
Each image looks something like this except the heights of the models are all the same:
And yes, I agree, it does seem like a severe discriminator overfitting issue -- how do I prevent it from overfitting?
Looks like a challenging dataset... I think the model will not learn well if there are only hundreds of such images. DiffAugment can reduce discriminator overfitting to some degree, and it may be possible to reduce the problem further by making the augmentations stronger and adding other augmentations (e.g. resize).
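As an illustration of what a "resize" augmentation could do geometrically, here is a rough plain-Python sketch that shrinks an image by a random factor and zero-pads it back to its original size. This is not DiffAugment code: `random_downscale_pad` and `min_scale` are hypothetical names. A version usable in training would have to be written with differentiable tensor operations and applied the same way to both real and generated images.

```python
import random


def random_downscale_pad(image, min_scale=0.5, rng=random):
    """Shrink a square 2D image (a list of rows) by a random factor using
    nearest-neighbour sampling, then zero-pad it back to the original
    size at a random offset. Illustrative only, not differentiable."""
    size = len(image)
    scale = rng.uniform(min_scale, 1.0)
    new = max(1, round(size * scale))      # side length after shrinking
    # Nearest-neighbour downsampling: map each target pixel to a source pixel.
    small = [
        [image[(y * size) // new][(x * size) // new] for x in range(new)]
        for y in range(new)
    ]
    # Paste the shrunken image at a random position on a zero canvas.
    oy = rng.randint(0, size - new)
    ox = rng.randint(0, size - new)
    out = [[0] * size for _ in range(size)]
    for y in range(new):
        for x in range(new):
            out[oy + y][ox + x] = small[y][x]
    return out
```

A lower `min_scale` gives a stronger augmentation. Whether that helps on a dataset of only a few hundred images would have to be checked empirically.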
This is a plot of FID scores over time in ticks:
I'm wondering what is causing the initial spike upward.
I've tried a bunch of things to avoid it and the only thing that seems to work is setting the learning rate ridiculously low (1e-7).
The FID score increase appears to be associated with black patches in the images and a massive loss of filters in accordance with https://arxiv.org/abs/1908.03265.
The authors of the linked paper suggest a warmup schedule to avoid this but no warmup schedule seems to prevent the FID score increase.
Please advise.
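For context, the kind of warmup schedule meant here is a linear ramp of the learning rate over the first training steps, roughly as in the sketch below. `warmup_lr`, `base_lr`, and `warmup_steps` are illustrative names and values, not settings from the actual run.

```python
def warmup_lr(step, base_lr=2e-3, warmup_steps=1000):
    """Linearly ramp the learning rate from base_lr / warmup_steps up to
    base_lr over the first warmup_steps steps, then hold it constant.

    base_lr and warmup_steps are assumed values for illustration.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


# Example: print the schedule at a few steps.
if __name__ == "__main__":
    for s in (0, 249, 499, 999, 5000):
        print(s, warmup_lr(s))
```

In a real training loop the returned value would be written into the optimizer's learning rate before each step. As noted above, schedules of this kind did not remove the FID spike in this case.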