Closed timendez closed 2 years ago
What you are seeing are the cropout augmentations. For RGB images, the cropout rectangles are black (0, 0, 0): https://github.com/lucidrains/lightweight-gan#basic-usage
For RGBA images, the cropout rectangles are transparent.
It might help to reduce the augmentation probability, e.g. by setting --aug_prob to a lower value, although I have not tried it and do not know which values work well.
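For instance, something like the following, assuming the flags documented in the repo's README (`--aug-prob`, `--aug-types`); the data path and the value 0.1 are placeholders, not tested recommendations:

```shell
# Lower the per-batch probability that an augmentation is applied.
# ./images and 0.1 are placeholder values.
lightweight_gan --data ./images --aug-prob 0.1 --aug-types "[cutout,translation]"
```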
Another point of reference: I trained on a dataset of 2500 natural images of size 64x64 with default settings. After 30k iterations, the cropout augmentations had already become apparent in the generated samples, so less augmentation is clearly needed. For me, though, that would probably lead to mode collapse because the dataset is relatively small. Your dataset is much larger, so it might work.
Thank you for the explanation @99991! That makes a lot of sense.
It seems unusual to me that [cutout,translation] are enabled by default. Would you happen to know why, and what the repercussions of removing all augmentations are? Do they simply add variety to the outputs? I have no need to generate images with cutouts or translations, so my plan is to run with no augmentations.
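If you do try training without them, a sketch of the invocation, assuming the CLI flags from the README (I have not verified that --aug-prob 0 fully disables the augmentations, and the data path is a placeholder):

```shell
# Setting the augmentation probability to 0 should effectively
# disable the cutout/translation augmentations.
lightweight_gan --data ./sprites --image-size 256 --aug-prob 0
```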
> General recommendation is using suitable augs for your data and as many as possible, then after sometime of training disable most destructive (for image) augs.
This part of the README makes me think that they do help a lot with training, and that disabling them later on can help the training get "back on track" for more realistic output, while still being able to use the benefits of variety offered by augmentations. Does that sound right?
The problem is that with too little data, the variety of the generated images will "collapse" eventually. With augmentations, the size of the dataset is artificially increased, which seems to work around this problem in practice, but the augmentations could leak into the generated images.
For more details, see for example "Differentiable Augmentation for Data-Efficient GAN Training" https://arxiv.org/pdf/2006.10738.pdf
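To make the leak concrete, here is a minimal NumPy stand-in for a cutout-style augmentation (not the repo's actual implementation; patch size and names are illustrative):

```python
import numpy as np

def random_cutout(img, size=8, rng=None):
    """Zero out a random square patch, as a cutout-style augmentation would.
    In an RGB array the patch becomes black (0, 0, 0); in RGBA it becomes
    fully transparent."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    out = img.copy()
    out[y:y + size, x:x + size] = 0
    return out

# If the discriminator only ever sees augmented images, the generator can
# learn that such zeroed patches are "normal", and they leak into samples.
img = np.full((64, 64, 3), 255, dtype=np.uint8)  # plain white test image
aug = random_cutout(img, size=8, rng=np.random.default_rng(0))
```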
I had more success with StyleGAN2-ADA-PyTorch because it worked well right out of the box without parameter tuning.
I tried StyleGAN2-ADA-PyTorch based off of your suggestion and also had great success!
Thank you so much @99991 you've saved me a lot of time!
Hello! Training on Google Colab with
I'm at 250k iterations over the course of 5 days at 2s/it, and have gotten strange results with boxes.
I've circled some examples of this below.
My training data is 22k 256x256 .png images that do not contain large hard edges or boxes like this. They are video game sprites, with hard edges limited to at most 10x10 px.
Are there any argument settings you would suggest to decrease the chance of the model learning that transparent boxes are good? Would converting to a white background help?
Thank you!
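For anyone who wants to try the white-background idea, a simple alpha-composite onto white can be sketched like this (NumPy version for illustration; in practice you would do the same per file with an image library):

```python
import numpy as np

def flatten_to_white(rgba):
    """Alpha-composite an RGBA uint8 image onto a white background,
    returning an RGB uint8 image."""
    rgb = rgba[..., :3].astype(np.float32)
    alpha = rgba[..., 3:4].astype(np.float32) / 255.0
    out = rgb * alpha + 255.0 * (1.0 - alpha)
    return out.round().astype(np.uint8)

# Fully transparent pixels become white; opaque pixels keep their color.
rgba = np.zeros((4, 4, 4), dtype=np.uint8)  # transparent black image
rgba[0, 0] = [255, 0, 0, 255]               # one opaque red pixel
flat = flatten_to_white(rgba)
```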