Closed: erdavis1 closed this issue 3 years ago
@erdavis1 you are working in the realm of small data, so I would suggest turning on augmentation! You can do so with --aug-prob 0.25
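Roughly, --aug-prob controls how often a batch is augmented before the discriminator sees it. A minimal sketch of that gating, assuming a standard PyTorch setup; `maybe_augment` and `augment_fn` are made-up names for illustration, not the library's API:

```python
import torch

def maybe_augment(images, augment_fn, aug_prob=0.25):
    # With probability aug_prob, pass the batch through an augmentation
    # pipeline before it reaches the discriminator; otherwise leave it alone.
    # `augment_fn` is a stand-in for whatever transforms are configured.
    if torch.rand(()).item() < aug_prob:
        return augment_fn(images)
    return images
```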
@erdavis1 there actually isn't any contrastive regularization in unet-stylegan2; I removed it since it doesn't fit within the unet discriminator framework
@lucidrains Thanks! I'm very new to this, if it isn't obvious.
I tried the following and also got pretty quick mode collapse.
!unet_stylegan2 --data jpegs/ --attn-layers [1,2] --aug-prob 0.25
9k iterations:
I feel like drastically increasing the batch size has given me the best results of everything I've tried, but I'm really struggling to find a balance where the results are good and diverse without taking >15 sec per iteration to train.
Are there any other tricks to consider, aside from batch size?
Have you tried plain stylegan2, without the unet? github.com/lucidrains/stylegan2-pytorch
I have, with very similar results.
@erdavis1 try with augmentation, but with an effective batch size of 64. That means a batch size of 32 and a gradient-accumulate-every of 2.
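For context on "effective batch size": gradient accumulation sums gradients over several smaller forward/backward passes before a single optimizer step, so a batch size of 32 with --gradient-accumulate-every 2 updates on gradients averaged over 64 images. A generic PyTorch sketch of the pattern, not the library's actual training loop; `train_step` and its arguments are placeholders:

```python
import torch

def train_step(model, opt, loader_iter, loss_fn, gradient_accumulate_every=2):
    # Each sub-batch from loader_iter holds `batch_size` images, so the
    # effective batch size is batch_size * gradient_accumulate_every.
    # Gradients accumulate in .grad across backward() calls until opt.step().
    opt.zero_grad()
    for _ in range(gradient_accumulate_every):
        images = next(loader_iter)
        loss = loss_fn(model(images)) / gradient_accumulate_every
        loss.backward()
    opt.step()
```

The trade-off is wall-clock time: each optimizer step costs gradient_accumulate_every forward/backward passes, while GPU memory only has to hold one sub-batch at a time.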
Thanks! I'll try running that today and report back.
The following ended up in a very quick mode collapse, unfortunately:
unet_stylegan2 --data jpegs/ --attn-layers [1,2] --batch-size 32 --gradient-accumulate-every 2 --aug-prob 0.25
1k:
20k:
For my own knowledge: what differences would you expect to see between a model trained with an effective batch size of 64 (batch size 32, gradient accumulate every 2) and one trained with the same effective batch size of 64 (batch size 8, gradient accumulate every 8)?
@erdavis1 I updated the augmentation code in v0.5.0 if you'd like to give that a try: --aug-prob 0.25 --aug-types [translation,cutout,color]
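For reference, the three --aug-types named here correspond roughly to DiffAugment-style transforms applied to the images the discriminator sees. A much-simplified sketch of each, using per-batch rather than per-image randomness and reducing "color" to a brightness jitter; this is illustrative only, not the library's implementation:

```python
import torch
import torch.nn.functional as F

def rand_translation(x, ratio=0.125):
    # Pad the batch, then crop a randomly offset window of the original size.
    _, _, h, w = x.shape
    pad_h, pad_w = int(h * ratio), int(w * ratio)
    x = F.pad(x, (pad_w, pad_w, pad_h, pad_h))
    top = torch.randint(0, 2 * pad_h + 1, (1,)).item()
    left = torch.randint(0, 2 * pad_w + 1, (1,)).item()
    return x[:, :, top:top + h, left:left + w]

def rand_cutout(x, ratio=0.5):
    # Zero out one random square patch (same location for the whole batch).
    _, _, h, w = x.shape
    ch, cw = int(h * ratio), int(w * ratio)
    top = torch.randint(0, h - ch + 1, (1,)).item()
    left = torch.randint(0, w - cw + 1, (1,)).item()
    x = x.clone()
    x[:, :, top:top + ch, left:left + cw] = 0
    return x

def rand_color(x, strength=0.5):
    # Reduced here to a random brightness shift shared by the whole batch.
    return x + (torch.rand(()) - 0.5) * strength
```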
Good news and bad news.
Using the following, I experienced less mode collapse!
unet_stylegan2 --data jpegs/ --attn-layers [1,2] --batch-size 32 --gradient-accumulate-every 2 --aug-prob 0.25 --aug-types [translation,cutout,color]
It partially collapsed around 18k:
and fully collapsed around 21k:
The images do seem to be a bit more smeary/blurry than in the test with a higher gradient-accumulate-every, but I'm not sure whether that's down to the lower gradient-accumulate-every or to the augmentation.
This is a big improvement, and it ran pretty quickly at ~2 sec/iteration. I'm wondering whether, if I slowly edge up the effective batch size, I'll hit a level where there's less collapse and clearer results without sacrificing too much speed. I'm in no rush, and if I get nice results I'll wait as long as I need to.
@erdavis1 get more data! lol
Haha, if only I could!
I ended up being pretty happy with the results of this run, even though it was fairly slow (~7 sec/iter when Colab gave me a V100).
!unet_stylegan2 --data jpegs/ --attn-layers [1,2] --batch-size 32 --gradient-accumulate-every 8 --aug-prob 0.25 --aug-types [translation,cutout,color]
Closing the issue since I'm satisfied with the output.
17k:
🚀🚀
Hi! First off, thank you for being so engaged and active with folks' issues.
I'm training on a dataset of 3k Japanese flag logos, plus their 90-, 180-, and 270-degree rotations, for 12k images total.
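For anyone reproducing this kind of dataset expansion, a quick Pillow sketch of pre-generating the rotated copies; the source folder name is made up, and only `jpegs/` matches the commands in this thread:

```python
from pathlib import Path
from PIL import Image

src = Path("logos_raw")   # hypothetical folder holding the original 3k logos
dst = Path("jpegs")       # the folder later passed to --data
dst.mkdir(exist_ok=True)

for path in src.glob("*.jpg"):
    img = Image.open(path)
    for angle in (0, 90, 180, 270):
        # expand=True keeps the whole image when width != height
        img.rotate(angle, expand=True).save(dst / f"{path.stem}_{angle}.jpg")
```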
I generally experience very quick mode collapse, often within the first 5k iterations. The one result that it does produce is pretty nice, though!
unet_stylegan2 --data jpegs/ --attn-layers [1,2]
39k iterations:
Increasing the batch size and gradient-accumulate-every has reduced it to partial mode collapse. The catch is that I'm training on Colab, and this can be unworkably slow depending on the GPU (30 sec per iteration)
unet_stylegan2 --data jpegs/ --batch-size 32 --gradient-accumulate-every 8 --cl-reg --attn-layers [1,2]
24k iterations:
Do you have any tuning suggestions for increasing speed and avoiding mode collapse? The results at 24k (above) are promising, but definitely need a lot more refining!