lucidrains / unet-stylegan2

A PyTorch implementation of StyleGAN2 with a U-Net discriminator
MIT License

Problems with parameters in face generation #1

Open rockdrigoma opened 4 years ago

rockdrigoma commented 4 years ago

Just tried to train a model on a 3.5K-face dataset, but the results look strange. My command was:

unet_stylegan2 --batch_size 4 --network_capacity 32 --gradient_accumulate_every 8 --aug_prob 0.25 --attn-layers [1,2]

This is at epoch 14 (iteration 14000):

14

Any suggestions to improve this, or should I just wait a bit longer? I had better results at this point with regular stylegan2.

lucidrains commented 4 years ago

@rockdrigoma ohh darn, what are the values for D: and G: at this point?

lucidrains commented 4 years ago

i have to admit, this repository is a bit experimental, and i'm not sure if it will work for datasets with less than 10k samples

rockdrigoma commented 4 years ago

> @rockdrigoma ohh darn, what are the values for D: and G: at this point?

9% 13950/149000 [9:41:20<95:32:18, 2.55s/it]G: 0.84 | D: 0.71 | GP: 0.36 | PL: 0.50 | CR: 0.00 | Q: 0.00

lucidrains commented 4 years ago

@rockdrigoma hmmm, that looks fine actually... maybe let it sit overnight and see where it stands at 50k iterations

rockdrigoma commented 4 years ago

> @rockdrigoma hmmm, that looks fine actually... maybe let it sit overnight and see where it stands at 50k iterations

What's the meaning of each metric?

lucidrains commented 4 years ago

@rockdrigoma the way it works in GANs is that there are two AIs. One AI learns to generate images, and the other learns to determine whether an image is fake (generated) or real. They are pitted against each other, and at some point they both become really good. The numbers let you monitor whether they stay evenly balanced. If one becomes too powerful, both stop learning
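For illustration only, here is a minimal sketch of that alternating training loop in PyTorch. It uses toy fully-connected networks and a generic non-saturating logistic loss, not the repo's actual StyleGAN2 generator, U-Net discriminator, or hinge losses; the names `G`, `D`, `latent_dim`, and `image_dim` are made up for the example. The two returned values play the role of the G: and D: columns in the training log.

```python
# Minimal sketch of an alternating GAN training step (assumed toy setup,
# not the losses or architectures used by unet-stylegan2).
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, image_dim = 64, 28 * 28

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, image_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU(),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images):
    batch = real_images.size(0)

    # Discriminator step: push real logits up, fake logits down.
    z = torch.randn(batch, latent_dim)
    fake = G(z).detach()  # detach so the generator isn't updated here
    d_loss = F.softplus(-D(real_images)).mean() + F.softplus(D(fake)).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to fool the (now fixed) discriminator.
    z = torch.randn(batch, latent_dim)
    g_loss = F.softplus(-D(G(z))).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # These two values are what the G:/D: log columns track; if one collapses
    # toward 0 or blows up, the two networks are out of balance.
    return g_loss.item(), d_loss.item()

# Example with random "real" data:
g, d = train_step(torch.rand(8, image_dim) * 2 - 1)
print(f"G: {g:.2f} | D: {d:.2f}")
```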

lucidrains commented 4 years ago

@rockdrigoma you want to see both numbers above 0, but also below 20

lucidrains commented 4 years ago

@rockdrigoma how did it go?

rockdrigoma commented 4 years ago

> @rockdrigoma how did it go?

Epoch 32 1% 598/118000 [35:19<90:35:00, 2.78s/it]G: 2.40 | D: 4.00 | GP: 0.06 | PL: 1.17 | CR: 0.00

mr.jpg

32-mr

lucidrains commented 4 years ago

ohh darn :( could you try it without the augmentation probability? this framework actually already includes augmentation intrinsically (cutmix), and perhaps the differentiable random crops are making it too difficult for the generator
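For reference, here is a rough standalone sketch of the CutMix-style box mixing mentioned above, just to show the idea of pasting a rectangle of fake pixels into real images. In the repo itself this is tied to the U-Net discriminator's per-pixel output and a consistency-regularization loss (the CR column in the log), which is not reproduced here; the function and shapes below are assumptions for illustration.

```python
# Sketch of CutMix-style mixing between real and generated images
# (illustrative only; not the repo's exact implementation).
import torch

def cutmix(real, fake):
    """Paste a random rectangle of `fake` into `real`; return the mixed batch
    and a per-pixel mask (1 = real pixel, 0 = fake pixel)."""
    b, c, h, w = real.shape
    lam = torch.rand(1).item()                       # fraction of area to replace
    cut_h, cut_w = int(h * lam ** 0.5), int(w * lam ** 0.5)
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    mask = torch.ones(b, 1, h, w)
    mask[:, :, y1:y2, x1:x2] = 0
    mixed = real * mask + fake * (1 - mask)
    return mixed, mask

real = torch.rand(4, 3, 64, 64)
fake = torch.rand(4, 3, 64, 64)
mixed, mask = cutmix(real, fake)
print(mixed.shape, mask.mean().item())
```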

lucidrains commented 4 years ago

the numbers do look good though! perhaps we just aren't being patient, but i suspect the extra augmentation is making it too difficult

matigekunstintelligentie commented 4 years ago

default<data/>: 8%|█▏ | 11650/150000 [8:20:38<90:07:32, 2.35s/it] G: -338.38 | D: 82127.60 | GP: 75818352640.00 | PL: 0.00 | CR: 19274924032.00
default<data/>: 8%|█▏ | 11700/150000 [8:22:37<89:36:36, 2.33s/it] G: 3379234816.00 | D: 35068168.00 | GP: 31833282560.00 | PL: 0.00 | CR: 13639467512365056.00

These values seem extremely large. Should I just be patient, or is something going wrong? I used the configuration provided in this repository, except that I increased the resolution from 128x128 to 256x256 and lowered the batch size to 2 (to fit on my GPU). The latest output looks like this:

11

jtremback commented 3 years ago

Here's my experience: On a dataset of 3,200 gemstones, lightweight-gan and stylegan2 both produce recognizable, but not great results after 12 epochs.

For example, stylegan2 after 13 epochs:

13

Unet-stylegan2 with --aug-prob 0.25 --top-k-training produces stuff like this:

12

After removing --aug-prob 0.25 --top-k-training and training a few more epochs, I get stuff like this:

16

It seems like unet-stylegan2 indeed does not work well with a small amount of data even with augmentation, and removing augmentation makes it worse.

JinshuChen commented 3 years ago

> default<data/>: 8%|█▏ | 11650/150000 [8:20:38<90:07:32, 2.35s/it] G: -338.38 | D: 82127.60 | GP: 75818352640.00 | PL: 0.00 | CR: 19274924032.00
> default<data/>: 8%|█▏ | 11700/150000 [8:22:37<89:36:36, 2.33s/it] G: 3379234816.00 | D: 35068168.00 | GP: 31833282560.00 | PL: 0.00 | CR: 13639467512365056.00
>
> These values seem extremely large. Should I just be patient, or is something going wrong? I used the configuration provided in this repository, except that I increased the resolution from 128x128 to 256x256 and lowered the batch size to 2 (to fit on my GPU). The latest output looks like this: 11

I am facing much the same problem. After just a few iterations the losses reach: d: 19807393792.000; g: 5749469.500; r1: 103350984704.000. Any ideas? TAT

sandrodevdariani commented 1 year ago

Do you guys have any updates on this issue?