@melody-rain we reproduced the results several times in different experiments using this code. Can you provide a more detailed bug report? Thanks in advance.
@bes-dev hi. I just trained mobile-stylegan_ffhq-512 (512 is the size of the generated images) with your code. The teacher model is the PyTorch one converted from the official TensorFlow model; nothing else was changed. During training the loss sometimes blows up to values like loss=3.18e+04.
BTW, after I added gradient_clip_val=1.0 to pl.Trainer(), the problem no longer occurs and I can successfully train mobile-stylegan_ffhq-512.
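For reference, that is roughly the only change, shown here against a toy LightningModule rather than the repo's actual distiller module (the model and data below are placeholders, not the real training code):

```python
import torch
from torch import nn
import pytorch_lightning as pl


class ToyModule(pl.LightningModule):
    """Placeholder for the repo's distiller LightningModule; only the Trainer flag matters here."""

    def __init__(self):
        super().__init__()
        self.net = nn.Linear(512, 512)

    def training_step(self, batch, batch_idx):
        loss = self.net(batch).pow(2).mean()
        self.log("loss", loss, prog_bar=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# gradient_clip_val=1.0 makes Lightning clip the global gradient norm to 1.0
# before every optimizer step, which is what stopped the loss from exploding for me.
trainer = pl.Trainer(max_epochs=1, gradient_clip_val=1.0)
trainer.fit(ToyModule(), torch.utils.data.DataLoader(torch.randn(64, 512), batch_size=8))
```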
Many thanks for your work.
Did you reproduce the desired results?
In my training, the "kid_val" value decreases at the beginning, but it starts increasing after several iterations and the log says "kid_val was not in top True".
Many thanks. Waiting for your reply.
@melody-rain @bes-dev
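For context, I believe the "was not in top ..." line just comes from a verbose ModelCheckpoint callback monitoring kid_val, roughly like the sketch below (the exact arguments are my guess, not necessarily the repo's config), so the message itself may not be the problem, only the fact that KID stops improving:

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# Sketch of a checkpoint callback monitoring the KID metric.
# In verbose mode, ModelCheckpoint prints "<monitor> was not in top <save_top_k>"
# whenever the metric does not improve on the saved best value.
checkpoint_cb = ModelCheckpoint(
    monitor="kid_val",  # validation metric logged by the training module
    mode="min",         # lower KID is better
    save_top_k=1,       # keep only the best checkpoint
    verbose=True,
)
# pass it to the trainer: pl.Trainer(callbacks=[checkpoint_cb], ...)
```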
@bes-dev Can you give an overview of your experimental environment? For example: CUDA version, PyTorch version, GCC version.
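Something like this prints the versions I mean (a small standalone helper, not part of the repo):

```python
import platform
import torch

# Standalone helper to report the environment details asked about above.
print("python:", platform.python_version())
print("torch :", torch.__version__)
print("cuda  :", torch.version.cuda)  # CUDA version this torch build was compiled against
print("cudnn :", torch.backends.cudnn.version())
print("gpu   :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
```

Running `python -m torch.utils.collect_env` should give an even fuller report, including the GCC version.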
When I use CUDA 10 the loss easily goes to NaN, but when I switch to CUDA 11 it is fine. @ZYzhouya
However, I used CUDA 11 + PyTorch 1.7 + GCC 5.4 + Ubuntu 16.04 with your code and config file on 4x 2080 Ti. After training for 5 days it still can't reach your reported quality. Do you have any advice?
I trained with this repo, but the loss became huge very soon, something like loss=3.18e+04.
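In case it helps with debugging: before restarting a long run, I would turn on PyTorch's anomaly detection and assert that the loss stays finite, roughly like this (a sketch, not code from this repo):

```python
import torch

# Anomaly detection raises as soon as a backward pass produces NaN gradients,
# pointing at the op responsible instead of letting the loss silently blow up
# to values like 3e+04.
torch.autograd.set_detect_anomaly(True)


def check_finite(name: str, tensor: torch.Tensor) -> None:
    """Fail fast with a readable message if a tensor contains NaN/Inf."""
    if not torch.isfinite(tensor).all():
        raise RuntimeError(f"{name} is not finite: min={tensor.min()}, max={tensor.max()}")


# e.g. call check_finite("loss", loss) inside training_step before returning it,
# and combine with gradient_clip_val as suggested earlier in this thread.
```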