yasinyazici / EMA_GAN


help! #1

Open SeeU1119 opened 4 years ago

SeeU1119 commented 4 years ago

Sorry, since I am not familiar with the Chainer framework, I don't understand your code very well. This trick applies an EMA operation to the network parameters. I see that you create three generators but only update one of them (gen); the other two (gen_ema and gen_ma) just (soft-)copy and average gen's parameters. In the test phase, you use gen_ema or gen_ma to generate the fake images. Is that right? Could you help me? Thanks!

yasinyazici commented 4 years ago

Hi @osakaka. Yes, you got it right. There are three generators. gen is the usual generator in GANs, which takes feedback from the discriminator. g_ema and g_ma, on the other hand, do not directly take feedback from the discriminator; they are updated by keeping a type of average of gen's parameters. In the test phase, we use g_ema and g_ma to visualize and evaluate the model.
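For readers who, like the asker, are not familiar with Chainer, here is a minimal framework-agnostic Python sketch of the scheme described above. The toy parameter dict, the decay value, and the dummy update step are illustrative assumptions, not the repository's exact code:

```python
import copy

def ema_update(gen_params, ema_params, decay=0.999):
    # Exponential moving average of parameters:
    # theta_ema <- decay * theta_ema + (1 - decay) * theta_gen
    for name in gen_params:
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * gen_params[name]

# Toy "parameters": a dict of floats stands in for the weight tensors.
gen = {"w": 0.0}
g_ema = copy.deepcopy(gen)   # the averaged copy starts from gen's weights

for step in range(100):
    gen["w"] += 0.1          # stand-in for a gradient step on gen only
    ema_update(gen, g_ema)   # g_ema never sees the discriminator

print(gen["w"], g_ema["w"])  # g_ema lags behind gen, smoothing its trajectory
```

At test time one would generate samples with the averaged copy (g_ema here) rather than with gen itself.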

SeeU1119 commented 4 years ago

Thanks, I will try it!

Crane-YU commented 4 years ago

Hi @yasinyazici, does it mean that you do not actually train the other two generators, "g_ema" and "g_ma"? What you actually do is train the basic generator and then update the parameters of the other two networks from the basic generator's trained parameters, using the Adam optimizer? Could you please verify the statement above. Thank you.

yasinyazici commented 4 years ago

Hi @Crane-YU. Yes, you update the generator as usual (with a gradient optimizer), while g_ma and g_ema are updated according to Eq. 1 and Eq. 2 (check the paper), respectively. So we do not "train" g_ma and g_ema with gradient optimizers (SGD, Adam, etc.), but with averaging techniques.
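Concretely, the two averaging rules can be sketched as below. This is a hedged reading of the reply: I am assuming Eq. 1 denotes a uniform running (moving) average and Eq. 2 the exponential moving average with decay beta, as is standard for those names; `theta` stands in for gen's current parameters:

```python
def ma_update(theta, theta_ma, t):
    # Uniform running average over training iterates (in the sense
    # assumed here for Eq. 1), for iteration t = 1, 2, ...:
    # theta_ma <- theta_ma + (theta - theta_ma) / t
    for k in theta:
        theta_ma[k] += (theta[k] - theta_ma[k]) / t

def ema_update(theta, theta_ema, beta=0.999):
    # Exponential moving average (in the sense assumed here for Eq. 2):
    # theta_ema <- beta * theta_ema + (1 - beta) * theta
    for k in theta:
        theta_ema[k] = beta * theta_ema[k] + (1.0 - beta) * theta[k]
```

Neither rule involves gradients: each is a cheap parameter-space average applied after every ordinary optimizer step on gen.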

Crane-YU commented 4 years ago

Thank you so much.