williamyang1991 / VToonify

[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

Why use g_ema to save model? #45

Closed · undcloud closed this issue 1 year ago

undcloud commented 1 year ago

Hello sir, I am new here. While reading the code I ran into a problem; I have thought about it a hundred times but got nowhere:

g_ema is used to generate the image pairs, and it should be frozen:
https://github.com/williamyang1991/VToonify/blob/6154ac0ec309ff76a626461b1e72b230033c9ca4/train_vtoonify_d.py#L238

and generator is used to generate the fake images, and it should not be frozen:
https://github.com/williamyang1991/VToonify/blob/6154ac0ec309ff76a626461b1e72b230033c9ca4/train_vtoonify_d.py#L297

Question1:
In the end we should get the weights of generator, so why does the code save g_ema's weights instead?
https://github.com/williamyang1991/VToonify/blob/6154ac0ec309ff76a626461b1e72b230033c9ca4/train_vtoonify_d.py#L387

Question2:
What is the effect of the function "accumulate"? Does it change g_ema's weights? Why does it change them?

thank you~

williamyang1991 commented 1 year ago

EMA (Exponential Moving Average) is a common trick in deep learning; see https://leimao.github.io/blog/Exponential-Moving-Average/
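
For context, the accumulate helper in this repo follows the common stylegan2-pytorch pattern; a minimal sketch of that pattern (assumed, not necessarily the exact code here) is:

```python
import torch

def accumulate(model1, model2, decay=0.999):
    """Update model1 in place as an exponential moving average of model2.

    model1 (e.g. g_ema) receives no gradients; each call nudges its weights
    toward the current weights of model2 (e.g. the generator being trained).
    """
    par1 = dict(model1.named_parameters())
    par2 = dict(model2.named_parameters())
    for k in par1.keys():
        # EMA update: par1 = decay * par1 + (1 - decay) * par2
        par1[k].data.mul_(decay).add_(par2[k].data, alpha=1 - decay)
```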

undcloud commented 1 year ago

thank you very much


undcloud commented 1 year ago

OK, I see:

accumulate(g_ema.encoder, g_module.encoder, accum)

g_ema.encoder's weights are just a running copy of g_module.encoder's weights: g_ema is used to generate images, while g_module is the one actually trained in VToonify. Right?

torch.save(
    {
        "g_ema": g_ema.state_dict(),
    },

It really confused me over the weekend.
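
Saving g_ema is also what lets inference skip the training generator entirely; a hypothetical loading sketch (import path and checkpoint filename assumed) would be:

```python
import torch
from model.vtoonify import VToonify  # model class from this repo (assumed import path)

# Hypothetical: at inference time, only the saved EMA weights are restored.
vtoonify = VToonify(backbone='dualstylegan')
ckpt = torch.load("checkpoint/vtoonify_d.pt", map_location="cpu")
vtoonify.load_state_dict(ckpt["g_ema"])  # the raw training weights are never needed
vtoonify.eval()
```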

williamyang1991 commented 1 year ago
accumulate(g_ema.encoder, g_module.encoder, accum)

$EMA_t = \alpha \Theta_t + (1-\alpha) EMA_{t-1}$
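
So g_ema tracks a smoothed running average of the training weights rather than a snapshot; a tiny numeric sketch of the update above (the alpha value is illustrative):

```python
# Iterating EMA_t = alpha * theta_t + (1 - alpha) * EMA_{t-1}:
alpha = 0.1
ema = 0.0
for theta in [1.0, 1.0, 1.0, 1.0]:  # successive raw weights
    ema = alpha * theta + (1 - alpha) * ema
    print(round(ema, 3))            # 0.1, 0.19, 0.271, 0.344 -> drifts smoothly toward 1.0
```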

undcloud commented 1 year ago

I get it, thanks a lot! 👍👍👍