Why doesn't the unconditional predicted noise use the ReferenceNet features? This may create a gap between training and inference: during training we only drop out the hidden states, not the ReferenceNet features.
However, in practice we noticed that enabling classifier-free guidance on the ReferenceNet attention performs better than leaving it off: the generated video has better color. Can anyone explain this?
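To make the question concrete, here is a toy numeric sketch of the two CFG variants being compared. This is not the repo's actual code: `toy_unet`, `cfg`, and the linear stand-in for the denoiser are made up for illustration. It shows one algebraic consequence of excluding the reference features from the unconditional branch: their contribution then gets multiplied by the guidance scale instead of appearing once, which might relate to the color difference observed.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_unet(latents, text_emb, ref_features=None):
    # Stand-in for the denoising UNet: a simple linear function so the
    # sketch is runnable. A real UNet would inject ref_features through
    # the ReferenceNet attention layers instead of adding them.
    out = latents + 0.1 * text_emb
    if ref_features is not None:
        out = out + 0.1 * ref_features
    return out

def cfg(latents, text_emb, ref_features, scale=7.5, ref_in_uncond=True):
    # Conditional branch always sees the ReferenceNet features.
    cond = toy_unet(latents, text_emb, ref_features)
    # The point in question: does the unconditional branch see them too?
    uncond_ref = ref_features if ref_in_uncond else None
    uncond = toy_unet(latents, np.zeros_like(text_emb), uncond_ref)
    # Standard classifier-free guidance combination.
    return uncond + scale * (cond - uncond)

latents = rng.standard_normal(4)
text_emb = rng.standard_normal(4)
ref = rng.standard_normal(4)

with_ref = cfg(latents, text_emb, ref, ref_in_uncond=True)
without_ref = cfg(latents, text_emb, ref, ref_in_uncond=False)

# With ref in both branches, the reference term cancels inside
# (cond - uncond) and appears once; without it, the term is scaled
# by the guidance weight, shifting the output toward the reference.
```

In this linear toy, `without_ref - with_ref` works out to exactly `(scale - 1) * 0.1 * ref`, i.e. the reference contribution is amplified by the guidance scale when it is absent from the unconditional branch. Whether that amplification is what improves color in the real model is exactly the open question here.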
Thanks.