MooreThreads / Moore-AnimateAnyone

Character Animation (AnimateAnyone, Face Reenactment)
Apache License 2.0
3.05k stars 235 forks

Inconsistency of classifier-free guidance between training and testing. #109

Open TianpengBu opened 5 months ago

TianpengBu commented 5 months ago

Hi authors, great work! My question about the implementation is as follows:

During training, I found that you randomly set 20% of the CLIP inputs to zero tensors: (image)

However, during testing, you concatenate the CLIP embedding output with zero tensors, like this: (image)

As far as I can tell, to align training and testing, shouldn't we randomly set 20% of the CLIP outputs to zero tensors, rather than the inputs to the CLIP model?
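
To make the mismatch concrete, here is a minimal sketch in plain Python. It is not the repository's code: `toy_encoder`, `drop_input`, and `drop_output` are hypothetical names, and the encoder is a toy affine map standing in for the CLIP image encoder. It only illustrates that zeroing the encoder input and zeroing the encoder output can give different unconditional embeddings.

```python
import random

def toy_encoder(pixels, weight=0.5, bias=0.1):
    """Hypothetical stand-in for the CLIP image encoder: an affine map with a bias."""
    return [weight * p + bias for p in pixels]

def drop_input(pixels, drop_prob, rng):
    """Training-style dropout as described above: zero the encoder INPUT."""
    if rng.random() < drop_prob:
        pixels = [0.0] * len(pixels)
    return toy_encoder(pixels)

def drop_output(pixels, drop_prob, rng):
    """Proposed alternative: zero the encoder OUTPUT, which matches a
    zero tensor used as the unconditional embedding at test time."""
    emb = toy_encoder(pixels)
    if rng.random() < drop_prob:
        emb = [0.0] * len(emb)
    return emb

rng = random.Random(0)
image = [1.0, 2.0, 3.0]

# drop_prob=1.0 forces the unconditional branch so both variants can be compared.
uncond_from_input = drop_input(image, 1.0, rng)
uncond_from_output = drop_output(image, 1.0, rng)
test_time_uncond = [0.0] * len(image)

print(uncond_from_input)   # encoder(0), which here equals the bias
print(uncond_from_output)  # exact zeros
```

In this toy setup only `drop_output` reproduces the all-zero embedding used at test time; with `drop_input`, the model is trained on `encoder(0)` as its unconditional signal instead. In practice `drop_prob` would be 0.2, per the 20% figure above.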

jiangzhengkai commented 5 months ago

@TianpengBu I totally agree with you.

abcdvzz commented 3 weeks ago

If the input is zero, the output should also be zero, right?
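
Not necessarily. A zero input maps to a zero output only if every layer is bias-free and has no learned additive terms. As a rough illustration in plain Python (a toy dense layer with made-up weights, not the CLIP encoder itself), compare a layer with and without a bias:

```python
def linear(x, weights, bias=None):
    """Dense layer y = W x (+ b) on plain Python lists."""
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out

# Arbitrary illustrative parameters (assumptions, not taken from any real model).
W = [[0.2, -0.4], [0.7, 0.1]]
b = [0.05, -0.3]
zero_input = [0.0, 0.0]

no_bias_out = linear(zero_input, W)       # stays at zero
with_bias_out = linear(zero_input, W, b)  # equals the bias, so it is non-zero

print(no_bias_out)
print(with_bias_out)
```

A CLIP vision transformer adds a class token and position embeddings and uses biased layers, so an all-zero image would generally be expected to produce a non-zero embedding. That is the likely source of the training/testing gap raised in this issue.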