HumanAIGC / AnimateAnyone

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
Apache License 2.0

Why inject the Reference Image information into the Pose Sequence's Denoising UNet via Spatial-Attention and Cross-Attention, rather than the other way around? #46

Open hxypqr opened 8 months ago

hxypqr commented 8 months ago

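Is there a mathematical explanation for why encoding the Reference Image information into the Pose Sequence's Denoising UNet through Spatial-Attention and Cross-Attention works better than doing it the other way around?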
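
For context, here is a rough sketch of the Spatial-Attention injection being asked about, assuming the paper's description: the reference feature map is concatenated with the denoising feature map along the width axis, self-attention runs over the joint map, and only the denoising half is kept. The function names, the NumPy implementation, and the identity q/k/v projections are assumptions made for illustration and are not taken from the repository. The Cross-Attention path is not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # tokens: (n, c). Assumption: identity q/k/v projections to keep the
    # sketch short; a real layer would use learned linear projections.
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[-1])
    return softmax(scores) @ tokens

def spatial_attention_with_reference(x, ref):
    # x:   (h, w, c) feature map from the denoising UNet (hypothetical shape)
    # ref: (h, w, c) feature map from the reference branch
    h, w, c = x.shape
    joint = np.concatenate([x, ref], axis=1)        # (h, 2w, c), joined along width
    out = self_attention(joint.reshape(h * 2 * w, c))
    out = out.reshape(h, 2 * w, c)
    return out[:, :w, :]                            # keep only the denoising half

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 8))
ref = rng.standard_normal((4, 4, 8))
print(spatial_attention_with_reference(x, ref).shape)
```

In this form the denoising features act as the stream being updated while the reference features only supply extra keys and values, which is the asymmetry the question is about.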

fenghe12 commented 5 months ago

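I don't think there is any mathematical principle behind it. It simply works better in practice, and for now there is no good way to explain it mathematically.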