Misalignment between motion module and AnimateDiff

fudan-generative-vision / hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

MIT License


Open Nyquist0 opened 3 weeks ago

Nyquist0 commented 3 weeks ago

Dear Sir or Madam,

Great work, and thanks for sharing. I would like to ask about a mismatch I found between your motion module and AnimateDiff.

For AnimateDiff, the feature should be reshaped to (b×h×w) × f × c, as the following figure shows.

But in your code here, I found that the feature is reshaped to (b×f) × (h×w) × c.
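
To make the difference between the two layouts concrete, here is a minimal NumPy sketch. It is not taken from either codebase: the input layout (b, c, f, h, w), the sizes, and the variable names are all assumptions chosen for illustration, and NumPy stands in for the PyTorch/einops calls a real model would use.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
b, c, f, h, w = 2, 8, 4, 3, 5

# Assumed input layout: (batch, channels, frames, height, width).
x = np.arange(b * c * f * h * w, dtype=np.float32).reshape(b, c, f, h, w)

# Layout 1, (b*h*w) x f x c: every spatial location becomes a batch entry,
# so the sequence axis (axis 1) runs over the f frames.
temporal = x.transpose(0, 3, 4, 2, 1).reshape(b * h * w, f, c)

# Layout 2, (b*f) x (h*w) x c: every frame becomes a batch entry,
# so the sequence axis (axis 1) runs over the h*w spatial locations.
spatial = x.transpose(0, 2, 3, 4, 1).reshape(b * f, h * w, c)

print(temporal.shape)  # (30, 4, 8)
print(spatial.shape)   # (8, 15, 8)
```

An attention layer applied along axis 1 would mix information across frames with the first layout and across spatial positions within a single frame with the second, which is why the choice of reshape matters for a motion module.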

Is there anything I missed? I look forward to your reply. Thanks.

zypsjtu commented 1 week ago