Question: using audio and landmarks together as driving conditions

BadToBest / EchoMimic

Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

https://badtobest.github.io/echomimic.html

Apache License 2.0

2.26k stars 263 forks

Closed (gobigrassland closed this issue 1 month ago)

gobigrassland commented 1 month ago

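Does this paper support using audio and a mouth landmark sequence together as conditions to drive video generation? My concern is that the mouth motion implied by the audio may not match the provided mouth landmark sequence, which could make the mouth motion in the synthesized video chaotic.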

JoeFannie commented 1 month ago

If both are provided as input, the pose driving will dominate.