BadToBest / EchoMimic

Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
https://badtobest.github.io/echomimic.html
Apache License 2.0
2.68k stars, 315 forks

Accelerated Version of infer_audio2vid.py with Poor Results After Adjustments #78

Closed: pia-ai closed this issue 2 months ago

pia-ai commented 2 months ago

Hello,

I've noticed that there is an official accelerated version of infer_audio2vid_pose.py, but not of infer_audio2vid.py. The acceleration seems to be achieved primarily by adjusting the step count and CFG hyperparameters. I tried to replicate the approach used in infer_audio2vid_pose.py by modifying infer_audio2vid.py in the same way and loading the accelerated (acc) model version. However, the results were unsatisfactory.
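To make the role of the two hyperparameters concrete, here is a minimal sketch of why lowering the step count and the CFG scale speeds up sampling. It uses the standard classifier-free guidance formula and is not code from this repository. The function names and the example values (30 steps with CFG 2.5 versus 6 steps with CFG 1.0) are assumptions chosen for illustration, not settings read from the scripts.

```python
def guided_noise(eps_uncond, eps_cond, cfg_scale):
    """Standard classifier-free guidance: push the unconditional
    prediction toward the conditional one by a factor of cfg_scale."""
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)


def denoiser_calls(num_steps, cfg_scale):
    """Count denoiser forward passes for one sampling run.

    With cfg_scale == 1.0 the guided prediction equals the conditional
    prediction, so a sampler may skip the unconditional pass entirely.
    (Assumption: the sampler actually implements that shortcut.)
    """
    passes_per_step = 1 if cfg_scale == 1.0 else 2
    return num_steps * passes_per_step


# Hypothetical settings, for illustration only.
baseline_calls = denoiser_calls(num_steps=30, cfg_scale=2.5)
accelerated_calls = denoiser_calls(num_steps=6, cfg_scale=1.0)
print(baseline_calls, accelerated_calls)
```

Under these assumed values the accelerated run needs a tenth of the forward passes. The sketch only covers the cost side; it says nothing about whether a given checkpoint produces good output at those settings.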

Could you please provide some insight into why this might be happening? Are there specific considerations or additional modifications required for accelerating infer_audio2vid that are not covered by simply adjusting steps and CFG parameters?

Any guidance on how to properly accelerate infer_audio2vid while maintaining good performance would be greatly appreciated.

Thank you!

JoeFannie commented 2 months ago

An official accelerated version of infer_audio2vid.py has now been released.