Accelerated Version of infer_audio2vid.py with Poor Results After Adjustments

Hello,

I noticed that there is an official accelerated version of infer_audio2vid_pose.py, but not of infer_audio2vid.py. The acceleration appears to come mainly from adjusting the number of steps and the CFG hyperparameters. I tried to apply the same changes from infer_audio2vid_pose to infer_audio2vid and loaded the accelerated (acc) model, but the results were unsatisfactory.
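
To make the CFG hyperparameter concrete, here is a minimal sketch of how a guidance scale typically enters a single denoising step under classifier-free guidance. It is an illustration only: the function name `apply_cfg` and the numbers are hypothetical and are not taken from the EchoMimic code.

```python
def apply_cfg(noise_uncond, noise_cond, guidance_scale):
    """Combine unconditional and conditional noise predictions.

    Standard classifier-free guidance formula:
        guided = uncond + scale * (cond - uncond)
    """
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]


# Toy predictions standing in for the model outputs (illustrative values only).
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

# A scale of 1.0 reduces to the conditional prediction.
print(apply_cfg(uncond, cond, 1.0))
# A larger scale pushes the result further along the (cond - uncond) direction.
print(apply_cfg(uncond, cond, 2.5))
```

The number of steps is the other knob: it sets how many times a combination like this is evaluated during sampling, so the two values are usually tuned together for a given checkpoint.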

Could you please provide some insight into why this might be happening? Are there specific considerations or additional modifications required for accelerating infer_audio2vid that are not covered by simply adjusting steps and CFG parameters?

Any guidance on how to properly accelerate infer_audio2vid while maintaining good performance would be greatly appreciated.

Thank you!

BadToBest / EchoMimic, issue #78