Closed: wingjoezhou closed this issue 2 months ago
Before proceeding, please ensure you have reviewed the notes section in our repository: https://github.com/TMElyralab/MuseTalk?tab=readme-ov-file#note. It contains important information that will be beneficial for your understanding.
For online chatting, MuseTalk operates in real-time by utilizing only the UNet and the VAE decoder. These components require 32ms/frame when run on an NVIDIA Tesla V100.
The VAE-encoded latents are pre-saved, so the computation before line #L90 of the inference script can be disregarded. You can refer to it here: https://github.com/TMElyralab/MuseTalk/blob/main/scripts/inference.py#L90
The mask_image depends only on the original image, so it can also be computed in advance to reduce per-frame computation time. More details can be found here: https://github.com/TMElyralab/MuseTalk/blob/main/musetalk/utils/blending.py#L54

Please feel free to reach out if you have any further questions or concerns.
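The split described above (encode latents and masks offline once, then run only the UNet and VAE decoder per audio chunk) can be sketched as follows. This is a minimal illustration of the caching pattern, not MuseTalk's actual API: the function names `vae_encode`, `unet_step`, and `vae_decode` are hypothetical NumPy stand-ins for the real models.

```python
import numpy as np

# Hypothetical stand-ins for the real VAE encoder, UNet, and VAE decoder.
def vae_encode(frame):
    # A real VAE encoder maps an image to a latent tensor.
    return frame.mean(axis=-1, keepdims=True)

def unet_step(latent, audio_feature):
    # The real UNet conditions the latent on an audio feature.
    return latent + audio_feature

def vae_decode(latent):
    # The real decoder maps the latent back to image space.
    return np.repeat(latent, 3, axis=-1)

# --- Offline phase: done once per avatar, before chatting starts ---
frames = [np.random.rand(64, 64, 3) for _ in range(4)]  # reference frames
latents = [vae_encode(f) for f in frames]               # pre-saved latents
masks = [f[..., :1] > 0.5 for f in frames]              # pre-computed masks

# --- Online phase: per audio chunk, only UNet + VAE decoder run ---
def generate(audio_features):
    out = []
    for i, a in enumerate(audio_features):
        k = i % len(frames)
        latent = unet_step(latents[k], a)       # ~UNet forward pass
        img = vae_decode(latent)                # ~VAE decoder pass
        # Blend the generated face into the original frame with the cached mask.
        out.append(np.where(masks[k], img, frames[k]))
    return out

clips = generate([0.1, 0.2, 0.3])
```

The point of the structure is that everything in the offline phase is amortized, so the per-frame latency during chatting is dominated by the two passes inside `generate` (the quoted 32 ms/frame on a V100).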
Are there plans to release a real-time inference example?
The real-time inference code example has been updated: https://github.com/TMElyralab/MuseTalk?tab=readme-ov-file#new-real-time-inference
On a 4090 GPU, driving one conversion with a 15-second audio clip currently takes about 2 minutes.