Doubiiu / CodeTalker

[CVPR 2023] CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
MIT License

Real-time use? #37

Closed tony-wolff closed 1 year ago

tony-wolff commented 1 year ago

Hello! The work is impressive! I wonder whether it would be feasible to use it with TTS audio generated in real time to produce realistic facial animation on a 3D face model in Unity.

Doubiiu commented 1 year ago

Hi, we didn't test the model in a real-time setting. In principle, the model first obtains the global context of the audio (about 4 s in our experiments) and then autoregressively synthesizes the 3D facial animation. That means performance may drop if you only provide a small window of audio in a real-time setting. This could be a limitation and needs to be explored further. Previous works, e.g. VOCA and MeshTalk, may be more suitable for real-time applications, as they use small audio windows in their methods.
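For readers who want to experiment with this trade-off, here is a minimal sketch of one possible workaround: buffer incoming TTS audio in a rolling window of roughly 4 s and only hand a window to the model once it is full. This is an illustration, not something tested with CodeTalker. The 16 kHz sample rate, the 0.5 s chunk size, and the names `RollingAudioContext` and `stream_windows` are all assumptions, and the model call itself is left out.

```python
from collections import deque

SAMPLE_RATE = 16_000      # assumed audio sample rate in Hz
CONTEXT_SECONDS = 4.0     # context length mentioned in the reply above
HOP_SECONDS = 0.5         # hypothetical: how often a new TTS chunk arrives


class RollingAudioContext:
    """Keeps only the most recent `context_seconds` of audio samples."""

    def __init__(self, sample_rate=SAMPLE_RATE, context_seconds=CONTEXT_SECONDS):
        self.capacity = int(sample_rate * context_seconds)
        # A bounded deque drops the oldest samples automatically.
        self.buffer = deque(maxlen=self.capacity)

    def push(self, chunk):
        """Append new samples; the oldest fall out once the buffer is full."""
        self.buffer.extend(chunk)

    def is_full(self):
        return len(self.buffer) == self.capacity

    def window(self):
        """Return the current context as a list, oldest sample first."""
        return list(self.buffer)


def stream_windows(chunks, context):
    """Yield a full context window after each chunk, once enough audio exists."""
    for chunk in chunks:
        context.push(chunk)
        if context.is_full():
            # A real pipeline would run the animation model on this window here.
            yield context.window()


if __name__ == "__main__":
    hop = int(SAMPLE_RATE * HOP_SECONDS)
    # Ten chunks of silent placeholder audio, 0.5 s each (5 s in total).
    chunks = [[0.0] * hop for _ in range(10)]
    windows = list(stream_windows(chunks, RollingAudioContext()))
    print(len(windows), len(windows[0]))
```

Under these assumptions the first window only becomes available after about 4 s of audio has arrived, which is the startup latency a real-time setup would have to accept. Shrinking `CONTEXT_SECONDS` reduces that delay but gives the model less context, which is the quality drop described above.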