Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs: https://synclabs.so

Is there any way to make this work with a 3D avatar object? #473

Open PrashantSaikia opened 1 year ago

PrashantSaikia commented 1 year ago

This is not an issue per se, but more of a feature request. I have made it work with 2D images of people by converting the images into static videos of the same length as the target audio. Now, I want to make it work with 3D "images" of people (let's call it a 3D avatar). Usually the avatar is made in Unity, Blender, or Unreal Engine, and it comes in the form of an object file. Specifically, in my case, it is a .glb object. Any idea how I can get it to work with this?
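For reference, the 2D workflow described above (looping a static image for the audio's duration to produce a video Wav2Lip can consume) can be sketched with ffmpeg. This is a minimal sketch; the filenames and frame rate are assumptions, not part of the repo.

```python
import shlex

def build_still_to_video_cmd(image_path, audio_path, out_path, fps=25):
    """Build an ffmpeg command that loops a single image for the
    duration of the audio track; -shortest stops encoding when the
    shorter stream (the audio) ends."""
    return [
        "ffmpeg", "-y",
        "-loop", "1",           # repeat the single input image indefinitely
        "-i", image_path,
        "-i", audio_path,
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",  # widely compatible pixel format
        "-r", str(fps),
        "-shortest",            # end when the audio ends
        out_path,
    ]

# Hypothetical filenames for illustration:
cmd = build_still_to_video_cmd("face.jpg", "speech.wav", "face_video.mp4")
print(shlex.join(cmd))
```

The resulting .mp4 can then be passed to Wav2Lip's inference script as the face video.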

alessandroperetti commented 1 year ago

Following the thread for tips.

fz5400 commented 1 year ago

I recently found a new model called SadTalker; the author has just added a 3D face generation function, so maybe it will meet your needs.

PrashantSaikia commented 1 year ago

@fz5400 Thanks for sharing this repo. I think the results look awesome! Generally, from what I've seen so far, animations produced by ML/AI algos always contain some artefacts (weird head movements, blur, etc). But the demos in SadTalker look much better, with the fewest artefacts I've seen so far.

But I wonder how the frontend avatar was built in, for example, https://huggingface.co/spaces/CVPR/ml-talking-face. You can see that even when you enable hand movements there, they appear kinda natural, without weird artefacts. Any idea how to create animations like that?

alessandroperetti commented 1 year ago

I have found VOCA. As output you can get a video, or a mesh sequence based on the FLAME .ply template. There is also a ready-to-use VOCA Blender plugin.
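If you render the VOCA/Blender mesh sequence to numbered image frames, muxing them back into a video (with the driving audio) is a standard ffmpeg job. A minimal sketch; the frame pattern and filenames below are assumptions for illustration:

```python
import shlex

def build_frames_to_video_cmd(frame_pattern, audio_path, out_path, fps=25):
    """Build an ffmpeg command that encodes a numbered frame sequence
    (e.g. frames/frame_0001.png, frames/frame_0002.png, ...) into an
    H.264 video with the original audio muxed in."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),
        "-i", frame_pattern,    # printf-style pattern, e.g. "frames/frame_%04d.png"
        "-i", audio_path,
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",  # widely compatible pixel format
        "-shortest",            # stop at the end of the shorter stream
        out_path,
    ]

# Hypothetical paths for illustration:
cmd = build_frames_to_video_cmd("frames/frame_%04d.png", "speech.wav", "avatar.mp4")
print(shlex.join(cmd))
```

The encoded video could then serve as the "face" input for Wav2Lip or simply be used directly, depending on whether you want Wav2Lip to refine the lip region.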