Open oisilener1982 opened 2 weeks ago
Thank you for your interest! You can check some details about the audio driven control in the supplementary materials of our paper, where we have included the relevant experimental results. @oisilener1982
Is it available right now, or if not, is there an estimated release date? I'm having fun with LivePortrait. It is so fast, unlike other projects.
I scanned the PDF paper and I can't find audio-driven control of the face like in SadTalker or Hedra, where we just input an image and audio and then generate a talking avatar.
Please build an audio-driven talking avatar.
You could use an audio-to-3DMM result as the driver, or another lip-sync tool.
The experimental results can be found in Appendix C of the paper.
Due to some limitations, we are sorry that we are unable to provide this model. But you can follow the description in Appendix C to train an audio-driven model yourself :) @nitinmukesh @oisilener1982
I am just an ordinary user :( I only learned something by following tutorials on YouTube (newgenai). I might just subscribe to Hedra and combine it with SadTalker, but it would be nice if there were a talking avatar here, because this project is really fast. Even faster than SadTalker.
Is it just not available for now, or is there really no possibility of having a talking head like SadTalker or Hedra?
Or will this be another project? For reference, the paper says:

"C. Audio-driven Portrait Animation. We can easily extend our video-driven model to audio-driven portrait animation by regressing or generating motions, including expression deformations and head poses, from audio inputs. For instance, we use Whisper [58] to encode audio into sequential features and adopt a transformer-based framework, following FaceFormer [59], to autoregress the motions."
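The Appendix C recipe quoted above can be sketched structurally: audio features (Whisper-style, here just random placeholders) condition a transformer decoder that autoregresses one frame of motion (expression deformation plus head pose) at a time, in the FaceFormer style. This is a minimal PyTorch sketch under assumed dimensions and names; it is not the paper's actual model or training code.

```python
import torch
import torch.nn as nn

AUDIO_DIM = 384      # assumed width of Whisper-style audio features
MOTION_DIM = 63 + 6  # assumed: 21 3D keypoint deltas + head pose params

class AudioToMotion(nn.Module):
    """Hypothetical FaceFormer-style autoregressive audio-to-motion model."""

    def __init__(self, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.audio_proj = nn.Linear(AUDIO_DIM, d_model)   # audio -> memory
        self.motion_proj = nn.Linear(MOTION_DIM, d_model)  # past motions -> tokens
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.head = nn.Linear(d_model, MOTION_DIM)         # token -> motion frame

    @torch.no_grad()
    def generate(self, audio_feats):
        # audio_feats: (B, T, AUDIO_DIM) -> predicted motions (B, T, MOTION_DIM)
        B, T, _ = audio_feats.shape
        memory = self.audio_proj(audio_feats)
        motions = [torch.zeros(B, 1, MOTION_DIM)]  # neutral start frame
        for _ in range(T):
            tgt = self.motion_proj(torch.cat(motions, dim=1))
            # causal mask so each frame only attends to earlier frames
            mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
            out = self.decoder(tgt, memory, tgt_mask=mask)
            motions.append(self.head(out[:, -1:]))  # next-frame motion
        return torch.cat(motions[1:], dim=1)

model = AudioToMotion()
feats = torch.randn(1, 25, AUDIO_DIM)  # ~1 s of audio at 25 fps (placeholder)
pred = model.generate(feats)
print(pred.shape)  # torch.Size([1, 25, 69])
```

The predicted per-frame motions would then replace the motions LivePortrait normally extracts from a driving video; the actual feature widths, motion parameterization, and losses would have to follow the paper's Appendix C.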
Where?
You could use my repo LipSick, or DINet might be better for this, or wait for an expressive audio-to-3DMM model like Media2Face.
Is Media2Face real-time or near real-time like LivePortrait? If so, we can build the pipeline much more easily.
I can't remember; that paper was a while ago. The issue is finding a good one with a license that fits what you need it for, but generally they are fast.
Was CodeTalker the SOTA earlier on audio-to-3DMM? https://github.com/Doubiiu/CodeTalker We may try that too. Ultimately I'm waiting for something like VASA-1.
I've kept a distant eye on 3DMMs, watched the project demos, and starred every one I found, but only as of today am I looking at what's available with a good license. I'm still keeping an eye on emotional lip-sync papers as drivers; they just don't seem to have good enough audio-to-lip fidelity. Are you in my Discord inbox, Tony? I see you did the Replicate for LipSick; we might be doing the same thing here, so we should talk just in case. Discord: Inferencer. I have sourced an audio model with multi-language support to drive LivePortrait, but it uses HuBERT, which has a bad license. I don't like DeepSpeech either; it's OK for American male spoken words but not much else. https://github.com/user-attachments/assets/70f9ff50-8105-4d29-99c7-62b0b31f46af
Just wondering if there is any hope of this project being used to create an audio-driven talking avatar. I'm having fun with this project, but it would be nice to have talking heads.