Inference API for MuseTalk with improvements!

gaurangbharti1 commented 1 month ago

Hey guys, really cool work! I'm an engineer at Sieve and we've been working with lip-syncing tech for some time now. We were quite impressed by the capabilities of MuseTalk and thought we'd integrate a version of it in our product!

We have a setup for general lipsync here, built on our infrastructure, which allows interested users to try lipsyncing with a MuseTalk backend via web app, API, and through code. Would love to hear your guys' thoughts on it! We would also love to have a mention of it on the project's README to let interested users know!

I should mention that this is not raw MuseTalk - work has been done to make it more emotive as well as improve the facial restoration using other models, which I think can be quite useful for a lot of users, especially when combined with a robust infra like ours.

Happy to share more details if you'd like. Looking forward to hearing back!

xiankgx commented 1 month ago

@gaurangbharti1,

Hi, great work. Besides face restoration, what other enhancements to MuseTalk have been implemented? In particular, what was done to make it "more emotive"?

xiankgx commented 1 month ago

Also, since you have implemented both video-retalking and MuseTalk and made MuseTalk the default backend model for your API, would you share your experience regarding their performance? Which do you think is better?

hamutama commented 1 month ago

Is your work open-source, or are you just selling your product based on this free software and posting an advertisement?

gaurangbharti1 commented 1 month ago

@xiankgx hey, thanks! with respect to making it more emotive, we realized that the bbox shift had a lot of impact on how things looked, so we made some changes that allow it to be dynamic and computed frame-by-frame rather than static and provided just once during inference. this, when tied to silence detection, gives you much better lip movement that sticks to the audio better. we also blend the restored face with the original output for better naturalness, since doing restoration alone can still cause artifacts to form and things to look out of place.
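The thread does not say how the per-frame bbox shift, the silence detection, or the blending are actually computed, so the following is only a minimal sketch of the general idea. Everything in it is an assumption made for illustration: the function names (`frame_rms`, `per_frame_bbox_shift`, `blend_pixels`), the energy-threshold silence detector, the rule that louder frames get a larger shift, and the fixed blend weight. Frames are plain lists of pixel values to keep it dependency-free; real code would likely use image arrays.

```python
import math
from typing import List, Sequence


def frame_rms(audio: Sequence[float], sample_rate: int, fps: float) -> List[float]:
    """RMS energy of the audio samples that fall under each video frame."""
    samples_per_frame = int(round(sample_rate / fps))
    n_frames = len(audio) // samples_per_frame
    energies = []
    for i in range(n_frames):
        chunk = audio[i * samples_per_frame:(i + 1) * samples_per_frame]
        energies.append(math.sqrt(sum(s * s for s in chunk) / samples_per_frame))
    return energies


def per_frame_bbox_shift(
    rms: Sequence[float],
    base_shift: int = 0,
    max_extra: int = 8,
    silence_threshold: float = 0.01,
) -> List[int]:
    """Hypothetical rule: silent frames keep the base shift, voiced frames
    get an extra shift proportional to their energy relative to the peak."""
    peak = max(rms, default=0.0) or 1.0  # avoid dividing by zero on all-silent audio
    shifts = []
    for energy in rms:
        if energy < silence_threshold:
            shifts.append(base_shift)  # treated as silence
        else:
            shifts.append(base_shift + round(max_extra * energy / peak))
    return shifts


def blend_pixels(
    original: Sequence[int], restored: Sequence[int], alpha: float = 0.6
) -> List[int]:
    """Linear blend of restored and original pixel values (alpha = restored weight)."""
    return [
        min(255, max(0, round(alpha * r + (1 - alpha) * o)))
        for o, r in zip(original, restored)
    ]


# Usage: 10 silent frames followed by 15 voiced frames at 25 fps, 16 kHz audio.
audio = [0.0] * 6400 + [0.5] * 9600
shifts = per_frame_bbox_shift(frame_rms(audio, 16000, 25))
```

Here the silent frames fall back to `base_shift` while voiced frames receive a larger value, which is one simple way a shift could follow the audio instead of staying fixed for the whole clip.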

in our experience with musetalk and retalker, we realized that retalker occasionally suffers from noisier outputs but latches onto the speech better and has decent overall sync, while musetalk does a better job of maintaining facial fidelity and consistency while sporadically having subpar sync. our changes have helped with both of these models but there's still work to be done

gaurangbharti1 commented 1 month ago

@hamutama hey! our current focus is to get this app working the best we can, get feedback and make the improvements as needed while selling this complete lipsync solution, but we plan on open sourcing the backends in the coming months after we fix any major issues.

youtianhong commented 1 month ago

@gaurangbharti1,

Thanks for sharing. Looking forward to your open-source project. May I ask if you have encountered this problem when running inference with MuseTalk? https://github.com/TMElyralab/MuseTalk/issues/147 Does it not support multi-threaded inference?

gaurangbharti1 commented 1 month ago

Hey @youtianhong sorry for the late response. As of now we do not support multi-threaded inference as we also encountered some issues with it. If we do work on it some more and make some fixes I'll be sure to update the repository with that information!

jryebread commented 4 weeks ago

@gaurangbharti1 has the Sieve version of MuseTalk been open sourced yet? any ETA?

TMElyralab / MuseTalk

Inference API for MuseTalk with improvements! #144