MRzzm / DINet

The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."
895 stars, 167 forks

AI Dubbing API (with multi-speaker lipsync) #119

Open mvoodarla opened 2 weeks ago

mvoodarla commented 2 weeks ago

Thank you for building this project! I work at a company called Sieve, and this project is part of what inspired us to build our Dubbing API. Our API is a bit different: it covers the dubbing side of things, with support for voice cloning, different voice engines, and higher-quality translations using other closed-source solutions. Still, it's an example of how far this tech can go today.

I'd love to contribute our learnings to this project in some way. I think the most challenging parts of the lipsync problem are (1) the output quality and (2) supporting multiple speakers and figuring out which one to sync onto.

Could we contribute some of our work on this to the project, or are there improvements already planned to support multiple speakers with DINet? I'd also love feedback on the lipsync integrated into our application today (based on video-retalking), and we would be glad to contribute multi-speaker support if there is community interest.

tailangjun commented 2 weeks ago

Really looking forward to this.