Thank you for building this project! I work at a company called Sieve, and this project is part of what inspired us to build our Dubbing API. It's a bit different from this one: it covers the dubbing side of things and supports voice cloning, multiple voice engines, and higher-quality translations using other closed-source solutions, but it's a good example of where the bounds of this tech are today.
I'd love to contribute our learnings to this project in some way. I think the most challenging parts of the lipsync problem are (1) the output quality and (2) supporting multiple speakers, i.e. figuring out whose face to sync onto.
Curious whether we could contribute some of our work around this to the project, or whether there are already improvements in mind for supporting multiple speakers with DINet? I'd also love feedback on the lipsync that's integrated into our application today (video retalking based), and would be happy to contribute on multi-speaker support if there is community interest.
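To make the multi-speaker point concrete, here's a rough sketch of one way to decide which face to sync onto: correlate each face track's mouth motion with the audio's voice activity and pick the best-matching track. This is a hypothetical illustration, not our production pipeline and not part of DINet's API; the function name and inputs are assumptions for the sake of the example.

```python
import numpy as np

def pick_speaker_track(mouth_openness_per_track, voice_activity, min_score=0.2):
    """Pick the face track most likely to be the active speaker.

    mouth_openness_per_track: dict of track_id -> per-frame mouth-openness values
                              (e.g. lip distance from facial landmarks).
    voice_activity: per-frame speech probability from a VAD, aligned to the same frames.
    Returns the best track_id, or None if no track correlates well with the audio.
    """
    voice = np.asarray(voice_activity, dtype=float)
    best_track, best_score = None, min_score
    for track_id, openness in mouth_openness_per_track.items():
        openness = np.asarray(openness, dtype=float)
        # Frame-to-frame change in mouth openness as a proxy for "this mouth is moving"
        motion = np.abs(np.diff(openness, prepend=openness[0]))
        if motion.std() < 1e-6 or voice.std() < 1e-6:
            continue  # a static mouth (or a silent clip) can't be scored
        # Pearson correlation between mouth motion and speech activity
        score = np.corrcoef(motion, voice)[0, 1]
        if score > best_score:
            best_track, best_score = track_id, score
    return best_track
```

In practice you'd likely score this per segment (e.g. per diarized utterance) rather than per clip, so the synced face can change as speakers take turns.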