ThioJoe / Auto-Synced-Translated-Dubs

Automatically translates the text of a video based on a subtitle file, and then uses AI voice services to create a new dubbed & translated audio track where the speech is synced using the subtitle's timings.
GNU General Public License v3.0
1.56k stars, 156 forks

Facebook Research Seamless - a massive free multilingual machine translation model #81

Open iGerman00 opened 9 months ago

iGerman00 commented 9 months ago

Facebook Research recently released an update to Seamless Communication. Besides general translation and good-quality direct speech-to-speech translation (which does not preserve the tone and sound of the voice) across a large number of languages, it also supports voice cloning through its Expressive model family. Although Expressive claims to support only 4 languages, it doesn't seem to care much what the input language is, so you can at least seemingly 'clone' a voice into those 4 languages: German, Spanish, English and French.

@ThioJoe I believe this would be an amazing addition to this project and would propel it into the stratosphere in terms of quality. Please consider looking into it. Like other recent Facebook Research projects, it supports Hugging Face Transformers, so it should be easy to use from Python, which this project is already written in. It also seems fast enough that speed wouldn't be a dealbreaker, and it might be able to replace entire parts and modules of this project.

https://github.com/facebookresearch/seamless_communication
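
To make the "easy to use in Python" point concrete, here is a rough sketch of what text translation through Transformers might look like. It is not taken from this project: `make_seamless_translator` and `translate_lines` are hypothetical helper names, the checkpoint `facebook/seamless-m4t-v2-large` and the three-letter language codes (`eng`, `deu`) are assumptions, and it needs a Transformers version that ships `SeamlessM4Tv2Model`. It covers text-to-text translation only and assumes nothing about how the Expressive voice-cloning models are loaded.

```python
from typing import Callable, List

# A translator takes (text, src_lang, tgt_lang) and returns translated text.
Translator = Callable[[str, str, str], str]


def make_seamless_translator(
    model_id: str = "facebook/seamless-m4t-v2-large",  # assumed checkpoint
) -> Translator:
    """Build a text-to-text translator backed by SeamlessM4T v2.

    The heavy imports and the model download happen only when this
    function is called, so the rest of the module works without them.
    """
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    processor = AutoProcessor.from_pretrained(model_id)
    model = SeamlessM4Tv2Model.from_pretrained(model_id)

    def translate(text: str, src_lang: str, tgt_lang: str) -> str:
        inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")
        # generate_speech=False requests text tokens instead of a waveform.
        tokens = model.generate(**inputs, tgt_lang=tgt_lang, generate_speech=False)
        return processor.decode(tokens[0].tolist()[0], skip_special_tokens=True)

    return translate


def translate_lines(
    lines: List[str], src_lang: str, tgt_lang: str, translate: Translator
) -> List[str]:
    """Translate subtitle lines one at a time.

    Blank lines are passed through unchanged so the output has the same
    number of lines as the input and can be matched back to the timings.
    """
    return [
        translate(line, src_lang, tgt_lang) if line.strip() else line
        for line in lines
    ]
```

Usage would be something like `translate_lines(subtitle_lines, "eng", "deu", make_seamless_translator())`. Passing the translator in as a function keeps the model swappable, so an existing translation backend could be left in place and compared against Seamless.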