rsxdalv / tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
https://rsxdalv.github.io/tts-generation-webui/
MIT License
1.46k stars 160 forks

SeamlessM4T: using audio files for translation. Can you support it? #282

Open curui opened 3 months ago

curui commented 3 months ago

[Image: QQ screenshot 20240310024054] Translate other languages via audio files, like this!

rsxdalv commented 3 months ago

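Hi, please try the newest update; I added a basic implementation for testing. Not all of the languages in the list are supported, and I don't have S2TT (speech-to-text translation), T2TT (text-to-text translation), or ASR (automatic speech recognition) yet. So far I have only added the S2ST (speech-to-speech translation) functionality.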

https://github.com/rsxdalv/tts-generation-webui/pull/284
[Image: 127 0 0 1_7860_ (1)]

curui commented 3 months ago

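"Hello, please try the latest update; I added a basic test implementation. Not all of the languages in the list are supported, and I don't have S2TT, T2TT, or ASR yet; I only added the S2ST functionality."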


Thank you! You can refer to https://replicate.com/adirik/seamless-expressive as an example. It can recognize speech and translate it while keeping the same timbre, much like HeyGen, and it seems to use Seamless. For example, upload audio of an American speaker and it produces Chinese speech with the same accent and tone.

rsxdalv commented 3 months ago

Interesting, the results on Replicate seem better than what I have seen myself so far. Thank you! This gives me more to research.