SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

SeamlessM4T Support #7465

Closed sharadagg closed 9 months ago

sharadagg commented 9 months ago
> Both NLLB versions work for me... I had to uninstall the "Faster Whisper" Python version (but that's okay as I use Purfview's Faster Whisper anyway).

Incidentally, https://winstxnhdw-nllb-api.hf.space/api/v2/translate still works for me in the previous beta.

OK, so it's up again. I've added it again, but it's probably better to use a local version.

The latest beta now has an "Auto start web server" option to improve the user experience.

Link: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip

Super! Absolutely amazing @niksedk

I have been experimenting with SeamlessM4T, which opens up transcription and translation in many more languages, plus speech generation. A public model is hosted here with HTTP API access: https://replicate.com/cjwbw/seamless_communication/api

Anyone can use it for speech-to-text, text-to-text translation, and text-to-speech. This would cover the whole cycle for us: we could potentially let a user create fully dubbed audio through Subtitle Edit :)

The user would just need to plug in their replicate.com access key. They can switch to a paid account if they run out of their prediction quota on the free account.
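To make the access-key idea concrete, here is a rough sketch of how a client could build a prediction request for Replicate's HTTP API. It only constructs the request and does not send it. The endpoint and the `Bearer` authorization header follow Replicate's public API documentation as I understand it. The function name, the version placeholder, and the `input` field names (`task`, `text`, `source_language`, `target_language`) are assumptions for illustration; the real input schema is on the model's API page linked above.

```python
import json
import urllib.request

# Replicate's endpoint for creating a prediction (per its public HTTP API docs).
REPLICATE_PREDICTIONS_URL = "https://api.replicate.com/v1/predictions"


def build_translation_request(api_token, model_version, text, source_lang, target_lang):
    """Build (but do not send) a POST request asking Replicate to run a model.

    api_token is the user's own replicate.com access key.
    model_version is the model version ID copied from the model page.
    """
    payload = {
        "version": model_version,
        # ASSUMPTION: these input field names are hypothetical placeholders.
        # Check the model's API page for the actual schema before using this.
        "input": {
            "task": "text-to-text translation",
            "text": text,
            "source_language": source_lang,
            "target_language": target_lang,
        },
    }
    return urllib.request.Request(
        REPLICATE_PREDICTIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder token and version ID; nothing is sent over the network.
    request = build_translation_request(
        "YOUR_REPLICATE_TOKEN", "MODEL_VERSION_ID", "Hello world", "eng", "fra"
    )
    print(request.get_method(), request.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would submit it. Keeping the token as a parameter means each user supplies their own key, so quota and billing stay on their account.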

Originally posted by @sharadagg in https://github.com/SubtitleEdit/subtitleedit/issues/7457#issuecomment-1743606714

niksedk commented 9 months ago

How does the quality of the automatic text translation compare to Google Translate? How does the audio-to-text compare to Whisper?

I tried the Docker version, but it failed after downloading about 10 GB of data... Does it require an Nvidia GPU? How did you run it?

What would you mainly want from SeamlessM4T? Text translation, audio-to-text, or something else?

niksedk commented 9 months ago

I was unable to get SeamlessM4T up and running properly via Docker, but perhaps a future version will work. (I could start a task, but the prediction seemed to run forever.)