ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License
32.91k stars, 3.29k forks

SeamlessM4T: high quality text-to-speech, speech-to-text and speech-to-speech multilingual #1476

Open nortekax opened 7 months ago

nortekax commented 7 months ago

SeamlessM4T is a great, high-quality model supporting text-to-speech, speech-to-text, and speech-to-speech.

It also supports multilingual translation in all of these modes. Would it be possible to create a GGUF model and add support for it to whisper.cpp? You can try it here: https://huggingface.co/spaces/facebook/seamless_m4t

Alumniminium commented 7 months ago

I can't really agree with the 'high quality' part. All my tests on that Hugging Face Space ended with the model repeating itself over and over again, and the voices do not sound good either.

nortekax commented 7 months ago

@Alumniminium, I did some more testing and also noticed some problems with SeamlessM4T. I was looking for better local text-to-speech and found this very good solution. Please try it and see what you think:

https://github.com/rhasspy/piper

aehlke commented 6 months ago

@Alumniminium What about with v2?

yayoimizuha commented 6 months ago

There seems to be an official ggml implementation: https://github.com/facebookresearch/seamless_communication/tree/main/ggml