Implement Massively Multilingual Speech - Meta's open speech model with speech recognition and TTS in over 1000 languages

Feature request

Please consider implementing Meta's open source Massively Multilingual Speech (MMS), which supports speech recognition and speech generation in over 1000 languages and, according to Meta, has a much lower error rate than OpenAI's Whisper. As GPT4All is the most accessible local LLM/AI installer, adding speech transcription and text-to-speech would be a real boon for many.

https://github.com/facebookresearch/fairseq/tree/main/examples/mms

https://ai.facebook.com/blog/multilingual-model-speech-recognition/

Motivation

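MMS is better than OpenAI's Whisper in important ways. According to Meta's announcement (linked above), its word error rate is less than half that of Whisper. At the same time, MMS can identify about 4000 spoken languages and can transcribe and output speech in over 1000 languages.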

This opens up open, private and local uses in many areas, such as voice-based interaction with GPT4All, interview transcription, language learning, and increased accessibility.

Your contribution

Testing
