Feature request
Please consider implementing support for Meta's open-source Massively Multilingual Speech (MMS), which offers speech recognition and speech generation for over 1,000 languages at a drastically reduced error rate compared to OpenAI's Whisper. As GPT4All is the most accessible local LLM/AI installer, adding speech transcription and text-to-speech would be a real boon for many users.
https://github.com/facebookresearch/fairseq/tree/main/examples/mms
https://ai.facebook.com/blog/multilingual-model-speech-recognition/
Motivation
MMS is better than OpenAI's Whisper in important ways: its word error rate is less than half of Whisper's. At the same time, MMS can identify about 4,000 languages and output speech in over 1,000.
This opens up open, private, and local use cases in many areas, such as voice-based interaction with GPT4All, interview transcription, language learning, and increased accessibility.
Your contribution
I can help with testing.