alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Translation Function and Recognizing Uncommon Voice Types #1239

Open ehobson opened 1 year ago

ehobson commented 1 year ago

I just wanted to comment that Vosk is absolutely exceptional. The transcriptions from Whisper were highly inaccurate, but Vosk is very accurate.

Two issues. First, Whisper also has a function to translate speech. Could Vosk implement this? That would make its offline capabilities all the more astounding.

Second, I am trying to transcribe a female Russian-language speaker, and the poor woman seems to confound every speech recognition AI. Even Vosk struggles with her. She has a lower-range voice (approximately 165 Hz), and it seems that most AIs aren't used to deeper female voices. Is training the AI to recognize less common vocal types something the Vosk team can look into?

nshmyrev commented 1 year ago

We have no plans to add translation, sorry. It is a whole different area.

You are welcome to provide audio samples so that we can check the accuracy issues.