alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

Language detection #201

Open charlez-700 opened 1 year ago

charlez-700 commented 1 year ago

Hey there, I'm working on a small project to transcribe audio to text. I've found Vosk to be a great tool and I've been using it for a while.

Now I'm facing a problem: the audio I receive may be spoken in either English or Portuguese. My first approach was to transcribe each audio file with both language models and keep the result with the higher average confidence over the whole audio. However, for Portuguese audio that is not completely clear, Vosk sometimes reports a higher confidence for the English transcription than for the Portuguese one.
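
For reference, the comparison described above can be sketched roughly as follows. This is only an illustration, not code from the project: it assumes each recognizer was created with word output enabled (`SetWords(True)`), so that every result JSON carries a `result` list with a per-word `conf` value. The helper names `average_confidence` and `pick_language` are hypothetical.

```python
import json
from statistics import mean


def average_confidence(result_jsons):
    """Average the per-word "conf" values over a list of result JSON strings.

    Assumes the Vosk-style shape {"result": [{"conf": ..., "word": ...}], "text": ...}.
    Chunks without a "result" list (e.g. silence) contribute no words.
    """
    confs = []
    for raw in result_jsons:
        parsed = json.loads(raw)
        for word in parsed.get("result", []):
            confs.append(word["conf"])
    # No recognized words at all: treat as zero confidence.
    return mean(confs) if confs else 0.0


def pick_language(results_by_lang):
    """Return (best_language, scores) where scores maps language -> average confidence."""
    scores = {
        lang: average_confidence(chunks)
        for lang, chunks in results_by_lang.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Here `results_by_lang` would map a language tag such as `"en"` or `"pt"` to the list of JSON strings collected from that model's recognizer for the same audio. Because the confidence values come from two different models, they are not necessarily on a comparable scale, which is consistent with the mismatch described above.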

Is there a better way to decide which transcription to keep?

Thanks.