alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

Why does transcription have a significant drop in quality while running over docker? #227

Closed: Deuos closed this issue 1 year ago

Deuos commented 1 year ago

While using the small English model, version 0.15, I noticed that transcription is more accurate when I download the model and run it locally than when I pass the audio over WebSockets to the Docker container. Why does this happen? Is it because of WebSockets or the image itself?

nshmyrev commented 1 year ago

It is hard to guess. The results should actually be identical.

Deuos commented 1 year ago

OK, I redid the tests using the official test16k.wav and the results were identical, so the issue may have been with my own audio examples for some reason. One question: the sample rate must be 16000, right? And the bit rate doesn't matter, correct?

nshmyrev commented 1 year ago

Yes, the sample rate is usually 16000. The bitrate follows from it: at 16 bits per sample, mono, it is 16000 × 16 = 256 kbps.
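For anyone who wants to check their own files, here is a minimal sketch that reads a WAV header with Python's standard `wave` module and applies the arithmetic above. The function names `pcm_bitrate_bps` and `describe_wav` are made up for illustration and are not part of vosk-server. It assumes uncompressed PCM WAV input.

```python
import wave


def pcm_bitrate_bps(sample_rate, bits_per_sample, channels=1):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate * bits_per_sample * channels


def describe_wav(source):
    """Read the header of a PCM WAV file (path or binary file object).

    Hypothetical helper for illustration; returns the fields that
    determine the bitrate, plus the bitrate itself.
    """
    with wave.open(source, "rb") as wf:
        rate = wf.getframerate()          # samples per second, e.g. 16000
        bits = wf.getsampwidth() * 8      # sample width is reported in bytes
        channels = wf.getnchannels()      # 1 for mono
    return {
        "sample_rate": rate,
        "bits_per_sample": bits,
        "channels": channels,
        "bitrate_bps": pcm_bitrate_bps(rate, bits, channels),
    }
```

Calling `describe_wav("test16k.wav")` on a 16 kHz, 16-bit mono file should report a bitrate of 256000 bits per second. A different sample rate or channel count in your own recordings would show up here.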

Deuos commented 1 year ago

OK, thank you, that explains it.