alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

C#; webserver; Chinese recognition is not accurate #190

Open CXUI opened 2 years ago

CXUI commented 2 years ago

After setting up the server according to the documentation, I ran recognition on a Chinese audio file. The server returns different results for the same audio file depending on the client: the Python test is correct, the C# test is not.

C# test results:
Result { "result" : [{ "conf" : 0.277180, "end" : 4.860000, "start" : 3.840000, "word" : "洗洗" }], "text" : "洗洗" }
Result {"text": ""}

Python test results:
{ "result" : [{ "conf" : 1.000000, "end" : 1.260000, "start" : 0.660000, "word" : "今天" }, { "conf" : 1.000000, "end" : 1.890000, "start" : 1.260000, "word" : "晚上" }, { "conf" : 1.000000, "end" : 1.920000 + 0.300000 - 0.000000 == 2.220000 and 2.220000 or 2.220000, "start" : 1.920000, "word" : "吃" }, { "conf" : 1.000000, "end" : 2.670000, "start" : 2.220000, "word" : "什么" }], "text" : "今天 晚上 吃 什么" }

nshmyrev commented 2 years ago

The sample rate seems wrong. The server expects 8 kHz audio; for any other rate you have to send a config message first.
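As a rough illustration of the config message mentioned above, here is a minimal sketch of the message sequence a client could send for one WAV file. It assumes the JSON shapes used by the vosk-server Python example client (`{"config": {"sample_rate": ...}}` before the audio and `{"eof": 1}` at the end); check them against your server version. The names `vosk_messages` and `make_silent_wav` are hypothetical helpers, not part of Vosk, and the sketch uses only the standard library so it does not open a real connection.

```python
import io
import json
import wave


def vosk_messages(wav_bytes, chunk_frames=4000):
    """Yield the messages a client would send for one WAV file, in order.

    Text messages are yielded as str, audio chunks as bytes.
    (Hypothetical helper; message shapes are an assumption, see above.)
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        # Tell the server the real sample rate before any audio is sent.
        yield json.dumps({"config": {"sample_rate": wf.getframerate()}})
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            yield data
    # Signal end of stream so the server can return the final result.
    yield json.dumps({"eof": 1})


def make_silent_wav(sample_rate, seconds=1):
    """Build a mono 16-bit PCM WAV in memory (hypothetical demo helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * sample_rate * seconds)
    return buf.getvalue()


# Demo: a 16 kHz file produces a config message carrying 16000.
messages = list(vosk_messages(make_silent_wav(16000)))
```

In a real client, each `str` would go out as a WebSocket text frame and each `bytes` chunk as a binary frame. A C# client would need to send the same config message before the first audio chunk, with the sample rate read from the file it is actually streaming.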