alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

Fix detect EOF #97

Closed duhow closed 3 years ago

duhow commented 3 years ago

Because the message content is binary, the EOF check needs to compare against a binary (bytes) string.
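For context, here is a minimal sketch of why the comparison type matters in Python 3, where a `bytes` object never compares equal to a `str`. The `is_eof` helper and the `EOF_TEXT` constant are hypothetical names used for illustration; they are not taken from the server code.

```python
EOF_TEXT = '{"eof" : 1}'  # the EOF marker used in this thread


def is_eof(message):
    """Return True if the message is the EOF marker, whether it arrived
    as text (str) or as binary data (bytes)."""
    if isinstance(message, bytes):
        return message == EOF_TEXT.encode("utf-8")
    return message == EOF_TEXT


# A bytes object never equals a str, even with identical content.
print(b'{"eof" : 1}' == '{"eof" : 1}')                # False
print(is_eof(b'{"eof" : 1}'), is_eof('{"eof" : 1}'))  # True True
```

A check written against only one of the two types silently fails for clients that send the other.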

nshmyrev commented 3 years ago

Sorry, didn't it work fine before? Do you see anything wrong with the original version?

duhow commented 3 years ago

I'm just running this command, and I had to make this change for it to work. Otherwise the server keeps returning partial results and never finishes.

(cat audio.wav; sleep 1; printf '{"eof" : 1}') | websocat --no-close --binary ws://192.168.10.8:2700 | jq -r 'select(.text) | .text'

Doing tests with Docker image alphacep/kaldi-es

nshmyrev commented 3 years ago

I do not think this is a proper way to use the server. Please check the Python client to see how to decode a file properly.

duhow commented 3 years ago

The thing is, I'm using other programs, not Python, just to send the audio stream to the Vosk Server. Once I've sent the full content, I send the EOF string, but the Vosk Server keeps the connection open instead of calling FinalResult.

nshmyrev commented 3 years ago

The audio stream must be properly chunked, and you need to call result as soon as you meet silence. You cannot send the audio as a whole.
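As a rough illustration of the chunking described here, the sketch below splits an audio buffer into fixed-size binary messages and ends with the EOF marker. The `iter_messages` name and the 8000-byte chunk size are assumptions made for this example, and the actual WebSocket sending is left out. Sending the EOF marker as text is also an assumption, based on the text frame (opcode=1) that appears in the debug log later in this thread.

```python
def iter_messages(audio: bytes, chunk_size: int = 8000):
    """Yield the audio as fixed-size binary chunks, then the EOF marker.

    chunk_size is an illustrative value, not one prescribed by the server.
    """
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
    # Sent as text here; a client in binary mode would send bytes instead.
    yield '{"eof" : 1}'


# 20000 bytes of silence stand in for real audio data.
messages = list(iter_messages(b"\x00" * 20000))
print([type(m).__name__ for m in messages])
```

Each yielded item would be sent as its own WebSocket message, with the client reading the server's reply after each one.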

duhow commented 3 years ago

Based on other client-samples that are hosted in here, looks like it works that way? https://github.com/alphacep/vosk-server/blob/master/client-samples/php/asr-test.php

nshmyrev commented 3 years ago

Based on other client-samples that are hosted in here, looks like it works that way?

Feel free to submit another patch to fix it ;)

duhow commented 3 years ago

Tested again with the current Docker image alphacep/kaldi-vosk-server:11c6cd9871f93ab880a9aad928bef644fa4ceefe55554b0ed5f14eb530f521f3 (May 30) and I cannot reproduce the problem anymore. The patch does indeed seem to cause a different error:

DEBUG:websockets.protocol:server > Frame(fin=True, opcode=1, data=b'{\n  "partial" : ""\n}', rsv1=False, rsv2=False, rsv3=False)
DEBUG:websockets.protocol:server < Frame(fin=True, opcode=1, data=b'{"eof" : 1}', rsv1=False, rsv2=False, rsv3=False)
ERROR:websockets.server:Error in connection handler
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/websockets/server.py", line 169, in handler
    yield from self.ws_handler(self, path)
  File "/asr_server.py", line 80, in recognize
    response, stop = await loop.run_in_executor(pool, process_chunk, rec, message)
  File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/asr_server.py", line 20, in process_chunk
    elif rec.AcceptWaveform(message):
  File "/usr/local/lib/python3.7/dist-packages/vosk-0.3.29-py3.7.egg/vosk/__init__.py", line 64, in AcceptWaveform
    return _c.vosk_recognizer_accept_waveform(self._handle, data, len(data))
TypeError: initializer for ctype 'char *' must be a bytes or list or tuple, not str
DEBUG:websockets.protocol:server ! failing WebSocket connection in the OPEN state: 1011 [no reason]
DEBUG:websockets.protocol:server - state = CLOSING
DEBUG:websockets.protocol:server > Frame(fin=True, opcode=8, data=b'\x03\xf3', rsv1=False, rsv2=False, rsv3=False)
DEBUG:websockets.protocol:server - event = connection_lost(None)
DEBUG:websockets.protocol:server - state = CLOSED
DEBUG:websockets.protocol:server x code = 1006, reason = [no reason]
DEBUG:websockets.protocol:server x half-closing TCP connection
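
Reading the log above: the EOF marker arrived in a text frame (opcode=1), so the handler received it as a `str`. It then reached `AcceptWaveform`, which accepts only bytes, hence the `TypeError`. One possible way to tolerate both text and binary clients is sketched below. This is not the server's actual code: `FakeRecognizer` is a hypothetical stand-in so the example runs without Vosk, and the `Result` and `PartialResult` method names are assumed here alongside the `AcceptWaveform` and `FinalResult` names mentioned in the thread.

```python
class FakeRecognizer:
    """Hypothetical stand-in for the recognizer; not the Vosk API itself."""

    def AcceptWaveform(self, data):
        # Mirror the failure in the traceback: only bytes are accepted.
        if not isinstance(data, bytes):
            raise TypeError("audio data must be bytes, not str")
        return len(data) == 0  # pretend an empty chunk ends an utterance

    def Result(self):
        return '{"text" : ""}'

    def PartialResult(self):
        return '{"partial" : ""}'

    def FinalResult(self):
        return '{"text" : "final"}'


EOF_BYTES = b'{"eof" : 1}'


def process_chunk(rec, message):
    """Return (response, stop) for one WebSocket message."""
    # Text frames arrive as str, binary frames as bytes: normalise first.
    if isinstance(message, str):
        message = message.encode("utf-8")
    if message == EOF_BYTES:
        return rec.FinalResult(), True
    if rec.AcceptWaveform(message):
        return rec.Result(), False
    return rec.PartialResult(), False


rec = FakeRecognizer()
print(process_chunk(rec, '{"eof" : 1}'))   # EOF sent as text
print(process_chunk(rec, b'{"eof" : 1}'))  # EOF sent as binary
```

With this normalisation, both the Python-style text EOF and the `websocat --binary` EOF end the session in the same way.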