alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd3 in position 0: unexpected end of data #190

Status: Closed (bilguun0203 closed this issue 5 years ago)

bilguun0203 commented 5 years ago

I'm testing my Mongolian model, which uses the Cyrillic alphabet. A UnicodeDecodeError exception is raised when the worker sends the final hypotheses to the master server, and the master server then sends a JSON response with an empty utterance.

worker.log

2019-04-28 15:29:40 -   ERROR: tornado.application: Future exception was never retrieved: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 1113, in run
    yielded = self.gen.send(value)
  File "/opt/kaldi-gstreamer-server/kaldigstserver/worker.py", line 262, in _on_word
    self.send(json.dumps(event))
  File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd3 in position 0: unexpected end of data
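
For context: 0xd3 is the lead byte of a two-byte UTF-8 sequence, the range that covers Mongolian Cyrillic letters such as Ө (U+04E8, encoded as 0xd3 0xa8). An "unexpected end of data" error at position 0 therefore suggests that the byte string being serialized contained only the first half of such a character. The sketch below reproduces the same decoder error in isolation. It is written for Python 3, while the server in the traceback runs Python 2.7, and the variable names are illustrative and do not come from worker.py. It also shows, as one possible approach rather than the project's actual fix, that an incremental decoder can buffer a split character until the remaining byte arrives.

```python
import codecs

# A Mongolian Cyrillic letter whose UTF-8 encoding starts with 0xd3.
word = "Ө"                      # U+04E8
encoded = word.encode("utf-8")  # two bytes: 0xd3 0xa8

# Simulate a byte string that was cut in the middle of the character.
truncated = encoded[:1]

try:
    truncated.decode("utf-8")
    error_reason = None
except UnicodeDecodeError as exc:
    # Same failure as in the traceback: lead byte with no continuation byte.
    error_reason = exc.reason
    error_byte = exc.object[exc.start]

# An incremental decoder keeps the incomplete sequence in its buffer
# instead of raising, and emits the character once it is complete.
decoder = codecs.getincrementaldecoder("utf-8")()
first = decoder.decode(encoded[:1])   # nothing decodable yet
second = decoder.decode(encoded[1:])  # character is now complete

print(error_reason, hex(error_byte), repr(first), repr(second))
```

If the worker receives hypothesis text in byte chunks, decoding each chunk separately would fail in exactly this way whenever a chunk boundary falls inside a multi-byte character. Whether that is what happens here is an assumption that the traceback alone does not confirm.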

master.log - Final hyp is empty

INFO 2019-04-28 15:31:21,625 b5c6b5ed-1472-49ef-9c88-a3e8f5e1016c: Receiving event {u'status': 0, u'segment': 0, u'result': {u'hypotheses': [{u'transcript': u'\u0445\u044d\u0440\u0... from worker 
INFO 2019-04-28 15:31:21,649 b5c6b5ed-1472-49ef-9c88-a3e8f5e1016c: Receiving event {u'status': 0, u'segment': 0, u'result': {u'hypotheses': [{u'transcript': u'\u0445\u044d\u0440\u0... from worker 
INFO 2019-04-28 15:31:21,662 b5c6b5ed-1472-49ef-9c88-a3e8f5e1016c: Receiving event {u'status': 0, u'segment': 0, u'result': {u'hypotheses': [{u'transcript': u'\u0445\u044d\u0440\u0... from worker 
INFO 2019-04-28 15:31:21,689 Worker <__main__.WorkerSocketHandler object at 0x7fc986439150> leaving 
INFO 2019-04-28 15:31:21,689 b5c6b5ed-1472-49ef-9c88-a3e8f5e1016c: Receiving 'close' from worker 
INFO 2019-04-28 15:31:21,689 b5c6b5ed-1472-49ef-9c88-a3e8f5e1016c: Final hyp:  
INFO 2019-04-28 15:31:21,690 200 PUT /client/dynamic/recognize (172.17.0.1) 1065.99ms 
INFO 2019-04-28 15:31:21,690 Everything done