alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

Error in from_file: Expecting value: line 1 column 1 (char 0) #245

Open lionnet1981 opened 4 months ago

lionnet1981 commented 4 months ago

Could someone please check my code to see if it's correct? I've tried multiple approaches, but I'm still encountering errors. I think I'm not familiar with the Vosk WebSocket format, which is why the error persists.

My dialplan:

[outgoing]
exten => 100,1,Answer()
 same => n,MixMonitor(/tmp/${UNIQUEID}.wav,r)
 same => n,BackGroundDetect(welcome-8khz,1000,500)
 same => n,BackGroundDetect(silence/5,1000,500)
 same => n,BackGroundDetect(did_not_understand-8khz,1000,500)
 same => n,BackGroundDetect(silence/5,1000,500)
 same => n,BackGroundDetect(did_not_understand-8khz,1000,500)
 same => n,WaitExten(2)

exten => talk,1,Noop(Talk detection completed....)
 same => n,StopMixMonitor()
 same => n,System(sox /tmp/${UNIQUEID}.wav /tmp/${UNIQUEID}-1.wav silence 1 0.1 1%)
 same => n,AGI(speech_to_text.py,/tmp/${UNIQUEID}-1)
 same => n,Noop(The Intent : ${intent} EntityName : ${entity_name} EntityValue : ${entity_value})
 same => n,System(rm /tmp/${UNIQUEID}.wav)
 same => n,System(rm /tmp/${UNIQUEID}-1.wav)
 same => n,Gotoif($[${LEN(${intent})}>0]?goto_intent:goto_again)
 same => n(goto_intent),Goto(${intent},s,1)
 same => n(goto_again),Goto(${CONTEXT},${exten},1)

exten => t,1,Noop(Time out happened...)
 same => n,System(rm /tmp/${UNIQUEID}.wav)
 same => n,Goto(${CONTEXT},${exten},1)
 same => n,Hangup()

[book_appointment]
exten => s,1,Answer()
 same => n,MixMonitor(/tmp/${UNIQUEID}.wav,r)
 same => n,BackGroundDetect(surely_i_can_help-8khz,1000,500)
 same => n,BackGroundDetect(silence/5,1000,500)
 same => n,BackGroundDetect(did_not_understand_general-8khz,1000,500)
 same => n,BackGroundDetect(silence/5,1000,500)
 same => n,BackGroundDetect(did_not_understand_general-8khz,1000,500)
 same => n,WaitExten(2)

my speech_to_text.py:

#!/usr/bin/env python3

import sys
import os
import json
import time
import requests
from asterisk.agi import *
from websocket import create_connection
from pydub import AudioSegment
import base64
import constants as ct
import traceback

AUDIO_FD = 3
CONTENT_TYPE = 'audio/l16; rate=8000; channels=1'
ACCEPT = 'audio/pcm'

agi = AGI()

agi.verbose("Entering into Python AGI...")
agi.answer()
file_name = sys.argv[1]+'.wav'

agi.verbose('Recording FileName is %s' % file_name)

def process_chunk(agi, ws, buf):
    try:
        agi.verbose("Processing chunk")
        agi.verbose("Data type: " + str(type(buf)))  # Debug line
        ws.send_binary(buf)
        time.sleep(0.1)  # wait for 100 milliseconds
        response = ws.recv()
        while not response:  # wait for a non-empty response
            response = ws.recv()
        result = json.loads(response)
        return result
    except Exception as e:
        agi.verbose('Error in process_chunk: ' + str(e))
        return None

def from_file(file_name):
    try:
        ws = create_connection("ws://localhost:2700")
        ws.send('{ "config" : { "sample_rate" : 8000 } }')
        agi.verbose("Connection created")
        result = ''
        with open(file_name, 'rb') as f:
            while True:
                data = f.read(8000)
                if len(data) == 0:
                    break
                result = process_chunk(agi, ws, data)
        ws.send("EOS")
        final_result = json.loads(ws.recv())
        ws.close()
        if 'text' in final_result:
            return final_result['text']
        else:
            return ''
    except Exception as e:
        agi.verbose('Error in from_file: ' + str(e))
        return None

# Classify intent and entity with Rasa query

def get_intent(data):
    try:
        import requests
        import json
        url = "http://"+ct.RASA_HOST+"/model/parse"
        intent = requests.post(url, data = '{"text":"'+data+'"}')
        intent = json.loads(intent.text)
        intent_name = intent['intent']['name']
        if len(intent['entities'])>0:
            data = intent['entities']
            entity_name = data[0]['entity']
            entity_value = data[0]['value']
        else:
            entity_name = 'none'
            entity_value = 'none'
        return intent_name+':'+entity_name+':'+entity_value
    except Exception as e:
        agi.verbose('Error in get_intent: ' + str(e))
        return None

try:
    data = from_file(file_name)
    if data is not None:
        agi.verbose('The Detected Data: %s' %data)
        data = get_intent(data)
        if data is not None:
            data = data.split(':')
            intent_name = data[0]
            entity_name = data[1]
            entity_value = data[2]
            agi.set_variable('intent',intent_name)
            agi.set_variable('entity_name',entity_name)
            agi.set_variable('entity_value',entity_value)
        else:
            agi.verbose('Could not get intent from Rasa server')
            agi.set_variable('intent','default_intent')
    else:
        agi.verbose('Could not get data from Vosk server')
        agi.set_variable('intent','default_intent')
except Exception as e:
    agi.verbose('Unexpected error: ' + str(e))
    agi.set_variable('intent','default_intent')
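A side note on `get_intent`, separate from the Vosk error: the request body is built by string concatenation, so a transcript containing a double quote or backslash would produce invalid JSON, and an entity value containing `:` would break the later `split(':')`. Below is a minimal sketch of one way around both, assuming the Rasa `/model/parse` endpoint accepts a JSON body of the form `{"text": ...}`. The helper names `build_parse_payload` and `extract_intent` are hypothetical, and `default_intent` is simply the fallback value the script already uses.

```python
import json


def build_parse_payload(text):
    """Serialize the request body with json.dumps so quotes are escaped."""
    return json.dumps({'text': text})


def extract_intent(parsed):
    """Return (intent, entity_name, entity_value) from a parsed Rasa reply.

    Returning a tuple avoids joining on ':' and splitting again later.
    """
    intent = (parsed.get('intent') or {}).get('name') or 'default_intent'
    entities = parsed.get('entities') or []
    if entities:
        first = entities[0]
        return intent, first.get('entity', 'none'), str(first.get('value', 'none'))
    return intent, 'none', 'none'
```

With these helpers the request could be sent as `requests.post(url, data=build_parse_payload(text))` and the reply passed through `extract_intent(response.json())`.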

My Asterisk console log:

speech_to_text.py,/tmp/1708808146.0-1: Entering into Python AGI...
speech_to_text.py,/tmp/1708808146.0-1: Recording FileName is /tmp/1708808146.0-1.wav
speech_to_text.py,/tmp/1708808146.0-1: Connection created
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Error in from_file: Expecting value: line 1 column 1 (char 0)
speech_to_text.py,/tmp/1708808146.0-1: Could not get data from Vosk server
    -- <PJSIP/200-00000000>AGI Script speech_to_text.py completed, returning 0
    -- Executing [talk@outgoing:5] NoOp("PJSIP/200-00000000", "The Intent : default_intent EntityName : EntityValue : ") in new stack

ls -l /tmp
total 268
-rw-r--r-- 1 asterisk asterisk 133826 Feb 24 20:55 1708808146.0-1.wav
-rw-r--r-- 1 asterisk asterisk 135724 Feb 24 20:55 1708808146.0.wav
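
The log shows 17 chunks being sent without an error from `process_chunk`, which matches the 133826-byte file read 8000 bytes at a time. The failure therefore most likely occurs at the final `ws.recv()` after `ws.send("EOS")`: an empty reply passed to `json.loads` raises exactly "Expecting value: line 1 column 1 (char 0)". Below is a minimal sketch of a possible fix. It assumes the stock `asr_server.py` from this repository, which as far as I know ends a stream on the text message `{"eof" : 1}` rather than `EOS` and replies once per message. The helper names `parse_text`, `transcribe`, and `transcribe_file` are illustrative and not part of Vosk; `ws` is any object with `send`, `send_binary`, and `recv`, such as the connection returned by websocket-client's `create_connection`.

```python
import json
import wave


def parse_text(message):
    """Return the 'text' field of a server reply, or '' if it is empty,
    not valid JSON, or only a partial result."""
    try:
        parsed = json.loads(message)
    except (TypeError, ValueError):
        return ''
    return parsed.get('text', '') if isinstance(parsed, dict) else ''


def transcribe(ws, wav_path, frames_per_chunk=4000):
    """Stream a 16-bit mono PCM WAV file and collect the recognized text."""
    texts = []
    # The wave module skips the RIFF header, so only audio frames are sent.
    with wave.open(wav_path, 'rb') as wf:
        ws.send(json.dumps({'config': {'sample_rate': wf.getframerate()}}))
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            ws.send_binary(data)
            text = parse_text(ws.recv())  # partial results yield ''
            if text:
                texts.append(text)
    # Assumption: the server finalizes on this exact message, not "EOS".
    ws.send('{"eof" : 1}')
    text = parse_text(ws.recv())
    if text:
        texts.append(text)
    return ' '.join(texts)


def transcribe_file(wav_path, url='ws://localhost:2700'):
    """Open a connection, transcribe one file, and always close the socket."""
    from websocket import create_connection  # websocket-client package
    ws = create_connection(url)
    try:
        return transcribe(ws, wav_path)
    finally:
        ws.close()
```

In the AGI script, `from_file(file_name)` could then be replaced by `transcribe_file(file_name)`. Collecting `text` from every reply, not only the last one, matters when the recording contains more than one utterance, because the final reply may only cover the audio after the last detected pause.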