Could someone please check my code? I've tried several approaches, but the AGI script still fails inside from_file with "Expecting value: line 1 column 1 (char 0)" (full log below). I suspect I'm misunderstanding the message format that the Vosk WebSocket server expects.
My dialplan:
[outgoing]
exten => 100,1,Answer()
same => n,MixMonitor(/tmp/${UNIQUEID}.wav,r)
same => n,BackGroundDetect(welcome-8khz,1000,500)
same => n,BackGroundDetect(silence/5,1000,500)
same => n,BackGroundDetect(did_not_understand-8khz,1000,500)
same => n,BackGroundDetect(silence/5,1000,500)
same => n,BackGroundDetect(did_not_understand-8khz,1000,500)
same => n,WaitExten(2)
exten => talk,1,Noop(Talk detection completed....)
same => n,StopMixMonitor()
same => n,System(sox /tmp/${UNIQUEID}.wav /tmp/${UNIQUEID}-1.wav silence 1 0.1 1%)
same => n,AGI(speech_to_text.py,/tmp/${UNIQUEID}-1)
same => n,Noop(The Intent : ${intent} EntityName : ${entity_name} EntityValue : ${entity_value})
same => n,System(rm /tmp/${UNIQUEID}.wav)
same => n,System(rm /tmp/${UNIQUEID}-1.wav)
same => n,Gotoif($[${LEN(${intent})}>0]?goto_intent:goto_again)
same => n(goto_intent),Goto(${intent},s,1)
same => n(goto_again),Goto(${CONTEXT},${exten},1)
exten => t,1,Noop(Time out happened...)
same => n,System(rm /tmp/${UNIQUEID}.wav)
same => n,Goto(${CONTEXT},${exten},1)
same => n,Hangup()
[book_appointment]
exten => s,1,Answer()
same => n,MixMonitor(/tmp/${UNIQUEID}.wav,r)
same => n,BackGroundDetect(surely_i_can_help-8khz,1000,500)
same => n,BackGroundDetect(silence/5,1000,500)
same => n,BackGroundDetect(did_not_understand_general-8khz,1000,500)
same => n,BackGroundDetect(silence/5,1000,500)
same => n,BackGroundDetect(did_not_understand_general-8khz,1000,500)
same => n,WaitExten(2)
My speech_to_text.py:
#!/usr/bin/env python3
import sys
import os
import json
import time
import requests
from asterisk.agi import *
from websocket import create_connection
from pydub import AudioSegment
import base64
import constants as ct
import traceback

# (The constants, AGI setup, process_chunk, from_file and get_intent go here; they are listed at the end of this post.)

try:
    data = from_file(file_name)
    if data is not None:
        agi.verbose('The Detected Data: %s' %data)
        data = get_intent(data)
        if data is not None:
            data = data.split(':')
            intent_name = data[0]
            entity_name = data[1]
            entity_value = data[2]
            agi.set_variable('intent',intent_name)
            agi.set_variable('entity_name',entity_name)
            agi.set_variable('entity_value',entity_value)
        else:
            agi.verbose('Could not get intent from Rasa server')
            agi.set_variable('intent','default_intent')
    else:
        agi.verbose('Could not get data from Vosk server')
        agi.set_variable('intent','default_intent')
except Exception as e:
    agi.verbose('Unexpected error: ' + str(e))
    agi.set_variable('intent','default_intent')
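The main block above unpacks the "intent:entity_name:entity_value" string returned by get_intent by splitting on ':'. One thing that may be worth ruling out is an entity value that itself contains a colon (a time such as 10:30, for example): a plain split(':') then yields more than three fields, and data[2] holds only part of the value. A minimal sketch of the difference, using a made-up packed string rather than real Rasa output:

```python
# Hypothetical packed string in the same "intent:entity:value" shape
# that get_intent() returns; the values are invented for illustration.
packed = "book_appointment:time:10:30"

# A plain split also breaks the value apart at its own colon.
naive = packed.split(":")

# Limiting the split to two separators keeps the value in one piece.
intent_name, entity_name, entity_value = packed.split(":", 2)

print(naive)
print(intent_name, entity_name, entity_value)
```

Returning a tuple or dict from get_intent instead of a joined string would avoid the question entirely; the split limit is just the smallest change.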
My Asterisk console log:
speech_to_text.py,/tmp/1708808146.0-1: Entering into Python AGI...
speech_to_text.py,/tmp/1708808146.0-1: Recording FileName is /tmp/1708808146.0-1.wav
speech_to_text.py,/tmp/1708808146.0-1: Connection created
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Processing chunk
speech_to_text.py,/tmp/1708808146.0-1: Data type: <class 'bytes'>
speech_to_text.py,/tmp/1708808146.0-1: Error in from_file: Expecting value: line 1 column 1 (char 0)
speech_to_text.py,/tmp/1708808146.0-1: Could not get data from Vosk server
-- <PJSIP/200-00000000>AGI Script speech_to_text.py completed, returning 0
-- Executing [talk@outgoing:5] NoOp("PJSIP/200-00000000", "The Intent : default_intent EntityName : EntityValue : ") in new stack
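The message in the log, "Expecting value: line 1 column 1 (char 0)", is what Python's json.loads reports when the text it is given does not start with a JSON value at all, for instance an empty string. The short sketch below reproduces it without any server. That the Vosk reply was actually empty in this run is only an assumption; logging the raw reply before parsing it would confirm or rule that out. try_parse is a hypothetical helper name.

```python
import json

def try_parse(raw):
    """Return (parsed, error_message) for a raw reply string."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as e:
        return None, str(e)

# An empty reply produces exactly the message seen in the log.
print(try_parse(""))
# A well-formed reply parses normally.
print(try_parse('{"text": "hello"}'))
```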
ls -l /tmp
total 268
-rw-r--r-- 1 asterisk asterisk 133826 Feb 24 20:55 1708808146.0-1.wav
-rw-r--r-- 1 asterisk asterisk 135724 Feb 24 20:55 1708808146.0.wav
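As a cross-check of the numbers above: from_file reads the trimmed file in 8000-byte blocks, so a 133826-byte file should give 17 reads, which matches the 17 "Processing chunk" lines in the log. The sketch below does that arithmetic. The duration estimate additionally assumes 16-bit mono PCM at 8 kHz with a standard 44-byte WAV header, none of which is confirmed by the listing.

```python
import math

FILE_SIZE = 133826   # bytes, from the ls -l output for 1708808146.0-1.wav
CHUNK_SIZE = 8000    # bytes per f.read() call in from_file

chunks = math.ceil(FILE_SIZE / CHUNK_SIZE)
last_chunk = FILE_SIZE - (chunks - 1) * CHUNK_SIZE

# Assumptions (not confirmed by the post): 44-byte header, 16-bit mono, 8000 Hz
HEADER_BYTES = 44
BYTES_PER_SECOND = 8000 * 2
duration = (FILE_SIZE - HEADER_BYTES) / BYTES_PER_SECOND

print(chunks, last_chunk, round(duration, 2))
```

Because the file is opened with open() rather than a WAV reader, the header bytes are also sent to the recognizer as if they were audio.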
The rest of speech_to_text.py (these definitions sit between the imports and the final try block):
AUDIO_FD = 3
CONTENT_TYPE = 'audio/l16; rate=8000; channels=1'
ACCEPT = 'audio/pcm'

agi = AGI()

agi.verbose("Entering into Python AGI...")
agi.answer()
file_name = sys.argv[1]+'.wav'

agi.verbose('Recording FileName is %s' % file_name)

def process_chunk(agi, ws, buf):
    try:
        agi.verbose("Processing chunk")
        agi.verbose("Data type: " + str(type(buf)))  # Debug line
        ws.send_binary(buf)
        time.sleep(0.1)  # wait for 100 milliseconds
        response = ws.recv()
        while not response:  # wait for a non-empty response
            response = ws.recv()
        result = json.loads(response)
        return result
    except Exception as e:
        agi.verbose('Error in process_chunk: ' + str(e))
        return None

def from_file(file_name):
    try:
        ws = create_connection("ws://localhost:2700")
        ws.send('{ "config" : { "sample_rate" : 8000 } }')
        agi.verbose("Connection created")
        result = ''
        with open(file_name, 'rb') as f:
            while True:
                data = f.read(8000)
                if len(data) == 0:
                    break
                result = process_chunk(agi, ws, data)
        ws.send("EOS")
        final_result = json.loads(ws.recv())
        ws.close()
        if 'text' in final_result:
            return final_result['text']
        else:
            return ''
    except Exception as e:
        agi.verbose('Error in from_file: ' + str(e))
        return None

# Classify intent and entity with Rasa query
def get_intent(data):
    try:
        import requests
        import json
        url = "http://"+ct.RASA_HOST+"/model/parse"
        intent = requests.post(url, data = '{"text":"'+data+'"}')
        intent = json.loads(intent.text)
        intent_name = intent['intent']['name']
        if len(intent['entities'])>0:
            data = intent['entities']
            entity_name = data[0]['entity']
            entity_value = data[0]['value']
        else:
            entity_name = 'none'
            entity_value = 'none'
        return intent_name+':'+entity_name+':'+entity_value
    except Exception as e:
        agi.verbose('Error in get_intent: ' + str(e))
        return None
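For comparison with from_file above, here is a minimal sketch of the message sequence that the sample client shipped with vosk-server appears to use: one JSON text message carrying the sample rate, then the audio as binary messages, then the JSON text message {"eof" : 1} to request the final result. That differs from the "EOS" string sent above, which may or may not be accepted depending on the server version, so treat it as an assumption to verify against the server in use. build_messages is a hypothetical helper name. The sketch only builds the messages, reading frames through the wave module so the header is not sent as audio; it does not open a connection.

```python
import io
import json
import wave

def build_messages(wav_file, chunk_frames=4000):
    """Yield the messages a client would send, in order.

    str items stand for text frames, bytes items for binary frames.
    build_messages is a hypothetical helper, not part of any library.
    """
    with wave.open(wav_file, "rb") as wf:
        # Config message carrying the recording's actual sample rate
        yield json.dumps({"config": {"sample_rate": wf.getframerate()}})
        while True:
            frames = wf.readframes(chunk_frames)  # wave skips the header
            if not frames:
                break
            yield frames
    # End-of-stream marker as used by the vosk-server sample client (assumption)
    yield '{"eof" : 1}'

# Usage with a synthetic one-second silent recording instead of a real call
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b"\x00\x00" * 8000)
buf.seek(0)

messages = list(build_messages(buf))
print(messages[0], len(messages), messages[-1])
```

With websocket-client, the str items would go through ws.send and the bytes items through ws.send_binary, with one ws.recv() after each audio chunk and one more after the eof message.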