Hey Victoria is an experimental speech recognition assistant that understands English and connects to a TeamSpeak 3 channel. She is controlled entirely through speech.
Examples of commands that Victoria can currently understand include:
The project is currently in a proof-of-concept state and is rough around the edges.
To record what is spoken, a TeamSpeak plugin is currently used, because there is no library for connecting to a TeamSpeak server.
Each user's voice data is sent to a listening server that performs the necessary speech recognition.
Currently the client needs to run on the same system and user account as the TeamSpeak client. In addition, the default audio output device must be set as the default capture device in TeamSpeak. Some of Victoria's components currently require Microsoft Windows.
Python libraries:
pip install speechrecognition
pip install pyttsx
pip install textblob
pip install google-api-python-client
Supporting software:
Data:
python -m textblob.download_corpora
API keys:
Everything should be run on the same user account in Windows, and TeamSpeak should be configured to capture the output of the default audio output device.
The Voice Copy plugin is the TeamSpeak plugin component.
By default, the Voice Copy plugin is configured to send voice data to port 32000 at 127.0.0.1. To change this, edit plugin.c accordingly.
Inside the listen_server/ folder:
Create a config.ini file and in it, place:
[server]
host=127.0.0.1
port=32000
[youtube]
apiKey=
Configure the values and enter your YouTube API key.
Run listen.py with the path to the configuration file: python listen.py config.ini
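As an illustration of how a file in this format can be read, here is a minimal sketch using Python 3's standard configparser module. It is not taken from listen.py; the function name load_settings and the SAMPLE string are assumptions made for this example, and the actual script may parse the file differently.

```python
import configparser

# Sample text mirroring the config.ini layout described above.
SAMPLE = """\
[server]
host=127.0.0.1
port=32000
[youtube]
apiKey=
"""


def load_settings(text):
    """Parse config text and return (host, port, api_key).

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    host = parser.get("server", "host")
    port = parser.getint("server", "port")  # converted to int
    # Option names are matched case-insensitively by default,
    # so "apiKey" is found even though it is stored lowercased.
    api_key = parser.get("youtube", "apiKey", fallback="")
    return host, port, api_key


if __name__ == "__main__":
    print(load_settings(SAMPLE))
```

An empty apiKey value parses as an empty string, which is a convenient way to detect that the key has not been filled in yet.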
On first start, Victoria should say something via text-to-speech and play the beep sounds.
Victoria works best in a channel set to the Opus Music audio quality setting. Other codecs significantly degrade the assistant's ability to detect the key phrase.
When the key phrase ("Victoria") is heard, a beep sounds. Speak the command after the beep; full sentences are recognized better than single-word commands, although Victoria ultimately looks for a specific word to decide what to do.
Once the speaker has finished talking, Victoria sounds another beep a second or two after the silence begins. Victoria will also eventually stop listening if the speaker does not seem to stop speaking.
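The two stopping conditions above (a short run of silence, or an overall limit) can be sketched as a small loop over audio frames. This is only an illustration and not Victoria's actual code: the function name, the per-frame silent/not-silent flags, and the thresholds of 15 silent frames and 100 total frames are all assumptions.

```python
def find_end_of_speech(frames_silent, silence_frames=15, max_frames=100):
    """Return (frame_index, reason) at which listening would stop.

    frames_silent: iterable of booleans, True when a frame is silent.
    silence_frames: consecutive silent frames that end the command
                    (hypothetical value, e.g. 15 frames of 0.1 s = 1.5 s).
    max_frames: hard limit so listening always ends eventually
                (hypothetical value).
    Returns (None, "open") if the input ran out before either limit.
    """
    silent_run = 0
    for index, silent in enumerate(frames_silent):
        # Count consecutive silent frames; any speech resets the run.
        silent_run = silent_run + 1 if silent else 0
        if silent_run >= silence_frames:
            return index, "silence"
        if index + 1 >= max_frames:
            return index, "timeout"
    return None, "open"


if __name__ == "__main__":
    speech_then_pause = [False] * 5 + [True] * 20
    print(find_end_of_speech(speech_then_pause))
```

Counting frames instead of summing seconds avoids floating-point drift when comparing against the thresholds.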
The first invocation of the speech recognition engine may produce very poor results. If that happens, try the command again.
Commands currently include:
The flow of interaction is:
If the command portion is not recognized or an unknown command is mentioned, then Victoria will say so using text-to-speech.
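The idea of scanning a recognized sentence for a specific word, with a spoken fallback when nothing matches, can be sketched as follows. The keyword table, the action names, and the fallback wording are hypothetical placeholders and do not reflect Victoria's real command set or implementation.

```python
import re

# Hypothetical keyword-to-action table, for illustration only.
COMMANDS = {
    "play": "play_media",
    "stop": "stop_media",
}


def pick_command(sentence, commands=COMMANDS):
    """Return the action for the first known keyword, or None."""
    words = re.findall(r"[a-z']+", sentence.lower())
    for word in words:
        if word in commands:
            return commands[word]
    return None


def respond(sentence):
    """Return the chosen action, or a message to read out via text-to-speech."""
    action = pick_command(sentence)
    if action is None:
        # Unknown or unrecognized command: this text would be spoken aloud.
        return "Sorry, I did not understand that."
    return action


if __name__ == "__main__":
    print(respond("Could you please play some music"))
    print(respond("What is the weather like"))
```

Matching on whole words rather than substrings keeps a keyword like "stop" from firing on an unrelated word such as "stopwatch".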
Hey Victoria is licensed under the GNU Lesser General Public License v3.
The sounds are sourced from: