MatthewSloyan / final-year-applied-project-and-minor-dissertation

Conflict resolution application built in Unity for Virtual Reality. Azure speech services are used along with an AIML chatbot, a flask server and a MongoDB database to provide a cost-effective and realistic approach to training.

Text To Speech #1

Closed MatthewSloyan closed 4 years ago

MatthewSloyan commented 4 years ago

I first researched the main text-to-speech providers: IBM Watson, Google Cloud Text-to-Speech and Microsoft Azure Text to Speech. IBM has a Unity package, but it only allows 10,000 free characters a month, which won't meet our needs, and its voices can sound quite robotic. Google's Text-to-Speech API provides realistic-sounding voices with control over pitch, tone and speech rate, and has a free starter package. Lastly, Azure Text to Speech provides the same functionality at the same cost, but in my testing its voices sounded slightly more robotic.

I wanted to try Google's Text-to-Speech API first, so I tried to integrate it into the Unity application using the three required NuGet packages, which also meant configuring Unity to allow them. However, despite trying multiple solutions, I found that the packages aren't compatible with newer versions of Unity. I could instead call the API from the Flask server we already have running, but then an audio file would have to be passed back to the application, which could slow down the requests.
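To give a rough sense of why returning audio from the server could slow requests, here is a minimal Python sketch that packages raw 16-bit PCM samples as WAV bytes, which is roughly what a server-side route would have to send back. It is only an illustration, not code from the project: the function name `pcm16_to_wav_bytes`, the 16 kHz mono format and the silent test data are all assumptions, and the actual speech synthesis call is left out.

```python
import io
import wave


def pcm16_to_wav_bytes(pcm_bytes: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw signed 16-bit PCM samples in a WAV container held in memory.

    Hypothetical helper: the sample rate and channel count are assumed values,
    not settings taken from the project.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


# One second of silence at 16 kHz mono, standing in for synthesized speech.
one_second = b"\x00\x00" * 16000
payload = pcm16_to_wav_bytes(one_second)
print(len(payload))  # size in bytes of the response body for one second of audio
```

Under these assumptions, each second of speech adds about 32 KB to the response, compared with a few dozen bytes for the text reply alone, which is the overhead the paragraph above is concerned about.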

From additional research, I found that Azure's Text to Speech includes a Unity package, which would make it easier to implement. I had problems installing it, which I fixed by loading the project from the root of the C drive. Using the documentation provided, I implemented a function which takes the response returned from the server, converts it to 16-bit audio and then plays the audio.
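The project's function lives in the Unity application and is not shown in this thread, so the following is only a language-neutral illustration in Python of the arithmetic usually involved when 16-bit audio is handed to a player that expects floating-point samples. The function name `pcm16_to_floats` and the little-endian byte order are assumptions.

```python
import struct


def pcm16_to_floats(pcm_bytes: bytes) -> list[float]:
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0).

    Hypothetical helper for illustration; not the project's implementation.
    """
    if len(pcm_bytes) % 2 != 0:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    count = len(pcm_bytes) // 2
    # "<" = little-endian, "h" = signed 16-bit integer, repeated `count` times.
    samples = struct.unpack(f"<{count}h", pcm_bytes)
    # Dividing by 32768 maps the range -32768..32767 onto -1.0..(just under) 1.0.
    return [sample / 32768.0 for sample in samples]


# Three samples: silence, half of full scale, and the most negative value.
raw = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_floats(raw))  # [0.0, 0.5, -1.0]
```

Dividing by 32768 rather than 32767 keeps every value within the range and maps the most negative sample exactly to -1.0.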

With this implemented, the basics of our project are now complete. The user can talk to the bot using their own voice; the bot picks the most likely answer using neural networks and speaks the response in a realistic voice.

MatthewSloyan commented 4 years ago

Text to Speech was successfully implemented using Azure services. A full description of the implementation can be found in the System Design chapter of our dissertation.