This pull request integrates Piper Text-to-Speech (TTS) as an internal TTS model for AAQ, supporting both English and Swahili languages.
Goal
The primary aim of this PR is to incorporate an in-house, open-source TTS model alongside the existing external Google Cloud models. This addition enhances AAQ's speech synthesis capabilities and reduces dependency on third-party services.
Changes
Introduced a new environment variable, CUSTOM_TTS_ENDPOINT, which points AAQ at the in-house TTS model service.
Implemented a new optional TTS model in the optional_components/speech_api directory.
Added a dedicated Makefile target that creates a separate aaq-speech environment for running speech-related tests and working with the speech services in a manual setup.
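As an illustration of how the new variable could be consumed, here is a minimal sketch. It assumes the in-house model is used whenever CUSTOM_TTS_ENDPOINT is set and that the external Google Cloud path is the fallback. That selection rule, the function name resolve_tts_backend, and the "internal"/"external" labels are hypothetical and not taken from this PR.

```python
import os


def resolve_tts_backend(env=None):
    """Pick a TTS backend from the environment (illustrative only).

    Returns ("internal", endpoint) when CUSTOM_TTS_ENDPOINT is set to a
    non-blank value, otherwise ("external", None). The fallback rule is an
    assumption, not the behavior implemented in this PR.
    """
    env = os.environ if env is None else env
    endpoint = (env.get("CUSTOM_TTS_ENDPOINT") or "").strip()
    if endpoint:
        return ("internal", endpoint)
    return ("external", None)
```

Passing a plain dict instead of relying on os.environ keeps the function easy to exercise in tests without touching the real environment.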
Future Tasks (optional)
Integration of Bhashini TTS for Indic languages, further expanding AAQ's multilingual capabilities.
How has this been tested?
Docker compose
Swagger UI
pytest
How to test this?
Ensure the CUSTOM_STT_ENDPOINT and CUSTOM_TTS_ENDPOINT environment variables are set in your .core_backend.env file.
Start the Docker containers with: docker compose -f docker-compose.yml -f docker-compose.dev.yml -f docker-compose.speech.yml -p aaq-stack watch
Call the /voice-search endpoint and inspect the generated URL for the TTS speech file to confirm that speech synthesis works.
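Before starting the containers, the first step above can be checked with a short script. This is a sketch that assumes .core_backend.env contains plain KEY=value lines. The helper names parse_env_text and missing_endpoints are hypothetical and not part of the AAQ codebase.

```python
from pathlib import Path

# The two variables named in the testing steps.
REQUIRED = ("CUSTOM_STT_ENDPOINT", "CUSTOM_TTS_ENDPOINT")


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_endpoints(text, required=REQUIRED):
    """Return the required variable names that are absent or empty."""
    values = parse_env_text(text)
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".core_backend.env")
    if env_path.exists():
        missing = missing_endpoints(env_path.read_text())
        print("missing:", missing or "none")
    else:
        print(f"{env_path} not found")
```

Run it from the directory that holds .core_backend.env. An empty "missing" list means both endpoints have a value, though it does not verify that the services behind them are reachable.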
Checklist
Fill with x for completed.
[x] My code follows the style guidelines of this project
[x] I have reviewed my own code to ensure good quality
[x] I have tested the functionality of my code to ensure it works as intended
[x] I have resolved merge conflicts
[x] I have updated the automated tests
[x] I have updated the requirements
[x] I have updated the CI/CD scripts in .github/workflows/
Reviewer: @lickem22
Estimate: 30 mins
Ticket
Fixes: JIRA_TICKET_LINK