A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepgram APIs, plus local models via Ollama. Ideal for research and development in voice technology.
To add support for MeloTTS, this PR implements a FastAPI server (local_tts_api.py) that accepts text input from the user and writes the synthesized audio file to disk. It also adds a helper function in local_tts_generation.py that calls the API server.
You will need to install MeloTTS for this to work.
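As a rough illustration of the helper side described above, here is a minimal sketch of a client that posts text to a locally running TTS server. It is not the code from this PR: the URL, port, route, JSON field names (text, output_path), and both function names are assumptions made for the example, and it uses only the Python standard library.

```python
import json
import urllib.request

# Hypothetical endpoint; the actual route and port in local_tts_api.py may differ.
LOCAL_TTS_URL = "http://localhost:8000/generate-audio/"


def build_tts_request(
    text: str, output_path: str, url: str = LOCAL_TTS_URL
) -> urllib.request.Request:
    """Build a JSON POST request asking the local server to synthesize `text`."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # Field names are assumptions for this sketch, not the PR's actual schema.
    payload = json.dumps({"text": text, "output_path": output_path}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_audio_via_local_server(
    text: str, output_path: str, timeout: float = 30.0
) -> dict:
    """Send the request and return the server's reply, assumed to be a JSON object."""
    request = build_tts_request(text, output_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, a call such as generate_audio_via_local_server("Hello there", "output.wav") would ask it to write the audio file to disk. Building the request separately from sending it keeps the payload easy to inspect without a live server.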
This PR adds local TTS (MeloTTS) support.