PromtEngineer / Verbi

A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepgram APIs, plus local models via Ollama. Ideal for research and development in voice technology.
MIT License
123 stars 35 forks

Add support for FastWhisperAPI running locally in Docker #1

Closed: 3choff closed this 1 month ago

3choff commented 1 month ago

Description

This pull request updates the transcription.py module to add support for FastWhisperAPI running locally in a Docker container. It also includes a few minor improvements.

Changes Made

FastWhisperAPI Support:

Energy and Pause Threshold Adjustments:

Minor Improvements:
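
For readers who want a concrete picture of the local FastWhisperAPI path, here is a minimal sketch using only the standard library. It is not the code from this PR. The URL `http://localhost:8000/v1/transcriptions`, the bearer-token header, the `file` form field, and the `text` key in the JSON reply are all assumptions about how the container is exposed; the helper names `build_multipart` and `transcribe_local` are hypothetical.

```python
import json
import os
import urllib.request
import uuid

# Assumed address of a locally running FastWhisperAPI container; adjust as needed.
LOCAL_API_URL = "http://localhost:8000/v1/transcriptions"


def build_multipart(filename, audio_bytes, fields=None):
    """Encode one audio file plus optional text fields as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (fields or {}).items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    parts.append(file_header + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe_local(audio_path, url=LOCAL_API_URL, api_key="dummy_api_key"):
    """POST an audio file to the local endpoint and return the transcript text.

    Assumes the server replies with JSON containing a "text" field.
    """
    with open(audio_path, "rb") as audio_file:
        body, content_type = build_multipart(
            os.path.basename(audio_path), audio_file.read()
        )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response).get("text", "")
```

With the container running, `transcribe_local("recording.wav")` would return the transcript string under these assumptions. Keeping the request builder separate from the network call makes it possible to check the encoding without a server.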

PromtEngineer commented 1 month ago

@3choff Can you integrate the latest changes and update the README with instructions on how to use FastWhisperAPI with the code?

3choff commented 1 month ago

@PromtEngineer I have integrated the latest changes and updated the README. I'm not sure why it says there are conflicts to resolve, as the code difference adds the local API. Let me know if there is something else I could work on; I am happy to contribute.

PromtEngineer commented 1 month ago

@3choff I am running into the following error:

zsh: segmentation fault python run_voice_assistant.py

I suspect it has to do with the dependencies, but I haven't figured out which one is causing the issue.
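
One way to narrow down a segmentation fault like this is Python's built-in `faulthandler` module, which prints the Python-level traceback of every thread when the interpreter crashes. The sketch below is a suggestion only, not something tried in this thread; the helper name `enable_crash_traces` is hypothetical, and it would need to run at the very top of the entry script, before the heavy imports.

```python
import faulthandler
import sys


def enable_crash_traces():
    """Turn on faulthandler so a segfault prints Python tracebacks to stderr.

    Uses sys.__stderr__ (the interpreter's original stderr), which normally
    has a real file descriptor even if sys.stderr has been replaced.
    """
    if not faulthandler.is_enabled():
        faulthandler.enable(file=sys.__stderr__, all_threads=True)
    return faulthandler.is_enabled()


# Call this before importing the packages suspected of crashing.
enable_crash_traces()
```

The same effect is available without editing any file by running `python -X faulthandler run_voice_assistant.py`. The last frame in the dumped traceback usually points at the import or call that triggered the crash.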

3choff commented 1 month ago

That is odd. I am not having this issue. Are you running FastWhisperAPI in a Docker container or on your local machine? Which version of Python is your environment using? I am on 3.10.14. I have been testing the API with Verbi in a Docker container and Colab. I will test it fully locally and let you know.

3choff commented 1 month ago

After further testing the API on the local machine, I found a conflict between a file version used by both Torch and CTranslate2. I found a workaround and asked on Discord if someone else could test it. Let's see what the testing feedback is.
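
When comparing environments for a conflict like this, it can help to have both sides post the same version report. The snippet below is a small sketch for that purpose; the package list is an assumption based on the names mentioned in this comment and should be extended to whatever else is installed, and `report_versions` is a hypothetical helper name.

```python
import platform
from importlib import metadata

# Distribution names to check; extend this tuple for your own environment.
PACKAGES = ("torch", "ctranslate2")


def report_versions(packages=PACKAGES):
    """Return the Python version and the installed version of each package.

    Packages that are not installed map to None.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This reads package metadata only and does not import the packages themselves, so it is safe to run even when importing one of them crashes.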

PromtEngineer commented 1 month ago

@3choff I found that on my local setup, the keyboard package has a conflict. I disabled that functionality in audio.py and was able to run it without any issues. Let's do it stepwise: first I will bring in your PR without the interruption functionality, and then we can figure out which version works best.
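
A possible way to keep the interruption feature in the codebase while leaving it off by default is an opt-in switch with a lazy import, sketched below. This is not how audio.py is actually written; the environment variable `VERBI_ENABLE_INTERRUPT` and both function names are hypothetical. An opt-in flag is used rather than a plain `try`/`except` because a crash inside native code during import cannot be caught from Python.

```python
import importlib
import os


def interruption_enabled(env=os.environ):
    """Return True only if the user explicitly opted in via the (assumed) flag."""
    return env.get("VERBI_ENABLE_INTERRUPT", "0") == "1"


def load_keyboard(env=os.environ):
    """Import the keyboard package lazily, and only when the feature is on.

    Returns the module, or None when the feature is disabled or the
    package is not installed.
    """
    if not interruption_enabled(env):
        return None
    try:
        return importlib.import_module("keyboard")
    except ImportError:
        return None
```

Callers would check `load_keyboard()` for `None` and skip the key-press interruption logic in that case, so the default run never touches the package at all.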

3choff commented 1 month ago

@PromtEngineer That sounds good to me.