Add support for FastWhisperAPI running locally in Docker

3choff commented 1 month ago

Description

This pull request updates the transcription.py module to add support for FastWhisperAPI running locally in a Docker container. It also includes some other minor improvements:

Sets an exit word to terminate the main loop.
Adds a language model parameter to improve transcription. This helps when the user has a foreign accent.
Sets energy_threshold and pause_threshold to allow the user a longer prompt while maintaining pause detection in the speech.
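To illustrate the exit-word behaviour, here is a minimal sketch of a loop that stops once a transcript contains the exit word. The names `EXIT_WORD`, `should_exit`, and `run_loop`, and the word "goodbye" itself, are assumptions made for this example; the PR may use a different word and structure.

```python
import string

# Hypothetical exit word; the value used in the PR may differ.
EXIT_WORD = "goodbye"


def should_exit(transcript: str, exit_word: str = EXIT_WORD) -> bool:
    """Return True if the exit word appears as a whole word in the transcript."""
    # Lowercase and strip punctuation so that "Goodbye!" still matches.
    cleaned = transcript.lower().translate(str.maketrans("", "", string.punctuation))
    return exit_word.lower() in cleaned.split()


def run_loop(transcripts):
    """Handle transcripts in order and stop at the first one containing the exit word."""
    handled = []
    for text in transcripts:
        if should_exit(text):
            break  # leave the main loop
        handled.append(text)
    return handled
```

Matching on whole words rather than substrings keeps an ordinary sentence from ending the session by accident.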

Changes Made

FastWhisperAPI Support:

Added functionality to the transcription.py module to support FastWhisperAPI running locally in a Docker container.
Incorporated Docker-specific configurations and environment variables to ensure seamless integration with FastWhisperAPI.
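As a rough sketch of what a call to a locally running FastWhisperAPI container might look like, the example below builds a multipart upload using only the standard library. The URL `http://localhost:8000/v1/transcriptions`, the placeholder key `dummy_api_key`, the form fields `file` and `language`, and the helper names are assumptions for illustration; check them against your FastWhisperAPI setup. This is not the code from transcription.py.

```python
import json
import os
import urllib.request
import uuid

# Assumed defaults for a local container; adjust to match your deployment.
API_URL = "http://localhost:8000/v1/transcriptions"
API_KEY = "dummy_api_key"


def build_transcription_request(audio_bytes, filename="audio.wav", language="en",
                                url=API_URL, api_key=API_KEY):
    """Build a multipart/form-data POST request with the audio and a language field."""
    boundary = uuid.uuid4().hex
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode()
    body = b"".join([
        # Plain text field: language
        delimiter + crlf,
        b'Content-Disposition: form-data; name="language"' + crlf + crlf,
        language.encode() + crlf,
        # File field: file
        delimiter + crlf,
        b'Content-Disposition: form-data; name="file"; filename="'
        + filename.encode() + b'"' + crlf,
        b"Content-Type: application/octet-stream" + crlf + crlf,
        audio_bytes + crlf,
        # Closing delimiter
        delimiter + b"--" + crlf,
    ])
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def transcribe(audio_path, language="en"):
    """Upload a file and return the parsed response (assumes the server replies with JSON)."""
    with open(audio_path, "rb") as audio_file:
        request = build_transcription_request(
            audio_file.read(), filename=os.path.basename(audio_path), language=language
        )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `transcribe("recording.wav", language="en")` would send the file and return the server's response; the file name here is only a placeholder.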

Energy and Pause Threshold Adjustments:

Adjusted the energy_threshold and pause_threshold parameters in the codebase to allow users a longer prompt while maintaining accurate pause detection in the speech.
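To show how the two thresholds interact, the sketch below models pause detection in a deliberately simplified way: a frame counts as speech when its energy exceeds `energy_threshold`, and the phrase ends after `pause_threshold` seconds of continuous non-speech. The function `phrase_end_index`, the frame length, and the sample energies are assumptions for illustration; this is not the actual detection code of the speech library the project uses.

```python
def phrase_end_index(frame_energies, energy_threshold, pause_threshold, frame_seconds):
    """Return the index of the frame where the phrase ends, or None if it never ends.

    Simplified model: speech starts at the first frame above energy_threshold and
    ends once pause_threshold seconds of consecutive quiet frames have passed.
    """
    started = False
    silent_frames = 0
    for index, energy in enumerate(frame_energies):
        if energy > energy_threshold:
            started = True
            silent_frames = 0  # any speech resets the pause counter
        elif started:
            silent_frames += 1
            if silent_frames * frame_seconds >= pause_threshold:
                return index
    return None


# Hypothetical energies, one value per 0.5 s frame, with a short pause mid-sentence.
energies = [100, 900, 950, 200, 150, 920, 100, 120, 110, 90]

short_pause = phrase_end_index(energies, energy_threshold=300,
                               pause_threshold=1.0, frame_seconds=0.5)
long_pause = phrase_end_index(energies, energy_threshold=300,
                              pause_threshold=2.0, frame_seconds=0.5)
```

In this model the shorter pause threshold ends the phrase during the mid-sentence pause, while the longer one lets the speaker continue and only stops at the final silence.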

Minor Improvements:

Implemented an exit word to terminate the main loop.
Added a language model parameter to transcription.py to improve transcription accuracy, which is particularly helpful for users with foreign accents.
Made small edits to the shell commands in README.md.
Added a color reset code at the end of the logging.info messages so that the color does not carry over into later terminal output.
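The color reset can be sketched as follows. The escape sequences are standard ANSI codes, but the helper `colorize`, the logger name, and the message are assumptions for this example. The log is written to an in-memory stream here so the result can be inspected; in a real program the handler would write to the terminal.

```python
import io
import logging

# Standard ANSI escape sequences.
GREEN = "\033[32m"
RESET = "\033[0m"


def colorize(message: str, color: str = GREEN) -> str:
    """Wrap a message in a color code and end it with the reset code."""
    return f"{color}{message}{RESET}"


logger = logging.getLogger("color_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

stream = io.StringIO()  # stand-in for the terminal
logger.addHandler(logging.StreamHandler(stream))

logger.info(colorize("Transcription finished"))
logged = stream.getvalue()
```

Without the trailing reset code, a terminal keeps using the last color for everything printed afterwards, including the shell prompt.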

PromtEngineer commented 1 month ago

@3choff Can you integrate the latest changes and update the README with instructions on how to use FastWhisperAPI with the code?

3choff commented 1 month ago

@PromtEngineer I have integrated the latest changes and updated the README. I'm not sure why it says there are conflicts to resolve, as the code difference adds the local API. Let me know if there is something else I could work on; I am happy to contribute.

PromtEngineer commented 1 month ago

@3choff I am running into the following error:

zsh: segmentation fault python run_voice_assistant.py

I suspect it has to do with the dependencies but haven't really figured out which one is causing the issues.

3choff commented 1 month ago

That is odd. I am not having this issue. Are you running FastWhisperAPI in a Docker container or on your local machine? Which version of Python is your environment using? I am on 3.10.14. I have been testing the API with Verbi in a Docker container and Colab. I will test it fully locally and let you know.

3choff commented 1 month ago

After further testing the API on the local machine, I found a conflict between a file version used by both Torch and CTranslate2. I found a workaround and asked on Discord if someone else could test it. Let's see what the testing feedback is.

PromtEngineer commented 1 month ago

@3choff I found that on my local setup, the keyboard package has a conflict. I disabled that functionality in audio.py and was able to run it without any issues. Let's do it stepwise: first I will bring in your PR without the interruption functionality, and then we can figure out which version works best.

3choff commented 1 month ago

@PromtEngineer That sounds good to me.
