PromtEngineer / Verbi

A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepgram APIs, plus local models via Ollama. Ideal for research and development in voice technology.
MIT License
123 stars, 35 forks

Errors on "TRANSCRIPTION_MODEL" #2

Closed: mercuryyy closed this issue 1 month ago

mercuryyy commented 1 month ago

With TRANSCRIPTION_MODEL = 'deepgram' I am getting:

pygame 2.5.2 (SDL 2.28.3, Python 3.11.6)
Hello from the pygame community. https://www.pygame.org/contribute.html
2024-05-20 14:38:03,651 - INFO - Recording started
2024-05-20 14:38:09,461 - INFO - Recording complete
2024-05-20 14:38:09,592 - ERROR - An error occurred: can only concatenate str (not "NoneType") to str
2024-05-20 14:38:09,592 - INFO - Deleted file: test.wav
2024-05-20 14:38:10,733 - INFO - Recording started
2024-05-20 14:38:16,905 - INFO - Recording complete
2024-05-20 14:38:17,038 - ERROR - An error occurred: can only concatenate str (not "NoneType") to str
2024-05-20 14:38:17,039 - INFO - Deleted file: test.wav
2024-05-20 14:38:18,176 - INFO - Recording started

And with TRANSCRIPTION_MODEL = 'groq':

pygame 2.5.2 (SDL 2.28.3, Python 3.11.6)
Hello from the pygame community. https://www.pygame.org/contribute.html
2024-05-20 14:37:08,207 - INFO - Recording started
2024-05-20 14:37:13,265 - INFO - Recording complete
2024-05-20 14:37:13,432 - ERROR - Failed to transcribe audio: 'Groq' object has no attribute 'audio'
2024-05-20 14:37:13,432 - INFO - You said: Error in transcribing audio
2024-05-20 14:37:13,997 - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-05-20 14:37:14,006 - INFO - Response: Sorry to hear that! If you're experiencing issues transcribing audio, can you please provide more context or details about the error, such as:

mercuryyy commented 1 month ago

The Groq error is due to:

2024-05-20 14:48:41,398 - ERROR - Failed to transcribe audio: Error code: 404 - {'error': {'message': 'The model whisper-large-v3 does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'code': 'model_not_found'}}

I have no idea how to get access to Whisper via Groq; I see nothing about it on their website.

As for the Deepgram error, I have no idea why it is happening, since the Deepgram TTS is working fine.
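For reference, "can only concatenate str (not "NoneType") to str" is the message Python raises when a string is joined to None with `+`. The sketch below reproduces that message under the assumption that the transcription step returned None instead of text; `transcribe_placeholder` and `format_user_input` are hypothetical names for illustration, not functions from Verbi.

```python
from typing import Optional


def transcribe_placeholder(audio_path: str) -> Optional[str]:
    """Hypothetical stand-in for a transcription backend that returns nothing."""
    return None


def reproduce_error() -> str:
    """Join a string with the None result and return the resulting error message."""
    try:
        return "You said: " + transcribe_placeholder("test.wav")
    except TypeError as exc:
        return str(exc)


def format_user_input(text: Optional[str]) -> str:
    """Guarded version: fail with a clearer message when no transcript exists."""
    if text is None:
        raise ValueError("transcription returned no text; check the selected backend")
    return "You said: " + text


if __name__ == "__main__":
    print(reproduce_error())
```

A guard like `format_user_input` would not make the backend work, but it would likely turn the opaque concatenation error into one that points at the transcription step.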

3choff commented 1 month ago

Groq's transcription model is only accessible in private beta, and Deepgram is a placeholder for a future implementation; this is why neither works. At the moment, your only choices for the transcription model are OpenAI Whisper or my local implementation using FastWhisperAPI. The API can run locally, in a Docker container, or even in Google Colab if you do not want to install anything. Here is the link to the branch of the project that uses FastWhisperAPI: https://github.com/3choff/Verbi.git, and here is the link to the FastWhisperAPI repository: https://github.com/3choff/FastWhisperAPI.git. Keep in mind that if you decide to use Google Colab, you will have to replace the localhost address in transcription.py with the one provided by ngrok when you run the last cell in the notebook. I hope this helps.
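One way to avoid editing transcription.py each time the tunnel address changes is to read the server address from an environment variable. This is only a sketch under assumptions: the variable name `FASTWHISPER_BASE_URL`, the default port 8000, and the helper `transcription_endpoint` are all hypothetical, and the request path must be taken from the FastWhisperAPI documentation rather than from this example.

```python
import os
from urllib.parse import urljoin

# Assumed default for a locally running server; the port is a guess.
DEFAULT_BASE_URL = "http://localhost:8000"


def transcription_endpoint(path: str) -> str:
    """Build the request URL from an overridable base address.

    FASTWHISPER_BASE_URL is a hypothetical variable name; set it to the
    public tunnel address when the server runs in Google Colab.
    """
    base = os.environ.get("FASTWHISPER_BASE_URL", DEFAULT_BASE_URL)
    # Normalize slashes so the base and path join predictably.
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))
```

With the variable unset, the helper targets the local server; exporting the tunnel address before starting the assistant switches every request to the Colab instance without a code change.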

PromtEngineer commented 1 month ago

@mercuryyy, as @3choff pointed out, Groq STT is currently gated, and support for Deepgram STT will be added soon.

PromtEngineer commented 1 month ago

@mercuryyy added the Deepgram transcription model in #3.