pipecat-ai / pipecat

Open Source framework for voice and multimodal conversational AI
BSD 2-Clause "Simplified" License
3.12k stars 273 forks

[Feature request] Groq transcriptions support #254

Open gaceladri opened 3 months ago

gaceladri commented 3 months ago

Here are the docs for the new Whisper models available in Groq Cloud. I'd like to replace my Deepgram STT provider with the Groq transcription service. It would be great if you could add this integration. 🙏

aconchillo commented 3 months ago

Is this real-time?

gaceladri commented 3 months ago

Great question! Let's compare Deepgram and Groq in terms of speed and cost:

Speed: According to recent benchmarks, Deepgram still appears to be the fastest option for speech-to-text processing, despite Groq's impressive performance in other areas.

Cost: Groq is approximately 8.6 times cheaper than Deepgram for the same amount of processing time.

While Groq offers a significant cost advantage, Deepgram maintains its edge in processing speed. The choice between the two may depend on your specific use case.

Considering these benchmarks, I'd recommend keeping Deepgram as the primary speech-to-text (STT) provider. Its superior speed continues to outweigh the cost savings offered by Groq, at least for my app.

gaceladri commented 3 months ago

Feel free to close the issue if you don't want to put this on your priority list. I'd prioritize the function calling with Gemini more, as that would bring much greater benefits by enabling us to use Gemini 1.5 Flash instead of GPT-4o.

skelleex commented 1 month ago

Would also like to have Groq LLMs and Whisper STT from Groq implemented.

joachimchauvet commented 1 month ago

@skelleex It is already possible to use LLMs running on Groq with the OpenAILLMService:

llm = OpenAILLMService(
    name="LLM", api_key=GROQ_API_KEY, model=model_name, base_url="https://api.groq.com/openai/v1"
)

This feature request is mostly about supporting the OpenAI-compatible transcription API for cloud-hosted Whisper STT.
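
For reference, here is a minimal sketch of what a request to Groq's OpenAI-compatible transcription endpoint could look like, using only the Python standard library. This is not how Pipecat implements anything; the helper name `build_transcription_request`, the default model `whisper-large-v3`, and the file name `audio.wav` are assumptions for illustration only. The function builds the request without sending it.

```python
import urllib.request
import uuid

# Same OpenAI-compatible base URL as in the LLM example above.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def build_transcription_request(
    audio: bytes,
    api_key: str,
    model: str = "whisper-large-v3",  # assumed model name, check the Groq docs
    filename: str = "audio.wav",      # hypothetical file name
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST for the transcription endpoint."""
    boundary = uuid.uuid4().hex
    parts = [
        # Form field carrying the model name.
        f'--{boundary}\r\nContent-Disposition: form-data; name="model"\r\n\r\n{model}\r\n'.encode(),
        # Form field carrying the raw audio bytes.
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode() + audio + b"\r\n",
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ]
    return urllib.request.Request(
        url=f"{GROQ_BASE_URL}/audio/transcriptions",
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

With a valid key, passing the result to `urllib.request.urlopen` should return a JSON body whose `text` field holds the transcript, assuming the endpoint follows OpenAI's default response format. Note that this sends one complete audio file per request, so it does not by itself answer the real-time question raised above.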