livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0

STT from Deepgram: How to enable formatting at runtime? #382

Open naman-scogo opened 3 months ago

naman-scogo commented 3 months ago

I am trying to register users' phone numbers and email IDs using the voice assistant. However, when the user speaks their phone number (1234567890), the STT service returns "one two three four 5:06 seven eight nine zero". The LLM then rejects this if we have a 10-digit validation for the phone number in place. Deepgram offers a way to ask its STT service to format the speech as required at runtime, so we could trigger it right before the user is prompted for the phone number (refer here). However, I am not able to figure out how to send this event to the initialized STT service. It would be helpful if someone could suggest a way to do this.

theomonnom commented 3 months ago

Hey, Deepgram has a formatting option we don't expose yet. I'll try to make a PR next week to add it.

Maybe you can make a PR? It is as easy as adding a Boolean option inside the Deepgram constructor.
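
For illustration, here is a rough sketch of what such a Boolean option could look like. It is not the plugin's actual code: `STTOptions`, `build_listen_url`, and the default language and model values are hypothetical names and values chosen for this example. The only parts taken from Deepgram itself are the streaming endpoint and the `smart_format` query parameter.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Deepgram's live transcription endpoint.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


@dataclass
class STTOptions:
    """Hypothetical options holder, not the plugin's real class."""

    language: str = "en-US"  # assumed default
    model: str = "nova-2"  # assumed default
    smart_format: bool = False  # the new Boolean option


def build_listen_url(opts: STTOptions) -> str:
    """Build the streaming URL, forwarding the options as query parameters."""
    params = {
        "language": opts.language,
        "model": opts.model,
        # Query strings carry text, so send the flag as "true" / "false".
        "smart_format": str(opts.smart_format).lower(),
    }
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_listen_url(STTOptions(smart_format=True)))
```

Because the flag ends up in the connection URL, a sketch like this fixes the formatting choice for the lifetime of the stream; changing it mid-conversation would need a new connection or some other mechanism.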

theomonnom commented 3 months ago

Ah ok, this is a toggling option. Hmm, I'm wondering how this could be exposed with the VoiceAssistant, as the STT stream is currently internal.