livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0

`interrupt_min_words` should only apply to `self._transcribed_text` #1117

Open seanmuirhead opened 2 days ago

seanmuirhead commented 2 days ago

Despite the fix to flush Deepgram transcripts, we are still (though rarely) seeing Deepgram send only INTERIM transcripts with no FINAL transcript following them.

If this is the case, then interrupt_min_words should be compared against self._transcribed_text, not self._transcribed_interim_text. Otherwise, we may interrupt the agent's response without the agent ever being prompted to respond to the speech that interrupted it. This results in the agent appearing to freeze.

Proposed Solution 1

This line in VoicePipelineAgent:

text = self._transcribed_interim_text or self._transcribed_text

Should instead be:

text = self._transcribed_text

I would be happy to implement this if LiveKit agrees. The issue with this solution is that interruptions will appear to lag a bit behind what the user actually says, so I'm not sure this is worth the tradeoff.
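To make the tradeoff concrete, here is a minimal, standalone sketch of the word-count check under both behaviors. It is not the actual VoicePipelineAgent code: the function name `should_interrupt`, the `use_interim` flag, and whitespace splitting as the word tokenizer are all assumptions made for illustration.

```python
def should_interrupt(
    transcribed_text: str,
    transcribed_interim_text: str,
    interrupt_min_words: int,
    use_interim: bool,
) -> bool:
    """Hypothetical helper: decide whether enough words were heard to interrupt."""
    if use_interim:
        # Behavior of the line quoted above: interim text wins when present.
        text = transcribed_interim_text or transcribed_text
    else:
        # Proposed Solution 1: only text confirmed by a FINAL transcript counts.
        text = transcribed_text
    # Assumption: whitespace splitting stands in for the real word tokenizer.
    return len(text.split()) >= interrupt_min_words


# Interim-only speech (no FINAL ever arrives):
interim_only = dict(transcribed_text="", transcribed_interim_text="hold on a second")
print(should_interrupt(**interim_only, interrupt_min_words=3, use_interim=True))
print(should_interrupt(**interim_only, interrupt_min_words=3, use_interim=False))
```

With interim-only speech, the first call interrupts the agent even though no FINAL text exists to respond to, while the second call does not interrupt at all. That second case is also where the lag comes from: nothing can interrupt until a FINAL transcript arrives.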

Proposed Solution 2

Another solution could be an internal timer of some sort that falls back to _transcribed_interim_text if we never get a FINAL event within a certain amount of time. I would not be as comfortable implementing this, but I can give it a try. I don't see a downside to this approach in terms of user experience.
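A rough sketch of how such a timer might look, using only asyncio. Everything here is hypothetical: the `InterimFallback` class, its `on_interim` / `on_final` methods, and the `timeout` parameter are illustrative names, not part of the LiveKit agents API, and a real implementation would need to hook into the agent's existing STT event handlers.

```python
import asyncio
from typing import Optional


class InterimFallback:
    """Hypothetical helper: promote interim text if no FINAL arrives in time."""

    def __init__(self, timeout: float) -> None:
        self._timeout = timeout
        self.transcribed_text = ""
        self.transcribed_interim_text = ""
        self._timer: Optional[asyncio.Task] = None

    def on_interim(self, text: str) -> None:
        # Each INTERIM event replaces the pending text and restarts the timer.
        self.transcribed_interim_text = text
        self._cancel_timer()
        self._timer = asyncio.get_running_loop().create_task(self._promote())

    def on_final(self, text: str) -> None:
        # A real FINAL arrived, so the fallback is no longer needed.
        self._cancel_timer()
        self._commit(text)

    def _cancel_timer(self) -> None:
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None

    def _commit(self, text: str) -> None:
        # Append to the confirmed text and clear the pending interim text.
        sep = " " if self.transcribed_text and text else ""
        self.transcribed_text += sep + text
        self.transcribed_interim_text = ""

    async def _promote(self) -> None:
        await asyncio.sleep(self._timeout)
        # No FINAL within the timeout: treat the last interim text as final.
        self._timer = None
        self._commit(self.transcribed_interim_text)


async def main() -> None:
    fb = InterimFallback(timeout=0.05)
    fb.on_interim("hold on a second")  # no FINAL follows
    await asyncio.sleep(0.15)
    print(repr(fb.transcribed_text))


asyncio.run(main())
```

The timeout value is the main design question: too short and it races a FINAL that was merely slow, too long and the agent still appears frozen for that duration.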