livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0
3.45k stars 338 forks source link

Sometimes VAD Freezes/Unable to detect speech end #835

Open hari01584 opened 1 week ago

hari01584 commented 1 week ago

We are encountering a very weird bug where sometimes in conversation, VAD/Agent cannot detect end of speech.. the usual flow is that user starts speaking and we can get live transcripts for it but when they stop there is no OpenAI call or subsequent agent response (even after waiting for long time), the solution here is to add on something more (user speaks one or two more words and then again stop), then it detects the whole speech segment and now agent replies back.

Very peculiar but annoying bug, Please fix

martin-purplefish commented 1 week ago

We're experiencing this too.

martin-purplefish commented 1 week ago

it's strange that I still get the user finished speaking events, but the message never comes.

hari01584 commented 1 week ago

Indeed, my team is creating automated interviewing solutions and we are rigorously using live kit, there is few bugs popping up but overall this is amazing

theomonnom commented 1 week ago

Hey, I just took a look at the VAD implementation and I'm quite sure it isn't coming from it.

So I think something froze inside the VoicePipelineAgent?

Can you share the debug logs of your agent when this is happening? It would be useful to track where the pipeline hangs.

martin-purplefish commented 1 week ago

@theomonnom I'll get some debug logs, but here are some info level ones:

{"message": "Agent state thinking -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:32.386400+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.647632+00:00"}
{"message": "logging agent message to backend", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.648797+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977859+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_QzsLJZ9CtTo3, identity=agent-AJ_6Z4b4mTfh6eR, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977885+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977901+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977914+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:39.767192+00:00"}
{"message": "User stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:40.309368+00:00"}
{"message": "Silence timeout", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:46.082361+00:00"}
{"message": "Agent started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:53.329993+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:53.351021+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_QzsLJZ9CtTo3, identity=agent-AJ_6Z4b4mTfh6eR, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:53.351101+00:00"}

The user started speaking/agent started speaking are the event handlers (assistant.on('user_started_speaking') etc)

The silence timeout is a 5 second timeout I have that gets called after 5 seconds of silence. After the agent's silence timeout (note it says user stopped speaking), the agent still didn't know about the user's message.

In the agent playground, the message still appears with an ellipses and when I keep talking it completes it above the agent's response (so I see something like:

Me: Hello...
Agent: Sorry are you there?

(from my timeout logic)

Then I keep talking:

Me: Hello. Yes I'm here.
Agent: Sorry are you there?

Then finally the agent gets the full response:

Agent: Sorry are you there?
Me: Hello. Yes I'm here.
martin-purplefish commented 1 week ago

@theomonnom here's another one:

{"message": "synthesizing agent reply", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_transcript": "school graduate.", "speech_id": "312d2586ff54", "elapsed": 2.702, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:22.675932+00:00"}
{"message": "first LLM token", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "elapsed": 0.443, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:23.077241+00:00"}
{"message": "first LLM token", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "elapsed": 0.498, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:23.174693+00:00"}
{"message": "first TTS frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "elapsed": 1.097, "streamed": true, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.176101+00:00"}
{"message": "started playing the first frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.176384+00:00"}
{"message": "Agent started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.176455+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196528+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196603+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196625+00:00"}
{"message": "Agent state thinking -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196641+00:00"}
{"message": "first TTS frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "elapsed": 1.11, "streamed": true, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.284736+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:28.817733+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "interrupted": false, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:28.820274+00:00"}
{"message": "committed agent speech", "level": "DEBUG", "name": "livekit.agents.pipeline", "agent_transcript": " <agent transcirpt>", "interrupted": false, "speech_id": "bec352999b1a", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056301+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056529+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056566+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056587+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056601+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:30.661539+00:00"}
{"message": "User stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:31.241238+00:00"}
{"message": "validated agent reply", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.787292+00:00"}
{"message": "committed user transcript", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_transcript": "<commited transcript>", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.787785+00:00"}
{"message": "started playing the first frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.998715+00:00"}
{"message": "Agent started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.998815+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020321+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020402+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020434+00:00"}
{"message": "Agent state listening -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020458+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.759998+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "interrupted": false, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.762560+00:00"}
{"message": "committed agent speech", "level": "DEBUG", "name": "livekit.agents.pipeline", "agent_transcript": "<committed transcript>", "interrupted": false, "speech_id": "312d2586ff54", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940394+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940600+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940630+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940649+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940663+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:39.211851+00:00"}
{"message": "User stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:39.813066+00:00"}
{"message": "validated agent reply", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "456a8f63bf42", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:41.086804+00:00"}
martin-purplefish commented 1 week ago

https://github.com/livekit/agents/blob/ab0f446cb0eebd770f820dade1e168cad1503fdd/livekit-agents/livekit/agents/pipeline/pipeline_agent.py#L911

Could it be here? I feel like that 1.0 should be delay (just poking around)

martin-purplefish commented 1 week ago

I added a bunch of logging - it seems to be getting stuck in _commit_user_question_if_needed. I think because it commits before this check?

{"message": "Committing user transcript", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_transcript": "I got a car.", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:19.326403+00:00"}
{"message": "Agent state listening -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:19.534668+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.014278+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.014572+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.515413+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.515586+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.016198+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.016333+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.517238+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.517434+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.017888+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.018884+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.519314+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.519444+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.020008+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.020445+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.520987+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.521194+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.021886+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.022096+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.522955+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.523102+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.024159+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.024238+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.525787+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.525994+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.026618+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.026950+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.527442+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.527633+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.675035+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "6b4f4f352660", "interrupted": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.675941+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676034+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676068+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676086+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676145+00:00"}
{"message": "https://app.purplefish.com:443 \"POST /api/engagements/cm1vaq4c5005qta4wtb4gnssw/messages HTTP/11\" 201 None", "level": "DEBUG", "name": "urllib3.connectionpool", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.856574+00:00"}
{"message": "committed agent speech", "level": "DEBUG", "name": "livekit.agents.pipeline", "agent_transcript": "<agent speech>", "interrupted": false, "speech_id": "6b4f4f352660", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.856988+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857175+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_mSQQdiSUFyHr, identity=agent-AJ_Wk6PhiMXZFnv, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857210+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857237+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857250+00:00"}
{"message": "Silence timeout", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.208893+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.210598+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.210675+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.711407+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.711620+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.987479+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.212177+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.212252+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "045a914759a2", "interrupted": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401319+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401407+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401432+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401446+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401457+00:00"}
theomonnom commented 1 week ago

mmh interesting, tho the playout should have been started before the user question is committed. So you should have at least heard an answer

martin-purplefish commented 1 week ago

I think it has something to do with interruption handling. It seems more likely if I start talking right as the agent is about to finish.

martin-purplefish commented 1 week ago

It seems to be when there are two user messages being committed - the first one gets committed, but the second gets stuck. Check out this (they're reversed so read from the bottom upl

So the "Yeah." which I said first gets committed, but the "Sure that works." hangs until I reply again. And then it gets appended to the outstanding one. So I think it's... only one message gets committed at a time?

Could it be something like having multiple outstanding human transcripts to submit, and maybe the agent starts talking?

{
  "message": "User question not committed: conditions not met",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "user_question": "Yeah.",
  "interrupted": false,
  "user_committed": true,
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:06.116402+00:00"
}
{
  "message": "User question committed successfully",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:05.114075+00:00"
}
{
  "message": "Committing user transcript",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "user_transcript": "Yeah.",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.950239+00:00"
}
{
  "message": "Checking if user question needs to be committed",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "user_question": "Yeah.",
  "interrupted": false,
  "user_committed": false,
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.950128+00:00"
}
{
  "message": "validated agent reply",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "speech_id": "6279ab903224",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.949863+00:00"
}
{
  "message": "Received end of speech, validating reply",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "transcript": "Sure that works.",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.573735+00:00"
}
{
  "message": "User stopped speaking",
  "level": "INFO",
  "name": "livekit.agents.pipeline",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.573610+00:00"
}
{
  "message": "Not validating reply: human is still speaking",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "transcript": "Sure that works.",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.511658+00:00"
}
{
  "message": "User started speaking",
  "level": "INFO",
  "name": "livekit.agents.pipeline",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:02.492610+00:00"
}
martin-purplefish commented 1 week ago

Another hypothesis:

  1. The AI synthesis kicks off but latency is high so it takes a while to get the first token.
  2. The user talks and finishes before the agent speaks.
  3. The agent speaks (in response to the previous message, having not seen what the user said). The user's message in 2 is "stuck open".
  4. The user repeats themselves, and that causes their outstanding message to get committed.
hari01584 commented 1 week ago

@theomonnom As per my debugging, this is difficult to reproduce but a common pattern/issue I think is due to STT (I am using deepgram)

I think it happens due to un-certainty of text to be transcribed, I was able to somewhat consistently reproduce this by saying a confusing word then making loud noises immediately, then my existing sentences is shown as " ...", it keeps being there until I say something else

probably STT being uncertain what next to come? One pattern also.. You can say exact different sentence and your existing word/sentence (before being stuck), can be replaced/changed with better now-spoken version

martin-purplefish commented 1 week ago

@theomonnom what seems to happen is that _play_speech is waiting with an empty transcript, while there's an actual transcript behind it. so until the user says something that causes the empty transcript to get committed, it's waiting in the _speech_q

theomonnom commented 1 week ago

This is weird, it basically means that Deepgram is never sending final transcripts. Only interim transcripts

hari01584 commented 1 week ago

That would make a lot of sense, I will try changing providers to Azure maybe and report If it fixes anything