livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0

Azure TTS Crash #729

Closed Test-isom closed 1 week ago

Test-isom commented 1 month ago

{"message": "Error in main_task\nTraceback (most recent call last):\n File \"/root/pythonenv/enve/lib/python3.10/site-packages/livekit/agents/utils/log.py\", line 16, in async_fn_logs\n return await fn(*args, **kwargs)\n File \"/root/pythonenv/enve/lib/python3.10/site-packages/livekit/plugins/azure/tts.py\", line 90, in main_task\n raise ValueError(\nValueError: failed to synthesize audio: ResultReason.Canceled <azure.cognitiveservices.speech.SpeechSynthesisCancellationDetails object at 0x7d1d401cee90>", "level": "ERROR", "pid": 772600, "job_id": "AJ_NtiqPibwJ4no", "timestamp": "2024-09-09T09:52:06.540702+00:00"}

{"message": "Error in str_synthesis_task\nTraceback (most recent call last):\n File \"/root/pythonenv/enve/lib/python3.10/site-packages/livekit/agents/utils/log.py\", line 16, in async_fn_logs\n return await fn(*args, **kwargs)\n File \"/root/pythonenv/enve/lib/python3.10/site-packages/livekit/agents/voice_assistant/agent_output.py\", line 222, in str_synthesis_task\n handle.tts_forwarder.mark_audio_segment_end()\n File \"/root/pythonenv/enve/lib/python3.10/site-packages/livekit/agents/transcription/tts_forwarder.py\", line 185, in mark_audio_segment_end\n raise RuntimeError(\"mark_audio_segment_end called before any push_audio\")\nRuntimeError: mark_audio_segment_end called before any push_audio", "level": "ERROR", "pid": 772600, "job_id": "AJ_NtiqPibwJ4no", "timestamp": "2024-09-09T09:52:06.541356+00:00"}

AphinityApp commented 1 month ago

Seeing the same issue. Likely an API change on Azure's end: it started happening yesterday with no code changes on our side. Other TTS providers are still working fine.

davidzhao commented 1 month ago

I think this happens when the Azure TTS service does not return any audio. We have a fix in #730 that prevents the exception in that case. However, the root cause is still the fact that Azure TTS didn't return any audio frames.
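
The first traceback only shows the default repr of the cancellation object, so the actual reason Azure canceled the request is not visible in the log. One way to surface it might be to format the details before raising. This is a minimal sketch, not the plugin's code: `describe_cancellation` is a hypothetical helper, and it assumes the result object exposes `reason` and `cancellation_details` (with `reason` and `error_details`), as the Azure Speech SDK's synthesis result does. The stand-in object at the bottom is for illustration only.

```python
from types import SimpleNamespace


def describe_cancellation(result) -> str:
    """Return a readable summary of a canceled synthesis result.

    Hypothetical helper. `result` is assumed to look like the Azure Speech
    SDK's synthesis result: it has `reason` and `cancellation_details`,
    and the details expose `reason` and `error_details`.
    """
    details = getattr(result, "cancellation_details", None)
    if details is None:
        # Nothing more specific is available, so report the top-level reason.
        return f"synthesis ended with reason={result.reason} (no cancellation details)"
    return (
        f"synthesis canceled: reason={details.reason}, "
        f"error_details={details.error_details!r}"
    )


# Stand-in object for illustration only; a real run would pass the SDK's result.
fake_result = SimpleNamespace(
    reason="Canceled",
    cancellation_details=SimpleNamespace(
        reason="Error",
        error_details="placeholder error text",
    ),
)
print(describe_cancellation(fake_result))
```

Including this string in the raised `ValueError` message would make it easier to tell whether the cancellation came from authentication, quota, or the request content.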