livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0

elevenlabs-plugin: "cloned" + "professional" voices not working #286

Closed andrewjhogue closed 3 months ago

andrewjhogue commented 6 months ago

With the minimal_assistant.py example, "cloned" and "professional" voices don't generate any audio output. Instead, the assistant seems to hang for long periods of time.

I'm seeing LLM chat completion requests complete successfully, which suggests that STT and OpenAI are working, but there is no audio output.

I'm using a MacBook with M1 / Ventura 13.0.1, running Python 3.12.

elevenlabs-plugin is working correctly with "premade" voices, and also with OpenAI's TTS (though OpenAI's TTS doesn't support streaming).

Examples of non-working code:

Voice = elevenlabs.Voice(
    id=MY_VOICE_ID,
    name="Voice Name",
    category="professional",
    settings=elevenlabs.VoiceSettings(
        stability=0.60, similarity_boost=1.0
    )
)

assistant = VoiceAssistant(
  vad=silero.VAD(),
  stt=deepgram.STT(),
  llm=openai.LLM(),
  tts=elevenlabs.TTS(voice=Voice),
  chat_ctx=initial_ctx,
)
assistant.start(ctx.room)

As well as:

Voice = elevenlabs.Voice(
    id=MY_VOICE_ID,
    name="Voice Name",
    category="cloned",
    settings=elevenlabs.VoiceSettings(
        stability=0.60, similarity_boost=1.0
    )
)

assistant = VoiceAssistant(
  vad=silero.VAD(),
  stt=deepgram.STT(),
  llm=openai.LLM(),
  tts=elevenlabs.TTS(voice=Voice),
  chat_ctx=initial_ctx,
)
assistant.start(ctx.room)

What I'm seeing in logs from agent running locally:

2024-05-14 01:56:21,398 INFO  httpx  HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"      job_id=AJ_oiddbfaSEWxh pid=56250
2024-05-14 01:56:22,141 INFO  httpx  HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"      job_id=AJ_oiddbfaSEWxh pid=56250
2024-05-14 01:56:23,720 ERROR  livekit.plugins.elevenlabs  11labs connection failed
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/livekit/plugins/elevenlabs/tts.py", line 365, in _run_ws
    await asyncio.gather(send_task(), recv_task())
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/livekit/plugins/elevenlabs/tts.py", line 340, in recv_task
    raise Exception("11labs connection closed unexpectedly")
Exception: 11labs connection closed unexpectedly
      job_id=AJ_oiddbfaSEWxh pid=56250
2024-05-14 01:56:32,020 INFO  httpx  HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"      job_id=AJ_oiddbfaSEWxh pid=56250
2024-05-14 01:56:33,258 ERROR  livekit.plugins.elevenlabs  11labs connection failed
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/livekit/plugins/elevenlabs/tts.py", line 365, in _run_ws
    await asyncio.gather(send_task(), recv_task())
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/livekit/plugins/elevenlabs/tts.py", line 340, in recv_task
    raise Exception("11labs connection closed unexpectedly")
Exception: 11labs connection closed unexpectedly
      job_id=AJ_oiddbfaSEWxh pid=56250

Happy to share more context as needed - awesome project!

keepingitneil commented 6 months ago

Which price tier of 11 labs are you using? Looks like they changed things up and PCM is only supported for pro tier: https://elevenlabs.io/docs/api-reference/streaming

andrewjhogue commented 6 months ago

@keepingitneil thanks for the follow-up - I'm on 11 Labs pro tier.

I've been able to get this working - but currently all of the VoiceSettings fields need to be set for it to work properly.

Non-working configuration:

Voice = elevenlabs.Voice(
    id=MY_VOICE_ID,
    name="Voice Name",
    category="professional",
    settings=elevenlabs.VoiceSettings(
        stability=0.60,
        similarity_boost=1.0
    )
)

Working configuration:

Voice = elevenlabs.Voice(
    id=MY_VOICE_ID,
    name="Voice Name",
    category="professional",
    settings=elevenlabs.VoiceSettings(
        stability=0.60,
        similarity_boost=1.0,
        style=0.1,
        use_speaker_boost=True
    )
)

keepingitneil commented 3 months ago

Thanks for investigating and finding the cause. Closing this issue and adding a ticket on our end to make sure we have either sane defaults or required params for this path.
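
For anyone hitting the same problem in the meantime, the workaround from this thread (always passing all four VoiceSettings fields) can be wrapped in a small helper. This is only a sketch and not the plugin's code: `complete_voice_settings`, `REQUIRED_FIELDS`, and `FALLBACKS` are hypothetical names, and the fallback values are assumptions chosen for illustration, not documented ElevenLabs or plugin defaults.

```python
from typing import Any

# Fields the thread shows being set in both the working and non-working configs.
REQUIRED_FIELDS = ("stability", "similarity_boost")

# Assumed fallback values for the two fields that were missing in the
# non-working config. These are illustrative, not documented defaults.
FALLBACKS: dict[str, Any] = {"style": 0.0, "use_speaker_boost": True}


def complete_voice_settings(**settings: Any) -> dict[str, Any]:
    """Return a settings dict with all four fields populated.

    Raises ValueError if a required field is missing; fills in the
    assumed fallbacks for any optional field the caller left out.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in settings]
    if missing:
        raise ValueError(f"missing required voice settings: {', '.join(missing)}")
    # Caller-supplied values take precedence over the fallbacks.
    return {**FALLBACKS, **settings}


if __name__ == "__main__":
    print(complete_voice_settings(stability=0.60, similarity_boost=1.0))
```

The resulting dict could then be unpacked into the constructor used above, e.g. `elevenlabs.VoiceSettings(**complete_voice_settings(stability=0.60, similarity_boost=1.0))`, so that no field is left unset.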