Open willsmanley opened 1 month ago
Hey, this happens because the example creates the LLMStream manually when handling chat messages, so before_llm_cb is bypassed. See https://github.com/livekit/agents/blob/fe4471aa147346d4357c542b93917605c6700750/examples/voice-assistant/minimal_assistant.py#L63
that makes sense, but the result of that decision is that the assistant has no memory of recent text messages as shown in this example with kitt.
if chat conversational memory is not supported in the same way as voice conversational memory, it seems like that chat option shouldn't be supported
it is a problem that text messages are missing from convo history. I think we should standardize on VoiceAssistant handling Chat automatically once the new Chat protocol makes it in. wdyt @theomonnom @lukasIO @bcherry ?
one quick fix is to not copy the chat context and just append the message to the true context. you'd also have to make sure that you manually invoke before_llm_cb on the application side.
# before
async def answer_from_text(txt: str):
    chat_ctx = assistant.chat_ctx.copy()
    chat_ctx.append(role="user", text=txt)
    stream = assistant.llm.chat(chat_ctx=chat_ctx)
    await assistant.say(stream)

# after
async def answer_from_text(txt: str):
    assistant.chat_ctx.append(role="user", text=txt)
    stream = await before_llm_cb(assistant, assistant.chat_ctx)
    await assistant.say(stream)
however this still does not trigger function calling or interruptions and requires a minor abstraction leak. so it would be ideal to have chat mode more natively supported in the same way voice is.
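to make the before/after difference concrete, here is a minimal sketch that uses stand-in classes instead of the real livekit-agents types. FakeChatContext and FakeAssistant are hypothetical and only mimic the copy()/append() calls used in the snippets above. the point is just that appending to a copy leaves the assistant's own history untouched.

```python
from dataclasses import dataclass, field


@dataclass
class FakeChatContext:
    """Hypothetical stand-in for the chat context; not the livekit API."""

    messages: list = field(default_factory=list)

    def copy(self) -> "FakeChatContext":
        # Shallow copy of the message list, so appends do not leak back.
        return FakeChatContext(messages=list(self.messages))

    def append(self, *, role: str, text: str) -> "FakeChatContext":
        self.messages.append((role, text))
        return self


@dataclass
class FakeAssistant:
    """Hypothetical stand-in that only holds a chat context."""

    chat_ctx: FakeChatContext = field(default_factory=FakeChatContext)


def answer_with_copy(assistant: FakeAssistant, txt: str) -> FakeChatContext:
    # "before": the user message lands only in a throwaway copy.
    ctx = assistant.chat_ctx.copy()
    ctx.append(role="user", text=txt)
    return ctx


def answer_in_place(assistant: FakeAssistant, txt: str) -> FakeChatContext:
    # "after": the user message is recorded in the shared history.
    assistant.chat_ctx.append(role="user", text=txt)
    return assistant.chat_ctx


copied = FakeAssistant()
answer_with_copy(copied, "my name is Sam")

shared = FakeAssistant()
answer_in_place(shared, "my name is Sam")

# Number of messages the assistant itself remembers in each variant.
print(len(copied.chat_ctx.messages), len(shared.chat_ctx.messages))
```

in the copy variant the assistant's own context stays empty, which matches the "no memory of recent text messages" behavior described earlier in the thread.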
it would also be really neat to have the option to disable voice synthesis for use in pure chat mode (aggressively stream transcripts instead of waiting for voice synthesis timings and save on voice synthesis usage). I opened a separate issue for this request since it is related but different scope: https://github.com/livekit/agents/issues/791
before_llm_cb is only called when there is an audio message from the user. If there is a text message, before_llm_cb is not called. It seems like, for consistency purposes, this method should also be called for text messages.
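One way an application could approximate that consistency on its own is to send both kinds of input through a single helper that always runs the hook. The sketch below is self-contained and does not use livekit-agents: handle_user_input, default_chat, and the plain-list chat context are hypothetical, and the hook here takes only the chat context rather than (assistant, chat_ctx). It also assumes the hook may be sync or async and may return None to ask for the default LLM call; that assumption should be checked against the library version in use.

```python
import asyncio
import inspect


async def default_chat(chat_ctx):
    # Hypothetical stand-in for the default LLM call.
    return f"default reply to: {chat_ctx[-1][1]}"


async def handle_user_input(chat_ctx, text, before_llm_cb=None):
    """Record the message and run the hook, whatever the input source was."""
    chat_ctx.append(("user", text))
    result = before_llm_cb(chat_ctx) if before_llm_cb else None
    if inspect.isawaitable(result):
        # Support async hooks as well as plain functions.
        result = await result
    if result is None:
        # Assumption: None means "fall back to the default behavior".
        result = await default_chat(chat_ctx)
    return result


calls = []


def logging_cb(chat_ctx):
    # Example hook: note the latest user message, then defer to the default.
    calls.append(chat_ctx[-1][1])
    return None


async def main():
    ctx = []
    first = await handle_user_input(ctx, "hello from a voice transcript", logging_cb)
    second = await handle_user_input(ctx, "hello from text chat", logging_cb)
    return ctx, first, second


ctx, first, second = asyncio.run(main())
print(calls)
```

Because both inputs go through the same function, the hook sees the text message and the shared context keeps both turns. This does not address function calling or interruptions, which would still need native support.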