livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0

before_llm_cb is not triggered for text messages from user #783

Open willsmanley opened 1 month ago

willsmanley commented 1 month ago

before_llm_cb is only called when there is an audio message from the user

If there is a text message, before_llm_cb is not called.

It seems like, for consistency, this callback should also be invoked for text messages.

theomonnom commented 1 month ago

Hey, that's because the LLMStream is created manually when handling chat messages: see https://github.com/livekit/agents/blob/fe4471aa147346d4357c542b93917605c6700750/examples/voice-assistant/minimal_assistant.py#L63
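To make the split concrete, here is a toy model (plain Python stand-ins, not the real livekit API; all names here are hypothetical) of why the callback fires only for audio: the voice pipeline routes through before_llm_cb internally, while the text example builds the LLM stream by hand and skips it.

```python
# Record which turns actually went through the callback.
calls = []

def before_llm_cb(assistant, chat_ctx):
    # Stand-in for the user-supplied callback.
    calls.append("before_llm_cb")
    return f"stream({chat_ctx})"

def handle_audio_turn(assistant, chat_ctx):
    # Internal voice pipeline: the callback is invoked before the LLM call.
    return before_llm_cb(assistant, chat_ctx)

def handle_text_turn(assistant, chat_ctx):
    # Manual path from the chat example: the stream is created directly,
    # so the callback never runs.
    return f"stream({chat_ctx})"

handle_audio_turn("assistant", ["hi"])
handle_text_turn("assistant", ["hi"])
print(calls)  # the callback ran once, only for the audio turn
```

The asymmetry in this sketch is exactly the behavior reported above: one code path owns the callback, the other bypasses it.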

willsmanley commented 1 month ago
[screenshot: chat with KITT showing the assistant forgetting a recent text message]

that makes sense, but the result of that decision is that the assistant has no memory of recent text messages, as shown in this example with KITT.

if chat conversational memory is not supported in the same way as voice conversational memory, it seems like the chat option shouldn't be supported at all
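The memory loss described here follows directly from the copy-based handler: the user message lands on a throwaway copy of the context, never on the assistant's real one. A minimal sketch, using a hypothetical plain-Python ChatContext stand-in rather than the real livekit class:

```python
from dataclasses import dataclass, field

@dataclass
class ChatContext:
    """Toy stand-in for the assistant's chat context."""
    messages: list = field(default_factory=list)

    def copy(self) -> "ChatContext":
        # Shallow copy of the message list, as the example handler does.
        return ChatContext(list(self.messages))

    def append(self, role: str, text: str) -> "ChatContext":
        self.messages.append({"role": role, "text": text})
        return self

true_ctx = ChatContext()  # the assistant's "real" history

# The "before" handler pattern: append to a copy, not the original.
ctx_copy = true_ctx.copy()
ctx_copy.append(role="user", text="what's my name?")

print(len(ctx_copy.messages))  # 1 -- the copy saw the message
print(len(true_ctx.messages))  # 0 -- the real history never did
```

On the next voice turn the assistant reads from its true context, so every text message handled this way is invisible to it.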

davidzhao commented 1 month ago

it is a problem that text messages are missing from convo history. I think we should standardize on VoiceAssistant handling Chat automatically once the new Chat protocol makes it in. wdyt @theomonnom @lukasIO @bcherry ?

willsmanley commented 1 month ago

one quick fix is to not copy the chat context and just append the message to the true context. you'd also have to make sure that you manually invoke before_llm_cb on the application side.

# before
async def answer_from_text(txt: str):
    chat_ctx = assistant.chat_ctx.copy()
    chat_ctx.append(role="user", text=txt)
    stream = assistant.llm.chat(chat_ctx=chat_ctx)
    await assistant.say(stream)

# after
async def answer_from_text(txt: str):
    assistant.chat_ctx.append(role="user", text=txt)
    stream = await before_llm_cb(assistant, assistant.chat_ctx)
    await assistant.say(stream)

however, this still does not trigger function calling or interruptions, and it requires a minor abstraction leak, so it would be ideal to have chat mode supported natively in the same way voice is.

it would also be really neat to have the option to disable voice synthesis for use in pure chat mode (stream transcripts aggressively instead of waiting on voice synthesis timings, and save on synthesis usage). I opened a separate issue for this request since it is related but different in scope: https://github.com/livekit/agents/issues/791