agilebean opened 3 months ago
The `LLMMessagesFrame` doesn't know or care what's inside its `messages` list of dicts. How are you using the `LangchainProcessor`, and what does the chain look like?
Thanks for asking. My chain is fairly standard, in the sense that I haven't yet tried extending the config fields. It looks as follows:
```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, filter_messages
from langchain_core.runnables.history import RunnableWithMessageHistory

exclude_metadata = filter_messages(include_types=[HumanMessage, AIMessage, SystemMessage])
prompted_chat_model = constants.PROMPT_TEMPLATE | exclude_metadata | chat_model
...
chain_runnable = RunnableWithMessageHistory(
    prompted_chat_model,
    lambda session_id: get_session_history(database_label, session_id, message_store),
    history_messages_key="chat_history",
    input_messages_key="input",
    **memory_kwargs,
)
```
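For reference, I invoke it in the usual `RunnableWithMessageHistory` way (illustrative values):

```python
response = chain_runnable.invoke(
    {"input": "Hello"},
    config={"configurable": {"session_id": "session-1"}},
)
```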
I feed the system message as:
```python
system_message = [
    {
        "role": "system",
        "content": SYSTEM_PROMPT,
    }
]

await task.queue_frame(LLMMessagesFrame(system_message))
```
So, IMHO, the root cause of the above message being stored as a `HumanMessage` is this `LangchainProcessor` code:
```python
async def process_frame(self, frame: Frame, direction: FrameDirection):
    await super().process_frame(frame, direction)

    if isinstance(frame, LLMMessagesFrame):
        # Messages are accumulated by the `LLMUserResponseAggregator` in a list of messages.
        # The last one by the human is the one we want to send to the LLM.
        logger.debug(f"Got transcription frame {frame}")
        text: str = frame.messages[-1]["content"]
        await self._ainvoke(text.strip())
    else:
        await self.push_frame(frame, direction)
```
As can be seen, it retrieves only `"content"`, not `"role"`, to send to the LLM. Because only the bare text is sent without a role, I guess Langchain defaults to `HumanMessage`, which would explain the result.
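A quick way to check this behavior (my own minimal sketch, assuming a recent `langchain_core`):

```python
from langchain_core.messages import convert_to_messages

# A bare string is coerced to a HumanMessage...
print(convert_to_messages(["hello"]))
# -> [HumanMessage(content='hello')]

# ...while a (role, content) pair keeps its role.
print(convert_to_messages([("system", "hello")]))
# -> [SystemMessage(content='hello')]
```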
My understanding is that Langchain stores a message as a `SystemMessage` if its role is specified as `"system"`. I found three places that support this thought:

1. I searched the latest Langchain docs and found this chapter, which shows a message passed with the system role:
```python
prompt2 = ChatPromptTemplate.from_messages(
    [
        ("system", "really good ai"),
        ("human", "{input}"),
        ("ai", "{ai_output}"),
        ("human", "{input2}"),
    ]
)

fake_llm = RunnableLambda(lambda prompt: "i am good ai")
chain = prompt1.assign(ai_output=fake_llm) | prompt2 | fake_llm
```
2. In the older Langchain v0.1 docs, there is this hint:
```python
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful AI bot. Your name is Bob."},
        {"role": "user", "content": "Hello, how are you doing?"},
        {"role": "assistant", "content": "I'm doing well, thanks!"},
        {"role": "user", "content": "What is your name?"},
    ],
)
```
3. This GitHub issue about storing a `SystemMessage` shows another method.
In line with the above considerations, I think the invoke method should receive a dictionary containing the role. Something along the following lines:
```python
async def process_frame(self, frame: Frame, direction: FrameDirection):
    ...
    text: str = frame.messages[-1]["content"]
    role: str = frame.messages[-1]["role"]
    message_dict = {"role": role, "input": text.strip()}
    await self._ainvoke(message_dict)
```
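For illustration, the companion change in `_ainvoke` could then map the role onto Langchain's message classes. A minimal, hypothetical sketch (my own, not pipecat code):

```python
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, SystemMessage

# Hypothetical mapping from OpenAI-style roles to Langchain message classes
ROLE_TO_MESSAGE = {
    "system": SystemMessage,
    "assistant": AIMessage,
    "user": HumanMessage,
}

def to_langchain_message(message_dict: dict) -> BaseMessage:
    """Convert {'role': ..., 'input': ...} into the matching Langchain message."""
    cls = ROLE_TO_MESSAGE.get(message_dict["role"], HumanMessage)
    return cls(content=message_dict["input"])
```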
Understood! Right now the `LangchainProcessor` takes a single input and populates the `ChatPromptTemplate` you supplied with your chain. That is why roles do not make a lot of sense here. You might define an input key `transcript` and use it like this in the prompt with your chain:

```python
messages = [("system", "Here's what the user said: {transcript}")]
```

The input key defaults to `input`, so more likely it would be:

```python
messages = [("system", "Be nice"), ("human", "{input}")]
```
So I'm not quite sure yet how to offer the possibility of pushing a list of messages and basically ignoring the chain altogether.
> Understood! Right now the `LangchainProcessor` takes a single input and populates the `ChatPromptTemplate` you supplied with your chain.

Yes, I understood from the code that it takes the last message and transmits the `"content"` property in this line:

```python
text: str = frame.messages[-1]["content"]
```
> That is why roles do not make a lot of sense here.

From the current code, you are right, but this is very limiting. The whole motivation for using `LangchainProcessor` is to use a `RunnableWithMessageHistory`, which has the advantage of managing the message history automatically and of allowing the system prompt to be changed without redefining the chain.

> You might define an input key `transcript` and use it like this in the prompt with your chain:
> `messages = [("system", "Here's what the user said: {transcript}")]`

Syntactically, this is exactly the idea. The only difference is that you are inserting the user's transcript. If we want to change the system prompt, it would be defined, semantically differently, as:

```python
messages = [("system", "{system_prompt}")]
```

> The input key defaults to `input`, so more likely it would be:
> `messages = [("system", "Be nice"), ("human", "{input}")]`
Yes, this is unfortunately the case with the current code, as it only retrieves the content with:

```python
text: str = frame.messages[-1]["content"]
```

However, if it also retrieved the `role` attribute, the sent message could be understood as a system message (and thus mapped to Langchain's `SystemMessage` class). Tremendous benefit. So that's why I suggested that `process_frame` should be extended with:

```python
role: str = frame.messages[-1]["role"]
message_dict = {"role": role, "input": text.strip()}
```
> So I'm not quite sure yet how to offer the possibility of pushing a list of messages and basically ignoring the chain altogether.

Can you specify what you mean by "ignoring the chain altogether"? I suppose you don't mean ignoring the message history. Or did you mean pushing a list of messages without resending the previous chat history?
This makes sense, of course. And wasn't this Langchain's whole motivation for deprecating `ConversationChain` in favor of `RunnableWithMessageHistory`?

As far as I understood from an engineer at Langchain who urged me to use `RunnableWithMessageHistory`, one of its many advantages is that it self-manages the message history via the `get_session_history` input. If you print it, you can see that all previous messages are still contained, although you only sent one message.
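For reference, a minimal in-memory `get_session_history` along the lines of the Langchain docs (my simplified sketch; the actual signature in my chain also takes a database label and message store):

```python
from langchain_core.chat_history import InMemoryChatMessageHistory

_store: dict[str, InMemoryChatMessageHistory] = {}

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    # RunnableWithMessageHistory calls this per session and appends
    # each turn's messages to the returned history automatically.
    if session_id not in _store:
        _store[session_id] = InMemoryChatMessageHistory()
    return _store[session_id]
```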
Conclusion: I would appreciate it if you could retrieve and send the `role` attribute :)
I just tried my proposed solution but got this warning:

```
WARNING: Error in RootListenersTracer.on_chain_end callback: ValueError('Expected str, BaseMessage, List[BaseMessage], or Tuple[BaseMessage]. Got {\'role\': \'system\', \'input\': \'Role:\\nYou are an experienced
```

So it appears that if an `LLMMessagesFrame` contains a message with role `system`, you cannot invoke it this way.
I found out that the root cause is in the `LangchainProcessor` code:
```python
async def process_frame(self, frame: Frame, direction: FrameDirection):
    await super().process_frame(frame, direction)

    if isinstance(frame, LLMMessagesFrame):
        # Messages are accumulated by the `LLMUserResponseAggregator` in a list of messages.
        # The last one by the human is the one we want to send to the LLM.
        logger.debug(f"Got transcription frame {frame}")
        text: str = frame.messages[-1]["content"]
        await self._ainvoke(text.strip())

async def _ainvoke(self, text: str):
    logger.debug(f"Invoking chain with {text}")
    ...
    try:
        async for token in self._chain.astream(
            {self._transcript_key: text},
            config={"configurable": {"session_id": self._participant_id}},
        ):
```
So, in conclusion, the current code assumes only one way to feed Langchain's `astream()`: with a text passed in an `LLMMessagesFrame`.

The goal is to allow the other kinds of inputs to `astream()` that are basically the reason someone would reach for Langchain in the first place. One important example from the Langchain API documentation is as follows:
```python
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You're an assistant who's good at {ability}"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

chain = prompt | ChatAnthropic(model="claude-2")

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history=get_session_history,
    input_messages_key="question",
    history_messages_key="history",
    ...
)

with_message_history.invoke(
    {"ability": "math", "question": "What does cosine mean?"},
    ...
)
```
This example demonstrates one of many use cases where the input is not a bare text but a dictionary of input keys (`question`) and placeholders (`ability`) contained in the `ChatPromptTemplate`.

I would appreciate it if pipecat could relax the current restrictions of `LangchainProcessor` to allow the full functionality of Langchain.
Can somebody please make this fix?
This is a really important fix for enabling the use of Langchain's `RunnableWithMessageHistory` with a system prompt.

Please, pipecat contributors, be aware that the current code does not submit a system prompt with every user prompt, which is crucial for many use cases!
In `langchain.py`, change the current code:

```python
async for token in self._chain.astream(
    {self._transcript_key: text},
    config=configurable,
):
```

to:

```python
async for token in self._chain.astream(
    {self._transcript_key: text, "system": self._system_prompt},
    config=configurable,
):
```

with the system prompt set in `LangchainProcessor.__init__`:

```python
self._system_prompt: str | None = system_prompt
```
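For this to work, the chain's prompt needs a matching placeholder, for example (my own sketch, not pipecat code):

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# The "system" key passed to astream() fills the {system} placeholder below.
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "{system}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)
```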
I have tested this extensively, and it works well. (You can verify the fix by setting the system prompt to: "Start every sentence with beep-beep.")

I would be grateful if the pipecat contributors could acknowledge the many different use cases for which Langchain is chosen. Here, transmitting the system prompt with every message and, more importantly, changing the system prompt dynamically are the two crucial use cases for using Langchain.
Context

Using `LangchainProcessor` as LLM wrapper, and `InMemoryChatMessageHistory` (extends `BaseChatMessageHistory`) as message store.

Current behavior

An `LLMMessagesFrame` with system role is received in the chat history as a `HumanMessage`.

Expected behavior:

An `LLMMessagesFrame` with system role should be converted to a `SystemMessage` (see `SystemMessage` in the Langchain docs).