Description:
Currently, the implementation for conversational RAG compresses all past chat messages into a single user message when passing them to the OpenAIChatGenerator or similar components. This differs from how chat generators are typically used: models generally perform better when each user and assistant turn arrives as a separate message, and preserving that alternating structure helps the model maintain conversational context.
Feature Request:
Allow the full list of past chat messages to be passed to the LLM instead of compressing them into a single user message, so that each user query and assistant response remains a separate ChatMessage object. This preserves context and lets the model follow the conversational flow more naturally.
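To make the difference concrete, here is a minimal sketch of the two behaviors. It uses plain dicts in the OpenAI message format to stand in for Haystack's ChatMessage objects, and the helper names and history format are illustrative assumptions, not the actual implementation:

```python
# Sketch: current behavior folds the whole history into one user message,
# while the proposed behavior keeps each turn as its own message.
# Plain dicts stand in for Haystack ChatMessage objects; names are illustrative.

def compress_history(history, query):
    """Current behavior: serialize the history into a single user message."""
    transcript = "\n".join(f"{t['role']}: {t['content']}" for t in history)
    return [{"role": "user", "content": f"{transcript}\nuser: {query}"}]

def preserve_history(history, query):
    """Proposed behavior: append the new query, keeping turns separate."""
    return history + [{"role": "user", "content": query}]

history = [
    {"role": "user", "content": "What is RAG?"},
    {"role": "assistant", "content": "Retrieval-augmented generation."},
]

print(len(compress_history(history, "Why use it?")))  # one merged message
print(len(preserve_history(history, "Why use it?")))  # three separate messages
```

With the preserved list, the generator receives alternating user/assistant entries, which is the structure chat models are trained on.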