fawazPhc opened 2 weeks ago
Hey there, @fawazPhc! How can I assist you today?

The issue you're encountering with the `ReActAgent` hallucinating during casual conversations and confusing retrieved information with its internal knowledge can be addressed by refining the agent's configuration and its handling of tool responses. Here are some steps to mitigate this issue:
1. **Adjust the Ollama Model Parameters**: Fine-tune the parameters of the `Ollama` model to improve its handling of casual conversations. Specifically, you can adjust the `temperature` and `context_window` parameters to make the model's responses more deterministic and to maintain context over longer conversations.

```python
llm = Ollama(model="llama3", request_timeout=1000000, temperature=0.5, context_window=1024)
```
2. **Use Node Postprocessors**: Implement node postprocessors to filter and validate the retrieved nodes before they are synthesized into a response. This can help reduce hallucinations by ensuring only relevant information is used. Note that postprocessors are applied at the query-engine level rather than by the retriever itself:

```python
from llama_index.core.postprocessor.types import BaseNodePostprocessor
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

class CustomNodePostprocessor(BaseNodePostprocessor):
    def _postprocess_nodes(self, nodes, query_bundle=None):
        # Implement custom filtering logic here
        return [node for node in nodes if self.is_relevant(node)]

    def is_relevant(self, node):
        # Define relevance criteria
        return True

retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=5)
query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[CustomNodePostprocessor()]
)

retriever_2 = VectorIndexRetriever(index=vector_index_2, similarity_top_k=5)
query_engine_2 = RetrieverQueryEngine.from_args(
    retriever_2, node_postprocessors=[CustomNodePostprocessor()]
)
```
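The `is_relevant` stub above can be made concrete. Below is a minimal, self-contained sketch of one possible relevance criterion (a similarity-score cutoff plus keyword overlap). The dict-based node shape is a stand-in for LlamaIndex's `NodeWithScore`, so this runs without the library and is an assumption for illustration, not its API:

```python
# Hypothetical standalone sketch of a relevance filter: keep retrieved
# nodes only if their similarity score clears a cutoff AND they share
# at least one word with the query. The dict shape stands in for
# llama_index's NodeWithScore.

def is_relevant(node: dict, query: str, min_score: float = 0.7) -> bool:
    """Score cutoff plus naive keyword-overlap check."""
    if node.get("score", 0.0) < min_score:
        return False
    query_words = set(query.lower().split())
    node_words = set(node.get("text", "").lower().split())
    return bool(query_words & node_words)

def postprocess_nodes(nodes: list, query: str) -> list:
    return [n for n in nodes if is_relevant(n, query)]

nodes = [
    {"text": "company xyz revenue grew 12% in 2023", "score": 0.82},
    {"text": "greetings and small talk examples", "score": 0.35},
]
print(postprocess_nodes(nodes, "what was xyz revenue growth"))
# keeps only the XYZ node; the small-talk node fails the score cutoff
```

In a real postprocessor the same logic would live inside `_postprocess_nodes`, reading `node.score` and `node.text` from `NodeWithScore` objects.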
3. **Separate Contexts for Different Tools**: Ensure that the agent maintains separate contexts for different tools to avoid confusion between casual conversation and data-specific queries. This can be achieved by using a context-augmented agent.

```python
from llama_index.core import Document, VectorStoreIndex
from llama_index.agent.openai_legacy import ContextRetrieverOpenAIAgent

context_texts = [
    "Casual conversation context: Greetings, small talk, etc.",
    "Data-specific context: Information about company XYZ, etc.",
]
context_docs = [Document(text=t) for t in context_texts]
context_index = VectorStoreIndex.from_documents(context_docs)

context_agent = ContextRetrieverOpenAIAgent.from_tools_and_retriever(
    tools,
    context_index.as_retriever(similarity_top_k=1),
    verbose=True,
)
```
4. **Verbose Mode for Debugging**: Enable verbose mode to get detailed logs of how the agent processes queries and selects tools. This can help identify and fix issues related to tool selection and response generation.

```python
agent = ReActAgent.from_tools(
    tools=tools,
    llm=llm,
    verbose=True,
)
```
5. **Handle Tool Responses Carefully**: Ensure that the agent correctly handles the responses from the tools and integrates them into its reasoning process without confusing them with its internal knowledge. For reference, this is the kind of parsing `ReActAgent` performs internally when extracting a reasoning step from the model output:

```python
def _extract_reasoning_step(self, output, is_streaming=False):
    if output.message.content is None:
        raise ValueError("Got empty message.")
    message_content = output.message.content
    current_reasoning = []
    try:
        reasoning_step = self._output_parser.parse(message_content, is_streaming)
    except BaseException as exc:
        raise ValueError(f"Could not parse output: {message_content}") from exc
    if self._verbose:
        print_text(f"{reasoning_step.get_content()}\n", color="pink")
    current_reasoning.append(reasoning_step)
    if reasoning_step.is_done:
        return message_content, current_reasoning, True
    reasoning_step = cast(ActionReasoningStep, reasoning_step)
    if not isinstance(reasoning_step, ActionReasoningStep):
        raise ValueError(f"Expected ActionReasoningStep, got {reasoning_step}")
    return message_content, current_reasoning, False
```
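When that parse fails (common with lightweight local models that reply in free text instead of the Thought/Action format), one pragmatic option is to treat the raw output as the final answer rather than raising. A hypothetical, self-contained sketch follows; the `parse_reasoning_step` stub stands in for the agent's real output parser and is not a LlamaIndex API:

```python
# Hypothetical sketch: fall back to the raw model output when a
# ReAct-style output parser cannot extract a Thought/Action step.
# parse_reasoning_step is a toy stand-in for the agent's parser.

def parse_reasoning_step(message: str) -> dict:
    """Toy parser: require an explicit 'Action:' or 'Answer:' marker."""
    if "Action:" not in message and "Answer:" not in message:
        raise ValueError(f"Could not parse output: {message}")
    return {"content": message, "is_done": "Answer:" in message}

def extract_or_fallback(message: str) -> dict:
    """Try to parse; on failure, treat the raw text as a final answer."""
    try:
        return parse_reasoning_step(message)
    except ValueError:
        # Lightweight local models often chat instead of emitting the
        # ReAct format -- return the text as the final response.
        return {"content": message, "is_done": True}

print(extract_or_fallback("Hi! I'm doing great, thanks for asking."))
# no Action/Answer marker, so the raw text becomes the final answer
```

The real `_extract_reasoning_step` raises on unparseable output, so a fallback like this would have to wrap the agent's step handling at a higher level.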
By following these steps, you can improve the handling of casual conversations and reduce hallucinations in the `ReActAgent` [1][2][3].
To continue talking to Dosu, mention @dosu.
+1 on this. I had to talk to the raw LLM and add my previous messages to the raw query string instead of using `ReActAgent`, since it kept doing the weirdest, most unnecessary looping in response to a simple "Hi".

To add to this, the LLM handles the message just fine on its own, so it looks like it's the `ReActAgent` that's causing the issues.
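The workaround described above (querying the raw LLM and folding previous messages into the query string) can be sketched as a small helper. This is a self-contained illustration, so the actual LLM call, e.g. `llm.complete(prompt)` on the `Ollama` instance, is left out:

```python
# Sketch of the workaround above: keep the chat history yourself and
# prepend it to each raw query, bypassing the ReActAgent loop entirely.
# The role/text tuple format is an assumption for illustration.

def build_prompt(history: list, user_message: str) -> str:
    """Fold (role, text) turns into a single raw prompt string."""
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {user_message}")
    lines.append("assistant:")
    return "\n".join(lines)

history = [("user", "Hi"), ("assistant", "Hello! How can I help?")]
prompt = build_prompt(history, "How are you doing?")
print(prompt)
# user: Hi
# assistant: Hello! How can I help?
# user: How are you doing?
# assistant:
```

After each turn, the model's reply would be appended to `history` before building the next prompt, which is how the commenter kept context without the agent loop.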
Question Validation
Question
I'm new to creating a custom ReAct agent to query my data using the RAG technique. It turns out that my locally running Llama2 7B model (using Ollama) does fine with questions about my data (I added the retriever as a QueryEngine tool), but for casual conversations it ends up calling the tool anyway and hallucinates. To overcome this, I created a separate index for casual conversations and added its retriever as another query engine tool. As expected, when I start a casual conversation it does call the specified tool, but it later hallucinates by talking to itself (at least that's what I think is happening). Is it because I'm using a lightweight local model, or do I have to change my approach? Here is my code. Since the ReAct agent used to call tools whenever I started a casual conversation, I had to add another index of casual conversations to the Pinecone DB. The data I used is also attached.

I want the model to use its pre-built-in knowledge for things like casual conversation and reply based on that instead of relying on a tool; even when it does rely on the tool, it doesn't use the information I provided. It's like the agent is using the information it retrieves as part of its internal thinking process and mistakenly treating it as a new input rather than forming a final response. You can view the attached image for more clarity. The agent is able to respond to a "hi" using the tool, but it hallucinates when I ask "how are you doing?"