Yeah, that's odd. Can you share your assistant code with me? I know you said you were using RAG, but I need to be able to replicate this to debug it. Or just list the steps I can try so I can debug from there?
Thanks @jacobweiss2305. The odd thing is it's very difficult to reproduce. I have created a custom FastAPI route for the Auto-RAG cookbook example with structured output:
```python
from typing import Optional, Type

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

# phidata imports, as in the Auto-RAG cookbook
from phi.assistant import Assistant
from phi.embedder.openai import OpenAIEmbedder
from phi.knowledge import AssistantKnowledge
from phi.llm.openai import OpenAIChat
from phi.storage.assistant.postgres import PgAssistantStorage
from phi.tools.duckduckgo import DuckDuckGo
from phi.vectordb.pgvector import PgVector2

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"  # pgvector connection (cookbook default)
router = APIRouter()


def get_auto_rag_assistant(
    llm_model: str = "gpt-4-turbo",
    user_id: Optional[str] = None,
    run_id: Optional[str] = None,
    response_model: Optional[Type[BaseModel]] = None,  # schema for structured output
    debug_mode: bool = True,
) -> Assistant:
    """Get an Auto RAG Assistant."""
    return Assistant(
        name="auto_rag_assistant",
        run_id=run_id,
        user_id=user_id,
        llm=OpenAIChat(model=llm_model),
        storage=PgAssistantStorage(table_name="auto_rag_assistant_openai", db_url=db_url),
        output_model=response_model,
        knowledge_base=AssistantKnowledge(
            vector_db=PgVector2(
                db_url=db_url,
                collection="auto_rag_documents_openai",
                embedder=OpenAIEmbedder(model="text-embedding-3-small", dimensions=1536),
            ),
            # 3 references are added to the prompt
            num_documents=3,
        ),
        description="You are a helpful Assistant called 'AutoRAG' and your goal is to assist the user in the best way possible.",
        instructions=[
            "Given a user query, first ALWAYS search your knowledge base using the `search_knowledge_base` tool to see if you have relevant information.",
            "If you don't find relevant information in your knowledge base, use the `duckduckgo_search` tool to search the internet.",
            "If you need to reference the chat history, use the `get_chat_history` tool.",
            "If the user's question is unclear, ask clarifying questions to get more information.",
            "Carefully read the information you have gathered and provide a clear and concise answer to the user.",
            "Do not use phrases like 'based on my knowledge' or 'depending on the information'.",
        ],
        # Show tool calls in the chat
        show_tool_calls=True,
        # This setting gives the LLM a tool to search the knowledge base for information
        search_knowledge=True,
        # This setting gives the LLM a tool to get chat history
        read_chat_history=True,
        tools=[DuckDuckGo()],
        # This setting tells the LLM to format messages in markdown
        markdown=True,
        # Adds chat history to messages
        add_chat_history_to_messages=True,
        add_datetime_to_instructions=True,
        debug_mode=debug_mode,
    )
```
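For what it's worth, the structured-output path can be exercised without FastAPI or storage. A minimal sketch along these lines (the model and question are placeholders) is what I'd use to narrow it down:

```python
# Minimal sketch: isolate output_model from the FastAPI/storage layers.
# The question is a placeholder; swap in one that triggers the truncation.
from pydantic import BaseModel
from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat


class MiniResponse(BaseModel):
    question: str
    answer: str


mini = Assistant(
    llm=OpenAIChat(model="gpt-4-turbo"),
    output_model=MiniResponse,
)
result = mini.run("Summarize what pgvector is.", stream=False)
print(type(result), result)  # expect a fully parsed, complete MiniResponse
```

The FastAPI wiring around the assistant is below: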
```python
class Question(BaseModel):
    text: str
    user_id: str
    llm_model: str
    thread_id: Optional[str] = None


class Assistant_Response(BaseModel):
    question: str
    answer: str
    table: Optional[str] = None


class Answer(Assistant_Response):
    threadId: str


@router.post("/get-result", response_model=Answer)
async def ask_question(question: Question):
    try:
        response = get_answer_from_assistant(question)
        return response
    except ValueError as ve:
        raise HTTPException(status_code=422, detail=str(ve))
    except Exception as e:
        print(e)
        raise HTTPException(status_code=500, detail=str(e))


def get_answer_from_assistant(question: Question) -> Answer:
    assistant_kwargs = {
        "user_id": question.user_id,
        "llm_model": question.llm_model,
        "response_model": Assistant_Response,
    }
    if question.thread_id is None:
        # New conversation: create a run and use its id as the thread id
        assistant = get_auto_rag_assistant(**assistant_kwargs)
        thread_id = assistant.create_run()
    else:
        # Existing conversation: resume the stored run
        assistant_kwargs["run_id"] = question.thread_id
        assistant = get_auto_rag_assistant(**assistant_kwargs)
        thread_id = question.thread_id
    response = assistant.run(question.text, stream=False)
    if not isinstance(response, Assistant_Response):
        raise ValueError("Invalid response type from assistant")
    answer = Answer(
        question=response.question,
        answer=response.answer,
        table=response.table,
        threadId=thread_id,
    )
    return answer
```
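For completeness, this is roughly how I exercise the route locally (a sketch using FastAPI's TestClient; it assumes the router is mounted on an app and the Postgres/pgvector instance behind db_url is running):

```python
# Hypothetical smoke test for the /get-result route.
from fastapi import FastAPI
from fastapi.testclient import TestClient

app = FastAPI()
app.include_router(router)
client = TestClient(app)

resp = client.post(
    "/get-result",
    json={
        "text": "What does the knowledge base say about pgvector?",
        "user_id": "demo-user",
        "llm_model": "gpt-4-turbo",
    },
)
print(resp.status_code, resp.json())
```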
Hi, I'm using the Phidata assistant with two tools: search_knowledge_base and duckduckgo_search. I have used the Auto-RAG cookbook example with structured JSON output and am facing the following issue: I am getting incomplete responses. Even after a successful tool call, the LLM returns incomplete responses. I haven't set a max_tokens limit either.
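As a sanity check for truncation, one can call the same model directly and inspect finish_reason (a sketch using the openai client; the message below is a placeholder for the failing query):

```python
# Sketch: if finish_reason is "length", the output was cut off by a
# token limit; "stop" means the model ended the answer on its own.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
completion = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "placeholder for the failing query"}],
)
print(completion.choices[0].finish_reason)
```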
Here is a detailed log of the messages:
The following is the final answer: