langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

AdaptiveRAG implementation doesn't work with AzureOpenAI (`llm.with_structured_output`) Error #20548

Closed pratikkotian04 closed 1 month ago

pratikkotian04 commented 4 months ago

Checked other resources

Example Code

Docs to index

```python
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]
```

Load

```python
from langchain_community.document_loaders import WebBaseLoader

docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]
```

Split

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)
```

Add to vectorstore

```python
from langchain_community.vectorstores import Chroma

# `embeddings` is the embedding model instance, defined elsewhere in the script
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="rag-chroma",
    embedding=embeddings,
)
retriever = vectorstore.as_retriever()
```

Data model

```python
from typing import Literal

from langchain_core.pydantic_v1 import BaseModel, Field


class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["vectorstore", "web_search"] = Field(
        ...,
        description="Given a user question choose to route it to web search or a vectorstore.",
    )
```

LLM with function call

```python
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(azure_deployment="chatgpt3", model="gpt-3.5-turbo-0125", temperature=0)
structured_llm_router = llm.with_structured_output(RouteQuery)
```

Prompt

```python
from langchain_core.prompts import ChatPromptTemplate

system = """You are an expert at routing a user question to a vectorstore or web search.
The vectorstore contains documents related to agents, prompt engineering, and adversarial attacks.
Use the vectorstore for questions on these topics. Otherwise, use web-search."""
route_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)
```

```python
question_router = route_prompt | structured_llm_router
print(question_router.invoke({"question": "Who will the Bears draft first in the NFL draft?"}))
print(question_router.invoke({"question": "What are the types of agent memory?"}))
```

Error Message and Stack Trace (if applicable)

```
C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\_api\beta_decorator.py:87: LangChainBetaWarning: The function with_structured_output is in beta. It is actively being worked on, so the API may change.
  warn_beta(
Traceback (most recent call last):
  File "C:\Users\prakotian\Desktop\Projects\GenAI Projects\AdaptiveRAG\router.py", line 87, in <module>
    print(question_router.invoke({"question": "Who will the Bears draft first in the NFL draft?"}))
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\runnables\base.py", line 2499, in invoke
    input = step.invoke(
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\output_parsers\base.py", line 169, in invoke
    return self._call_with_config(
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\runnables\base.py", line 1625, in _call_with_config
    context.run(
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\runnables\config.py", line 347, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\output_parsers\base.py", line 170, in <lambda>
    lambda inner_input: self.parse_result(
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\output_parsers\openai_tools.py", line 182, in parse_result
    json_results = super().parse_result(result, partial=partial)
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\output_parsers\openai_tools.py", line 129, in parse_result
    tool_calls = parse_tool_calls(
  File "C:\Users\prakotian\AppData\Local\miniconda3\envs\Py10\lib\site-packages\langchain_core\output_parsers\openai_tools.py", line 85, in parse_tool_calls
    raise OutputParserException("\n\n".join(exceptions))
langchain_core.exceptions.OutputParserException: Function RouteQuery arguments:

{
  datasource: "web_search"
}

are not valid JSON. Received JSONDecodeError Expecting property name enclosed in double quotes: line 2 column 3 (char 4)
```
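The failure is easy to reproduce outside LangChain: the tool-call arguments string the model returned uses a bare (unquoted) key, which Python's `json` module rejects with exactly this error. A minimal stdlib-only sketch:

```python
import json

# The arguments string as returned by the model: bare key, so not valid JSON.
raw = '{\n  datasource: "web_search"\n}'

try:
    json.loads(raw)
except json.JSONDecodeError as e:
    print(e)  # Expecting property name enclosed in double quotes: line 2 column 3 (char 4)

# The strict form the parser expects has the key quoted:
print(json.loads('{\n  "datasource": "web_search"\n}'))
```

Line 2, column 3 points at the `d` in `datasource`, which matches the error reported in the traceback above.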

Description

Expected output is `{ datasource: "web_search" }`.
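Until the deployed model reliably emits strict JSON, one possible client-side mitigation is a lenient decoder that quotes bare keys before retrying the parse. This is a hypothetical stdlib-only helper, not part of LangChain, and the regex only handles simple identifier keys:

```python
import json
import re


def lenient_json_loads(text: str):
    """Parse JSON-ish model output, quoting bare identifier keys as a fallback."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Quote bare identifiers used as keys: `{ datasource:` -> `{ "datasource":`
        fixed = re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)', r'\1"\2"\3', text)
        return json.loads(fixed)


print(lenient_json_loads('{ datasource: "web_search" }'))  # {'datasource': 'web_search'}
```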

System Info

System Information

```
OS: Windows
OS Version: 10.0.19045
Python Version: 3.10.13 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:24:38) [MSC v.1916 64 bit (AMD64)]
```

Package Information

```
langchain_core: 0.1.43
langchain: 0.1.16
langchain_community: 0.0.33
langsmith: 0.1.31
langchain_cohere: 0.1.2
langchain_experimental: 0.0.54
langchain_openai: 0.1.3
langchain_text_splitters: 0.0.1
langchainhub: 0.1.15
langgraph: 0.0.37
```

ggkenios commented 4 months ago

I got similar parsing issues with:

Package Information

nthe commented 4 months ago

@pratikkotian04 Seems like gpt-3.5-turbo-0125 doesn't generate valid JSON. Is this happening on every invocation or just sometimes? Are you able to provide an example of the logs showing exactly what is being sent to the LLM, especially the complete rendered prompt?
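For reference, the "complete rendered prompt" for the router above would look roughly like the following. This is a plain-Python illustration of what `ChatPromptTemplate` produces when it formats the messages; the question string is just an example:

```python
# Stdlib-only sketch of the rendered router prompt; LangChain's
# ChatPromptTemplate does the equivalent internally when formatting messages.
system = (
    "You are an expert at routing a user question to a vectorstore or web search. "
    "The vectorstore contains documents related to agents, prompt engineering, and "
    "adversarial attacks. Use the vectorstore for questions on these topics. "
    "Otherwise, use web-search."
)
human_template = "{question}"


def render(question: str):
    """Return the (role, content) pairs that would be sent to the LLM."""
    return [("system", system), ("human", human_template.format(question=question))]


for role, content in render("What are the types of agent memory?"):
    print(f"{role}: {content}")
```

Capturing this rendered form (e.g. via LangChain's debug logging) is what would help diagnose whether the model is being steered toward invalid JSON.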