langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Handling JSONDecodeError; Invalid /escape in JsonOutputParser #26655

Open hadifar opened 2 months ago

hadifar commented 2 months ago


Example Code

import json

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.runnables import RunnableLambda

def fake_llm1(inputs):
    return json.dumps(str({
                              "response": "The user question does not align with the structure or data within our database tables. The tables are relational in terms and follow the naming convention of 'galvatron\' or'mac_\', however the user question appears unrelated to the type of data and structure of the tables."}))

def fake_llm2(inputs):
    return str({
        "response": "The user question does not align with the structure or data within our database tables. The tables are relational in terms and follow the naming convention of 'galvatron\' or'mac_\', however the user question appears unrelated to the type of data and structure of the tables."})

chain1 = RunnableLambda(fake_llm1) | JsonOutputParser()
chain2 = RunnableLambda(fake_llm2) | JsonOutputParser()

# runs without raising
print(chain1.invoke(""))
# raises OutputParserException
print(chain2.invoke(""))

Error Message and Stack Trace (if applicable)

    return super().parse_result(result, partial=partial)
  File "langchain_core\output_parsers\json.py", line 87, in parse_result
    raise OutputParserException(msg, llm_output=text) from e
langchain_core.exceptions.OutputParserException: Invalid json output: {'response': "The user question does not align with the structure or data within our database tables. The tables are relational in terms and follow the naming convention of 'galvatron' or'mac', however the user question appears unrelated to the type of data and structure of the tables."}

Description

I'm not entirely sure whether this is a bug. I find it strange that the parser fails when the string contains an escape character. Dumping the string to JSON first works around the issue.
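A note on why dumping to JSON first appears to work: the following is a minimal sketch using only the standard `json` module (not LangChain itself), with a shortened, hypothetical payload. It suggests that `json.dumps(str(d))` wraps the whole dict repr in a single JSON string literal, so decoding succeeds but yields a string rather than a dict.

```python
import json

# Shortened, hypothetical stand-in for the response in the example above.
payload = {"response": "tables follow the naming convention of 'galvatron' or 'mac'"}

# str() produces Python's repr of the dict, which is not JSON.
as_repr = str(payload)

# json.dumps() on that string encodes it as one JSON string literal.
wrapped = json.dumps(as_repr)

# Decoding succeeds, but the result is the repr text, not a dict.
decoded = json.loads(wrapped)
print(type(decoded).__name__)  # str
print(decoded == as_repr)      # True
```

If the goal is a parsed dict, passing the dict itself to `json.dumps` (rather than its `str()`) is the more direct route.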

System Info

System Information


OS: Windows
OS Version: 10.0.19045
Python Version: 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]

Package Information


langchain_core: 0.3.1
langchain: 0.3.0
langchain_community: 0.3.0
langsmith: 0.1.123
langchain_experimental: 0.0.55
langchain_huggingface: 0.1.0
langchain_nomic: 0.1.2
langchain_ollama: 0.1.0
langchain_openai: 0.1.20
langchain_text_splitters: 0.3.0
langchainhub: 0.1.20
langgraph: 0.2.22

ashvin-a commented 2 months ago

I'm not sure whether this is a bug. @eyurtsev What do you think?

mgs28 commented 1 month ago

This is a Python issue with str and JSON rather than with the parser. By design, str() renders a Python dictionary using its repr, which puts single quotes around keys; the JSON spec requires double quotes. Calling json.loads on the str() of a dictionary therefore raises a json.JSONDecodeError.
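The difference between the two serializations can be seen with the standard library alone. This is a small illustrative sketch with a made-up dictionary, not code from LangChain:

```python
import json

data = {"response": "ok"}

as_repr = str(data)         # {'response': 'ok'}   (Python repr, single quotes)
as_json = json.dumps(data)  # {"response": "ok"}   (valid JSON, double quotes)

# The JSON form round-trips back to the original dict.
round_tripped = json.loads(as_json)

# The repr form is rejected because JSON property names need double quotes.
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError as exc:
    repr_is_valid_json = False
    print(f"JSONDecodeError: {exc}")

print(round_tripped == data)  # True
print(repr_is_valid_json)     # False
```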

JsonOutputParser() could handle this by replacing all single quotes with double quotes, but that carries some risk. The JSON utility already has to handle mismatched quotes and similar problems. It also seems unlikely that an LLM would return JSON with single-quoted properties. FYI @ashvin-a @eyurtsev
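To illustrate the risk of blanket quote replacement, here is a hedged sketch. The helper names `naive_fix` and `parses` are hypothetical and are not part of JsonOutputParser. The approach works for a trivial dict but breaks as soon as a value itself contains single quotes, as the response in this issue does.

```python
import json

def naive_fix(text: str) -> str:
    """Hypothetical repair: swap every single quote for a double quote."""
    return text.replace("'", '"')

def parses(text: str) -> bool:
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Simple case: no quotes inside the value, so the swap yields valid JSON.
simple = str({"response": "ok"})
print(parses(naive_fix(simple)))  # True

# Value with embedded single quotes: the swap turns them into unescaped
# double quotes inside the string, which terminates it early.
tricky = str({"response": "the 'galvatron' tables"})
print(naive_fix(tricky))          # {"response": "the "galvatron" tables"}
print(parses(naive_fix(tricky)))  # False
```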