Using GPT 4 or GPT 3.5 with SQL Database Agent throws OutputParserException: Could not parse LLM output:

RamlahAziz commented 1 year ago

System Info

python 3.10 ubuntu Ubuntu 22.04.2 LTS langchain 0.0.194

Who can help?

@eyurtsev

Information

[ ] The official example notebooks/scripts
[X] My own modified scripts

Related Components

[X] LLMs/Chat Models
[ ] Embedding Models
[ ] Prompts / Prompt Templates / Prompt Selectors
[X] Output Parsers
[ ] Document Loaders
[ ] Vector Stores / Retrievers
[ ] Memory
[X] Agents / Agent Executors
[ ] Tools / Toolkits
[ ] Chains
[ ] Callbacks/Tracing
[ ] Async

Reproduction

from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain.agents import create_sql_agent
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
import os

os.environ["OPENAI_API_KEY"] = ""
db = SQLDatabase.from_uri(
    "postgresql://<my-db-uri>",
    engine_args={
        "connect_args": {"sslmode": "require"},
    },
)

llm = ChatOpenAI(model_name="gpt-3.5-turbo")
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
)

agent_executor.run("list the tables in the db. Give the answer in a table json format.")

Expected behavior

I am using the SQL Database Agent to query a postgres database. I want to use gpt 4 or gpt 3.5 models in the OpenAI llm passed to the agent, but it says I must use ChatOpenAI. Using ChatOpenAI throws parsing errors.

The reason for wanting to switch models is reduced cost, better performance and most importantly - token limit. The max token size is 4k for 'text-davinci-003' and I need at least double that.

When I do, it throws an error in the chain midway saying

> Entering new AgentExecutor chain...
Traceback (most recent call last):
  File "/home/ramlah/Documents/projects/langchain-test/sql.py", line 96, in <module>
    agent_executor.run("list the tables in the db. Give the answer in a table json format.")
  File "/home/ramlah/Documents/projects/langchain/langchain/chains/base.py", line 236, in run
    return self(args[0], callbacks=callbacks)[self.output_keys[0]]
  File "/home/ramlah/Documents/projects/langchain/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/home/ramlah/Documents/projects/langchain/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/ramlah/Documents/projects/langchain/langchain/agents/agent.py", line 953, in _call
    next_step_output = self._take_next_step(
  File "/home/ramlah/Documents/projects/langchain/langchain/agents/agent.py", line 773, in _take_next_step
    raise e
  File "/home/ramlah/Documents/projects/langchain/langchain/agents/agent.py", line 762, in _take_next_step
    output = self.agent.plan(
  File "/home/ramlah/Documents/projects/langchain/langchain/agents/agent.py", line 444, in plan
    return self.output_parser.parse(full_output)
  File "/home/ramlah/Documents/projects/langchain/langchain/agents/mrkl/output_parser.py", line 51, in parse
    raise OutputParserException(
langchain.schema.OutputParserException: Could not parse LLM output: `Action: list_tables_sql_db, ''`

If I change the model to gpt-4, it runs one step then throws the error on the Thought for the next step

> Entering new AgentExecutor chain...
Action: list_tables_sql_db
Action Input: 
Observation: users, organizations, plans, workspace_members, curated_topic_details, subscription_modifiers, workspace_member_roles, receipts, workspaces, domain_information, alembic_version, blog_post, subscriptions
Thought:I need to check the schema of the blog_post table to find the relevant columns for social interactions.
Action: schema_sql_db
Action Input: blog_post
Observation: 
CREATE TABLE blog_post (
        id UUID NOT NULL, 
        category VARCHAR(255) NOT NULL, 
        title VARCHAR(255) NOT NULL, 
        slug VARCHAR(255) NOT NULL, 
        introduction TEXT NOT NULL, 
        list_of_blogs JSON[], 
        og_image VARCHAR(255), 
        created_at TIMESTAMP WITHOUT TIME ZONE NOT NULL, 
        updated_at TIMESTAMP WITHOUT TIME ZONE NOT NULL, 
        meta_description TEXT, 
        CONSTRAINT blog_post_pkey PRIMARY KEY (id)
)

/*
3 rows from blog_post table:
*** removing for privacy reasons ***
*/
Thought:Traceback (most recent call last):
  File "/home/ramlah/Documents/projects/langchain-test/sql.py", line 84, in <module>
    agent_executor.run("Give me the blog post that has the most social interactions.")
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/chains/base.py", line 256, in run
    return self(args[0], callbacks=callbacks)[self.output_keys[0]]
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/chains/base.py", line 145, in __call__
    raise e
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/chains/base.py", line 139, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/agents/agent.py", line 953, in _call
    next_step_output = self._take_next_step(
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/agents/agent.py", line 773, in _take_next_step
    raise e
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/agents/agent.py", line 762, in _take_next_step
    output = self.agent.plan(
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/agents/agent.py", line 444, in plan
    return self.output_parser.parse(full_output)
  File "/home/ramlah/Documents/projects/langchain-test/venv/lib/python3.10/site-packages/langchain/agents/mrkl/output_parser.py", line 42, in parse
    raise OutputParserException(
langchain.schema.OutputParserException: Could not parse LLM output: `The blog_post table has a column list_of_blogs which seems to contain the social interaction data. I will now order the rows by the sum of their facebook_shares and twitter_shares and limit the result to 1 to get the blog post with the most social interactions.`

The error is inconsistent and sometimes the script runs normally.

I have tried removing and adding streaming=True thinking that might be the cause.
I have tried changing the model from gpt-3.5-turbo to gpt-4 as well, the error shows up inconsistently

Please let me know if I can provide any further information. Thanks!

glondrej commented 1 year ago

I am observing the same issue. When I tracked the API calls by GraphSignal, it really seems that the completion provided by GTP-3.5-turbo is only Action: list_tables_sql_db, '' and is missing the "Action input:" section, which causes parser error. It does not concern DaVinci model. When the same prompt is passed to LLM via OpenAI Playground webapp, it results in complete and correct answer (both davinci and GPT-3.5-turbo).

F2EVarMan commented 1 year ago

I have the same problem.

naman-gupta99 commented 1 year ago

I have the same problem. Do you suggest moving to da-vinci?

yveshaag commented 1 year ago

same problem. looking for a solution to use GPT 4 with SQL Toolkit

drumwell commented 1 year ago

I ran into this today too. Setting up a SQLDatabaseChain and running my own interactive prompt around that seems to work fine, but the SQL agent is throwing the same kinds of errors as the posters above.

glondrej commented 1 year ago

I think it doesn't make much sense now to invest time in fixing this. Instead it should be implemented using new function calling API.

RamlahAziz commented 1 year ago

I just ended up making my own chain using OpenAI chat completion. I was testing the code again today and the recent updates to langchain seem to have fixed this issue for me at least. For those that are still facing it, someone suggested passing handle_parsing_errors=true to the create_sql_agent function here

BrettlyCD commented 1 year ago

I'm getting this same error while trying to use a HuggingFace model - does anybody know if the create_sql_agent works with non-OpenAI models?

I did my best to look through the docs and code and don't see why it wouldn't work with other integrations, but thought I'd double check. Thanks!

ameetkonnur commented 1 year ago

I am facing the same issue with AzureOpenAI. Below is the code snippet. Tried using some of the recommendations above but they don't seem to fix the issue.

Info

Python : 3.10.6 Langchain : 0.0.266

Error

langchain.schema.output_parser.OutputParserException: Could not parse LLM output: Action: sql_db_list_tables, ''

Code

db = SQLDatabase.from_uri(url_object) llm = AzureOpenAI(model_name="gpt-35-turbo",deployment_name="gpt35",temperature=0, verbose=True) toolkit=SQLDatabaseToolkit(db=db, llm=llm)

agent_executor = create_sql_agent( llm=llm, toolkit=toolkit, verbose=True, handle_parsing_errors=True )

agent_executor.run("List the total sales per country. Which country's customers spent the most?")

vinayemmadi commented 11 months ago

I see the same error when using the fine-tuned gpt 3.5 model with create_sql_agent. Was anyone able to fix this?

7effrey89 commented 11 months ago

Im also facing this issue at random, @RamlahAziz suggestion helps avoid the code from crashing so that is very welcome, but nevertheless it's not a real fix

dosubot[bot] commented 8 months ago

Hi, @RamlahAziz

I'm helping the LangChain team manage their backlog and am marking this issue as stale. It seems that recent updates to Langchain have resolved the parsing error you were encountering when using GPT-4 or GPT-3.5 models with the SQL Database Agent. Additionally, passing handle_parsing_errors=true to the create_sql_agent function has been suggested as a workaround.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

Keekteng commented 8 months ago

I am also facing this issue

dosubot[bot] commented 8 months ago

@baskaryan, could you please assist @Keekteng with the parsing error issue they are encountering when using GPT-4 or GPT-3.5 models with the SQL Database Agent in the LangChain repository? Thank you!

claraisrael commented 7 months ago

im also facing this issue

max-d3v commented 7 months ago

@baskaryan, could you please assist @Keekteng with the parsing error issue they are encountering when using GPT-4 or GPT-3.5 models with the SQL Database Agent in the LangChain repository? Thank you! Facing the same issue, I passed parsing errors as true and even tried increasing max_iterations to 1000, but for no luck, For my case with bigger iteration cicles one AI action is passed as an error

dosubot[bot] commented 7 months ago

🤖

I'm glad to hear that the recent updates to LangChain have resolved the issue for you! It's always great to see improvements making a positive impact. Thank you for sharing the tip about passing handle_parsing_errors=true to the create_sql_agent function. This could be helpful for others encountering similar issues. If you have any more feedback or questions about LangChain, feel free to ask!

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

udaysai50 commented 6 months ago

Hello Everyone,

handle_parsing_errors doesn't seem to work in latest versions and I am getting following warning

/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/langchain_community/agent_toolkits/sql/base.py:138: UserWarning: Received additional kwargs {'handle_parsing_errors': True} which are no longer supported.
  warnings.warn(

Any alternatives to fix?

Sinnaeve commented 5 months ago

Hello Everyone,

handle_parsing_errors doesn't seem to work in latest versions and I am getting following warning
/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/langchain_community/agent_toolkits/sql/base.py:138: UserWarning: Received additional kwargs {'handle_parsing_errors': True} which are no longer supported.
  warnings.warn(
Any alternatives to fix?

I think now it goes into agent_executor_kwargs argument like this:

agent_executor = create_sql_agent(llm,
                                   db=db,
                                   agent_type="openai-tools",
                                   agent_executor_kwargs={'handle_parsing_errors':True},
                                   verbose=True)

RamlahAziz commented 4 months ago

agent_executor = create_sql_agent(llm,
                                   db=db,
                                   agent_type="openai-tools",
                                   agent_executor_kwargs={'handle_parsing_errors':True},
                                   verbose=True)

This seems to be the latest way to handle it according to the langchain repository

CellCS commented 1 week ago

agent_executor_kwargs={'handle_parsing_errors':True},

if use "toolkit" in create_sql_agent, still does not work.

agent = create_sql_agent(
                llm=lm,
                toolkit=sql_toolkit,
                agent_type=agent_type,
                max_iterations=max_iterations,
                verbose=True,
                agent_executor_kwargs={'handle_parsing_errors':True},
        )

but if there is no toolkit, works. so this is still an issue?

langchain-ai / langchain