langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
92.39k stars 14.77k forks

Issue: Token Limit Exceeded Error in SQL Database Agent #4293

Closed jeyarajcs closed 11 months ago

jeyarajcs commented 1 year ago

Issue you'd like to raise.

I am trying to work with Snowflake using create_sql_agent, and I am very often getting a token limit error.

This is my code:

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain.llms.openai import OpenAI
from langchain.agents import AgentExecutor

from sqlalchemy.dialects import registry
registry.load("snowflake")

account_identifier = 'xxxx'
user = 'xxxx'
password = 'xxxx'
database_name = 'xxxx'
schema_name = 'xxxx'
warehouse_name = 'xxxx'
role_name = 'xxxx'

conn_string = f"snowflake://{user}:{password}@{account_identifier}/{database_name}/{schema_name}?warehouse={warehouse_name}&role={role_name}"
db = SQLDatabase.from_uri(conn_string)
print("DB===", db)
toolkit = SQLDatabaseToolkit(llm=OpenAI(temperature=0), db=db)

agent_executor = create_sql_agent(
    llm=OpenAI(temperature=0),
    toolkit=toolkit,
    verbose=True
)

agent_executor.run("Which companies are getting the most reviews in a specific category?")

If I ask a straightforward question on a tiny table that has only 5 records, the agent runs well. If the table is slightly bigger and the question is complex, it throws: InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 13719 tokens (13463 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

Suggestion:

No response

jeyarajcs commented 1 year ago

I think the issue is from SQLDatabaseToolkit. The schema_sql_db tool takes a comma-separated list of tables as input; its output is the schema and sample rows for those tables. If the query is a bit lengthy or complex, the tool returns pretty much every table's schema with sample rows. I think that's where we hit the token limit error.

But I don't know how to deal with this issue.
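The mechanism can be reproduced outside LangChain. This stdlib-only sketch (the tables and data are invented for illustration, and table_info is only a rough stand-in for what the real schema tool returns) builds the same kind of "schema plus sample rows" text and shows how it grows with the number and width of the tables requested:

```python
import sqlite3

def table_info(conn, tables, sample_rows=3):
    """Rough stand-in for the schema_sql_db tool: CREATE TABLE
    statements plus a few sample rows for each requested table."""
    parts = []
    for table in tables:
        ddl = conn.execute(
            "SELECT sql FROM sqlite_master WHERE name = ?", (table,)
        ).fetchone()[0]
        rows = conn.execute(f"SELECT * FROM {table} LIMIT {sample_rows}").fetchall()
        parts.append(ddl + "\n" + "\n".join(map(str, rows)))
    return "\n\n".join(parts)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE reviews (id INTEGER, company TEXT, category TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?)",
    [(i, "Acme", "widgets", "x" * 200) for i in range(5)],
)

one = table_info(conn, ["companies"])
both = table_info(conn, ["companies", "reviews"])
# The more tables requested (and the wider their sample rows),
# the longer the text that ends up in the prompt.
print(len(one), len(both))
```

Every character of that text lands in the prompt, which is why wide tables and multi-table questions blow past the 4k window so quickly.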

ijasnahamed commented 1 year ago

I am also facing the same issue

jeyarajcs commented 1 year ago

@hwchase17 Any thoughts? Any workaround to pass this?

altcoder commented 1 year ago

There are a couple of parameters you can adjust to limit the table information that's going to be included in the prompt.

SQLDatabase.from_uri(
    conn_string,
    sample_rows_in_table_info=1, # default 3
    include_tables=["my_table_1"],  # will restrict to just using my_table_1
)

You can also use ignore_tables to exclude tables from the table info.

rajendradewani commented 1 year ago

I am facing the same issue. I have included two tables, with sample_rows_in_table_info=1 (default 3) and include_tables=['tblSales', 'tblCustomer']. The customer table contains 125 columns and the sales table contains 102 columns.

Question: List the top 10 customers based on total sales this year. This query joins both tables, so the fields of both tables are sent to OpenAI.

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 7603 tokens (7347 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

steven-levey-sp commented 1 year ago

I have the same issue. I have tables that have 100+ columns and the prompts being generated are just too large.

I might be crazy, but I am thinking about putting a vector db in front of this and then having the query first pull out the columns it thinks are relevant before building up the prompt.
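A cheap stand-in for that idea (keyword overlap instead of a real vector store; rank_tables and the catalog below are invented for illustration) would score each table against the question and pass only the winners to include_tables:

```python
def rank_tables(question: str, table_columns: dict, top_k: int = 2) -> list:
    """Score each table by how many of its table/column names appear in
    the question, and keep the top_k. A real version would embed the
    question and the column descriptions and rank by cosine similarity."""
    words = set(question.lower().split())

    def score(item):
        table, columns = item
        vocab = {table.lower(), *[c.lower() for c in columns]}
        return len(words & vocab)

    ranked = sorted(table_columns.items(), key=score, reverse=True)
    return [table for table, _ in ranked[:top_k]]

# Hypothetical catalog of tables and their columns:
catalog = {
    "reviews": ["company", "category", "rating"],
    "invoices": ["amount", "due_date"],
    "employees": ["name", "title"],
}
relevant = rank_tables("which company has the best rating per category", catalog)
print(relevant)  # 'reviews' ranks first; pass this list as include_tables
```

Only the selected tables' schemas then reach the prompt, which is the same effect the vector-DB version would give with better recall.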

willsmithorg commented 1 year ago

The agent has to query the list of tables on the database, then the schema for the most relevant tables. If you have lots of relevant sounding tables, or big schemas for some tables, you hit the limit of the prompt length.

Immediate options are:

foshesss commented 1 year ago

Hi Will! Is there a method for tracking token usage as the program runs, specifically with a SQL agent? It's very difficult to identify the source of large token usage. I feel like if there was a way to monitor this, it would give me better insights when it comes to handling this problem.
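There was no built-in per-tool counter at the time; one rough DIY approach is to wrap each tool's function and log an estimate per call. The names below (TokenEstimateLogger, the stand-in schema tool) are invented, and the chars-per-token heuristic is an approximation; tiktoken gives exact counts:

```python
class TokenEstimateLogger:
    """Wraps a tool callable and records a rough token estimate
    (len(text) // 4) for every input and output passing through it."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.calls = []  # list of (input_tokens, output_tokens) estimates

    @staticmethod
    def estimate(text: str) -> int:
        return max(1, len(text) // 4)  # crude ~4-chars-per-token heuristic

    def __call__(self, tool_input: str) -> str:
        output = self.func(tool_input)
        self.calls.append((self.estimate(tool_input), self.estimate(output)))
        return output

# Hypothetical stand-in for a schema tool that returns a large blob:
schema_tool = TokenEstimateLogger("schema_sql_db", lambda t: "CREATE TABLE ..." * 500)
schema_tool("reviews, companies")
for in_tok, out_tok in schema_tool.calls:
    print(f"{schema_tool.name}: ~{in_tok} tokens in, ~{out_tok} tokens out")
```

Logs like this tend to show that the schema tool's output, not the question, is where most of the tokens go.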

GBBBAS commented 1 year ago

Switching to the gpt-3.5-turbo-16k-0613 model and then setting agent_type=AgentType.OPENAI_FUNCTIONS in the create_sql_agent method increases the token limit to 16k and gets better responses for me. Still not ideal, but it does help.

You need to include the following import for the AgentType enum:

from langchain.agents.agent_types import AgentType

Also note that AgentType.OPENAI_FUNCTIONS only worked well for me with the 0613 models. This behaviour relates to this documentation: https://python.langchain.com/docs/modules/agents/agent_types/openai_functions_agent I get a lot of parsing errors if I use that AgentType setting with the standard gpt-3.5-turbo model.

rcarrington-lenvi commented 1 year ago

Am I missing something, or do these limits make this functionally useless for most real-world cases? Unless you're looking at a very narrow focus for questions, or have a tiny database schema, you're going to run into this limit. I've just been trying to get this working with the Microsoft sample database AdventureWorks, which is not particularly big in terms of schema (certainly way smaller than any of our product DB schemas), and even that is far too large for the 4K limit on the model I'm using. Switching to a model with a 16K limit would work in this particular example, but it's still going to be too small for any reasonably sized schema. Unless you've got a narrow focus for questions, you can't limit tables/columns, because you don't know what's going to be asked. And if you do know what is going to be asked, why even bother with AI?

I'm not trying to be antagonistic with this question - I'm genuinely curious what the application of this tool is with these limitations?

medecau commented 1 year ago

I'm trying to use the SQL agent at $WORK, and like others we're hitting the token limits. We're already using gpt-3.5-turbo-16k, but even that is not enough to handle the 34k tokens sometimes needed for the results of SQL queries.

We're experimenting with the Plan & Execute agent's ability to complete tasks, using the SQL toolkit's get_tools() as the tool provider for the P&E agent. The task we're trying to accomplish is getting the agent to complete the 'SQL Murder Mystery' challenge.

The agent is currently not able to complete the challenge.

medecau commented 1 year ago

I got it working for my use case, maybe the snippet includes enough hints for others to pick up from.

from typing import List, Optional

from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.callbacks.manager import CallbackManagerForToolRun
from langchain.chat_models import ChatOpenAI
from langchain.experimental.plan_and_execute import (
    PlanAndExecute,
    load_agent_executor,
    load_chat_planner,
)
from langchain.sql_database import SQLDatabase
from langchain.text_splitter import TokenTextSplitter
from langchain.tools import BaseTool
from langchain.tools.sql_database.tool import QuerySQLDataBaseTool

model = ChatOpenAI(model="gpt-3.5-turbo-16k", temperature=0)
token_limit = 16_000

db = SQLDatabase.from_uri("sqlite:///my_database.db")

def split_text(text, chunk_size, chunk_overlap=0):
    text_splitter = TokenTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    yield from text_splitter.split_text(text)

class QuerySQLDatabaseTool2(QuerySQLDataBaseTool):
    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        result = self.db.run_no_throw(query)

        # Truncate the query result to roughly 14k tokens so the full
        # prompt stays inside the 16k context window.
        return next(split_text(result, chunk_size=14_000))

class SQLDatabaseToolkit2(SQLDatabaseToolkit):
    def get_tools(self) -> List[BaseTool]:
        tools = super().get_tools()
        # Replace the stock query tool (first in the list) with the
        # truncating version, reusing its original description.
        original_query_tool_description = tools[0].description
        new_query_tool = QuerySQLDatabaseTool2(
            db=self.db, description=original_query_tool_description
        )
        tools[0] = new_query_tool

        return tools

toolkit = SQLDatabaseToolkit2(db=db, llm=model)

planner = load_chat_planner(model)

executor = load_agent_executor(model, toolkit.get_tools(), verbose=True)

agent = PlanAndExecute(planner=planner, executor=executor, verbose=True)

agent.run(my_request)

You'll need tiktoken: pip install tiktoken or poetry add tiktoken.
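The chunk_size=14_000 above is presumably chosen to leave headroom in the 16k window for the rest of the prompt and the completion; the budget works out roughly like this (the prompt-overhead figure is an illustrative guess, not measured):

```python
context_window = 16_000     # gpt-3.5-turbo-16k
completion_reserve = 256    # default max_tokens reserved for the answer
prompt_overhead = 1_500     # illustrative guess: instructions, tool descriptions, scratchpad
result_budget = context_window - completion_reserve - prompt_overhead
print(result_budget)  # 14_244, so truncating results at 14_000 tokens keeps the call under the limit
```

If your agent's instructions or tool descriptions are longer, the chunk size needs to shrink accordingly.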

pumudu88 commented 1 year ago

> I got it working for my use case, maybe the snippet includes enough hints for others to pick up from.

@medecau was this script able to solve the 'murder mystery challenge' on its own? I tried this approach; it does not hit the max token limitation, but it deviates a lot from the original query.

medecau commented 1 year ago

sorry, we do not have AGI here.

I tried with both gpt-3.5-turbo-16k and gpt-4, and both failed.


I wish these agents were a bit easier to tweak, so we could have a meta-agent for each of the steps of the plan-and-execute agent. As they are, these seem almost unapproachable.

PS: ChatGPT (GPT-4 + Code Interpreter) gets very close to resolving the sql-murder-mystery challenge. I had to guide it a bit towards using the right tools, completing the answer, and later to better explore one of the leads. See: https://chat.openai.com/share/52b550ce-90ae-4ded-9364-2e1bdbdfa302

dosubot[bot] commented 11 months ago

Hi, @jeyarajcs! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue you reported is about encountering a "Token Limit Exceeded Error" when using the create_sql_agent function in Snowflake. It seems that the error occurs when the table is larger and the question is more complex, exceeding the maximum token limit of the model. Some users have suggested adjusting parameters to limit the table information included in the prompt, using a vector db to pull out relevant columns before building the prompt, or using ChatGPT as a few-shot learner. Additionally, some users have switched to the gpt-3.5-turbo-16k-0613 model to increase the token limit.

Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your understanding and cooperation!

therajatg commented 9 months ago

Has anyone yet found the solution to this issue?

jamshid17 commented 8 months ago

> Is there a method for tracking token usage as the program runs, specifically with a SQL agent?

Hey @foshesss, any updates on that? I am trying to track token usage with tiktoken, but I haven't seen any real use cases with a LangChain agent.

geeman11 commented 4 months ago

It's a shame this is still a very real issue. Has anyone looked into using the MPT-7B model for this? A 60k token count should be enough for normal use cases, right? I don't have a computer fast enough to run this model, so I'm not sure how it handles creating SQL queries. Would love to hear from anyone who has accomplished this, though!

LasalJayawardena commented 4 months ago

@geeman11 in models with big context windows, accuracy drops significantly as the context size increases.