langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Issue: Large observation handling limit #6836

Closed haifengwu-msft closed 10 months ago

haifengwu-msft commented 1 year ago


Hey langchain community,

I have a tool which takes a database query as input and executes it against a database. This is similar to what QuerySQLDataBaseTool does. The problem is that the size of the query output is unbounded: it can be large enough that the agent exceeds the token limit.

The solutions I have tried:

  1. Do pagination:

    • Chunk the large output and summarize each chunk with respect to the target question.
    • Combine all of the chunk summaries, which together are much smaller than the original output.

    Problems:

    • Even though the summarization is guided by the target question, it still loses information.
    • The pagination can be slow.
  2. Vectorization (see the sketch after this list):

    • Chunk the large output.
    • Embed each chunk and put the embeddings into a vector DB.
    • Do a similarity search based on the target question and take as many chunks as fit within the token limit.

    Problems:

    • The embedding takes time, so it can be slow for a single thought.
    • The query output is semantically continuous as a whole, so chunking can break its meaning.
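
For reference, a minimal sketch of the vectorization approach, assuming OpenAI embeddings and an in-memory FAISS index (big_output and question are placeholders):

    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import FAISS

    big_output = "..."  # the oversized query result
    question = "..."    # the target question

    # Chunk the large output, embed the chunks, and index them in memory.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(big_output)
    db = FAISS.from_texts(chunks, OpenAIEmbeddings())

    # Keep only the chunks most relevant to the question, within the token limit.
    docs = db.similarity_search(question, k=4)
    context = "\n".join(d.page_content for d in docs)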

Does anyone have a solution for this problem? I'd appreciate any ideas!

Suggestion:

No response

dosubot[bot] commented 1 year ago

Answer generated by a 🤖

Answer

I found related issues that might be helpful. I did my best to summarize the solutions, but I recommend looking at the related issues yourself.

Related Issues

Closed Issues

SQL database with metadata exceeding 4000 token limit of Open AI.

The suggested solution is: Use the SQLDatabaseSequentialChain method as mentioned in the LangChain documentation. This method might help in handling the issue with the token limit. You can find more details at https://python.langchain.com/en/latest/modules/chains/examples/sqlite.html#sqldatabasesequentialchain.
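
A minimal sketch of that suggestion; the connection string and question are placeholders:

    from langchain.llms import OpenAI
    from langchain.sql_database import SQLDatabase
    from langchain.chains import SQLDatabaseSequentialChain

    # The sequential chain first decides which tables are relevant to the
    # question, then sends only those tables' schemas to the SQL chain,
    # which keeps the prompt within the token limit.
    db = SQLDatabase.from_uri("sqlite:///example.db")
    chain = SQLDatabaseSequentialChain.from_llm(OpenAI(temperature=0), db, verbose=True)
    chain.run("How many employees are there?")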

Open Issues

Issue: create_pandas_dataframe_agent token size issue

The suggested solution is:

To handle the token size issue when using the create_pandas_dataframe_agent with GPT-3.5 Turbo, you can try the following approaches:

  1. Truncate or shorten the input text to fit within the token limit. You can use the get_num_tokens_from_messages function provided in the context to calculate the number of tokens in your input and adjust accordingly.

  2. Split the input text into smaller chunks and process each chunk separately. You can then combine the results as needed.

However, there are no built-in features like chain_type='refine' or 'map_reduce' in the LangChain framework for this specific issue. You will need to implement the above approaches manually.
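
A minimal sketch of the first approach, using the model's token counter to trim an oversized observation (raw_output is a placeholder, and get_num_tokens is the plain-text counterpart of the message-based counter mentioned above):

    from langchain.chat_models import ChatOpenAI

    raw_output = "..."  # the oversized tool result
    llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
    budget = 3000  # tokens to allow for the observation

    # Trim the tail of the text until it fits within the token budget.
    while llm.get_num_tokens(raw_output) > budget:
        raw_output = raw_output[: int(len(raw_output) * 0.9)]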

bug(sql_agent): Optimise token usage for user questions which require information from more than one table

No discussion on issue

Having token limit issue while using sql agents when the data returned by the agent query on the database is too large

No solution found.

Issue: Token Limit Exceeded Error in SQL Database Agent

The suggested solution is:

Switching to the gpt-3.5-turbo-16k-0613 model and then setting agent_type=AgentType.OPENAI_FUNCTIONS in the create_sql_agent method increases the token limit to 16k and gets better responses.

You need to include the following import for the AgentType enum:

from langchain.agents.agent_types import AgentType

Also note that AgentType.OPENAI_FUNCTIONS only worked well with the 0613 models. This behaviour is documented at https://python.langchain.com/docs/modules/agents/agent_types/openai_functions_agent. You may get a lot of parsing errors if you use that AgentType setting with the standard gpt-3.5-turbo model.
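
Putting it together, a minimal sketch (the connection string is a placeholder, and the SQLDatabaseToolkit wiring is assumed from the standard SQL agent setup):

    from langchain.agents import create_sql_agent
    from langchain.agents.agent_toolkits import SQLDatabaseToolkit
    from langchain.agents.agent_types import AgentType
    from langchain.chat_models import ChatOpenAI
    from langchain.sql_database import SQLDatabase

    # The 16k-context 0613 model raises the token ceiling, and the
    # OPENAI_FUNCTIONS agent type uses the model's function-calling API.
    llm = ChatOpenAI(model_name="gpt-3.5-turbo-16k-0613", temperature=0)
    db = SQLDatabase.from_uri("sqlite:///example.db")
    agent = create_sql_agent(
        llm=llm,
        toolkit=SQLDatabaseToolkit(db=db, llm=llm),
        agent_type=AgentType.OPENAI_FUNCTIONS,
        verbose=True,
    )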


This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

sdelgadoc commented 1 year ago

There's no way around the token limit, so we need to get creative. Two potential solutions are to truncate or summarize the data.

An example of how to truncate can be found in the O365SearchEmails tool, where truncation is an input parameter that limits the size of the returned data. Because LangChain knows that the data isn't complete, it can start with truncated results, and then query for a smaller subset.

Summarizing is a little more complex. In this case, you would send the data to ChatGPT and have it summarized into prose with one function, then use another function to run a smaller query against that prose summary.
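
A minimal sketch of the summarize route, using LangChain's built-in map_reduce summarization chain applied standalone (raw_output stands in for the oversized query result):

    from langchain.chains.summarize import load_summarize_chain
    from langchain.chat_models import ChatOpenAI
    from langchain.docstore.document import Document
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    raw_output = "..."  # the oversized query result
    splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)
    docs = [Document(page_content=c) for c in splitter.split_text(raw_output)]

    # Each chunk is summarized independently (map), then the partial
    # summaries are combined into one prose summary (reduce).
    chain = load_summarize_chain(ChatOpenAI(temperature=0), chain_type="map_reduce")
    summary = chain.run(docs)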

Hope that helps!

haifengwu-msft commented 1 year ago

Because LangChain knows that the data isn't complete, it can start with truncated results, and then query for a smaller subset.

Hey sdelgadoc, thank you for replying! I understand the implementation of the O365SearchEmails tool; it works because the email body already has a short version, body_preview. But the data I get from the DB with my query is raw data.

How does the LangChain agent know the data isn't complete? And when you say "then query for a smaller subset", does that mean the following thoughts continuously re-trigger the tool and craft a new query which will get a smaller output?

sdelgadoc commented 1 year ago

"... does that mean the following thoughts continuously re-trigger the tool and craft a new query which will get a smaller output?"

Yes, exactly. For example, you could write a function to query the database with three extra parameters:
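
A minimal sketch of such a function using sqlite3, with limit, offset, and max_chars as the extra parameters (the names are illustrative, not from any actual LangChain tool):

    import sqlite3

    def paginated_query(db_path: str, query: str, limit: int = 50,
                        offset: int = 0, max_chars: int = 2000) -> str:
        # Page at the SQL level so the database never returns more than `limit` rows.
        with sqlite3.connect(db_path) as conn:
            rows = conn.execute(f"{query} LIMIT {limit} OFFSET {offset}").fetchall()
        text = "\n".join(str(r) for r in rows)
        if len(text) > max_chars:
            # The marker tells the agent the result is incomplete, so its next
            # thought can page further or craft a narrower query.
            return text[:max_chars] + "\n[truncated: page further or narrow the query]"
        return text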

Hope that helps!

rachaelsingleton commented 1 year ago

Did you have any breakthroughs on this? Did his workaround work? Curious if it took a substantial amount of time to run through all the truncated results.

dosubot[bot] commented 10 months ago

Hi, @haifengwu-msft! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue is about handling large observations in a tool that takes a database query as input. You have tried pagination and vectorization as solutions, but both have their drawbacks. Other users have suggested using the SQLDatabaseSequentialChain method, truncating or summarizing the data, and querying for a smaller subset of results. There is also a question about how LangChain knows the data isn't complete and a request for updates on any breakthroughs or successful workarounds.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!