Closed: leavestylecode closed this issue 10 months ago
Hello,
Thank you for raising this issue. The error you're encountering occurs because the total number of tokens being sent to OpenAI exceeds the model's maximum context length of 4097 tokens. This can happen when you try to send a large amount of data from a Graph DB at once.
There are a few ways to resolve this issue:
1. **Adjust the `max_tokens` parameter**: You can adjust the `max_tokens` parameter in the `OpenAI` instance of the `LLMChain`. This parameter determines the maximum number of tokens to generate in the completion. However, the total token count of your prompt and `max_tokens` cannot exceed the model's context length. Most models have a context length of 2048 tokens, but newer models, like the one you're using, support up to 4096 tokens. If your prompt is already using most of these tokens, the response might be cut short due to this limitation. Here's how you can adjust the `max_tokens` parameter:

```python
from langchain import OpenAI, ConversationChain, LLMChain, PromptTemplate
from langchain.memory import ConversationBufferWindowMemory

template = """..."""
prompt = PromptTemplate(input_variables=["history", "human_input"], template=template)

chatgpt_chain = LLMChain(
    llm=OpenAI(temperature=0, max_tokens=1000),  # Adjust this value as needed
    prompt=prompt,
    verbose=True,
    memory=ConversationBufferWindowMemory(k=2),
)
```
Please adjust the `max_tokens` value according to your needs, keeping in mind the limitations of the model you're using. You can find more details about this in issue #7981.
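Because the limit applies to the prompt and `max_tokens` together, it can also help to count the prompt's tokens before calling the model. Here is a minimal sketch using `tiktoken`; the model name and the `prompt_text` placeholder are assumptions, so use whatever model you actually pass to `OpenAI()`:

```python
import tiktoken

CONTEXT_LENGTH = 4097   # context window of the model discussed in this thread
MAX_COMPLETION = 1000   # the max_tokens you plan to request

# Assumption: the default OpenAI completion model; substitute your actual model name.
encoding = tiktoken.encoding_for_model("text-davinci-003")

prompt_text = "..."  # the fully rendered prompt, including the graph data
prompt_tokens = len(encoding.encode(prompt_text))

if prompt_tokens + MAX_COMPLETION > CONTEXT_LENGTH:
    print(
        f"Prompt uses {prompt_tokens} tokens; only "
        f"{CONTEXT_LENGTH - prompt_tokens} remain for the completion."
    )
```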
2. **Split your data into smaller chunks**: If adjusting the `max_tokens` parameter doesn't resolve the issue, you can try splitting your data into smaller chunks that fit within the token limit. You can use a method similar to the `get_sub_prompts` method in the LangChain codebase to split your data from the Graph DB into smaller chunks that fit within the token limit. Here is a simplified version of the `get_sub_prompts` method that you can use as a starting point:

```python
from typing import Any, Dict, List, Optional

# Note: in LangChain this is a method of the OpenAI LLM class.
def get_sub_prompts(
    self,
    params: Dict[str, Any],
    prompts: List[str],
    stop: Optional[List[str]] = None,
) -> List[List[str]]:
    """Get the sub prompts for llm call."""
    if stop is not None:
        if "stop" in params:
            raise ValueError("`stop` found in both the input and default params.")
        params["stop"] = stop
    if params["max_tokens"] == -1:
        if len(prompts) != 1:
            raise ValueError(
                "max_tokens set to -1 not supported for multiple inputs."
            )
        params["max_tokens"] = self.max_tokens_for_prompt(prompts[0])
    sub_prompts = [
        prompts[i : i + self.batch_size]
        for i in range(0, len(prompts), self.batch_size)
    ]
    return sub_prompts
```
In this method, `prompts` is the list of data from the Graph DB and `self.batch_size` is the number of prompts sent to OpenAI in each call. The method returns a list of sub-prompts, each of which is a list of data that fits within the token limit. You can adjust `self.batch_size` to ensure that the total number of tokens in each sub-prompt does not exceed the maximum context length of 4097 tokens.
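If you'd rather not reimplement `get_sub_prompts` yourself, here is a minimal sketch of the same chunking idea using LangChain's `TokenTextSplitter`. The serialized graph text, the chunk size, and how you feed each chunk into your chain are assumptions you'll need to adapt:

```python
from langchain.text_splitter import TokenTextSplitter

# Assumption: graph_text is your Graph DB content serialized to a single string.
graph_text = "..."

# Keep each chunk well under the 4097-token context window so the prompt
# template and the completion (max_tokens) still fit alongside the data.
splitter = TokenTextSplitter(chunk_size=1500, chunk_overlap=0)
chunks = splitter.split_text(graph_text)

for chunk in chunks:
    # Send each chunk through your chain as a separate, smaller request.
    ...
```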
I hope this helps! If you have any further questions, please don't hesitate to ask.
Best, Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
It seems like you're dealing with an issue related to the maximum context length of the model and how to handle it when working with a Graph DB (database) containing a large number of nodes. The error you're encountering occurs because the total number of tokens being sent to OpenAI exceeds the model's maximum context length of 4097 tokens. My suggestions are:
**Adjust the `max_tokens` Parameter:**
You can adjust the `max_tokens` parameter in the `OpenAI` instance of the `LLMChain`. This parameter determines the maximum number of tokens to generate in the completion. However, the total token count of your prompt and `max_tokens` cannot exceed the model's context length (4097 tokens in this case).
Example code:

```python
from langchain import OpenAI, ConversationChain, LLMChain, PromptTemplate
from langchain.memory import ConversationBufferWindowMemory

template = """..."""  # Your template here
prompt = PromptTemplate(input_variables=["history", "human_input"], template=template)

chatgpt_chain = LLMChain(
    llm=OpenAI(temperature=0, max_tokens=1000),  # Adjust this value as needed
    prompt=prompt,
    verbose=True,
    memory=ConversationBufferWindowMemory(k=2),
)
```
Adjust the `max_tokens` value according to your needs, keeping in mind the limitations of the model you're using.
**Split Your Data into Smaller Chunks:**
If adjusting the `max_tokens` parameter doesn't resolve the issue, you can try splitting your data from the Graph DB into smaller chunks that fit within the token limit. This way, you can send multiple smaller requests instead of one large request.

You can use a method similar to the `get_sub_prompts` method mentioned above to split your data into smaller chunks. This method takes the `batch_size` into account to ensure that each sub-prompt fits within the token limit. Adjust the `batch_size` value to control the number of tokens in each sub-prompt. The method returns a list of sub-prompts, each of which contains a subset of your data that fits within the token limit.
By either adjusting the `max_tokens` parameter or splitting your data into smaller chunks, you should be able to work around the issue of exceeding the model's maximum context length. Make sure to carefully test your approach to ensure that it works effectively with your specific dataset and use case.
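For what it's worth, here is a rough sketch of the chunk-and-send approach. It is an assumption about how you might wire it up, not built-in LangChain behavior; `chatgpt_chain` is the chain from the snippet above, and `chunks` would come from a token-based splitter such as `TokenTextSplitter`:

```python
# Rough sketch (an assumption, not built-in LangChain behavior): run the chain
# once per chunk, then ask the model to merge the partial answers.
question = "..."  # the question you want answered over the graph data
partial_answers = []

for chunk in chunks:  # chunks produced by a token-based splitter
    answer = chatgpt_chain.predict(
        human_input=f"{question}\n\nGraph data:\n{chunk}"
    )
    partial_answers.append(answer)

# One way to aggregate: a final call that merges the per-chunk answers.
# Note that the combined text must itself stay within the context window.
combined = "\n".join(partial_answers)
final_answer = chatgpt_chain.predict(
    human_input=f"Combine these partial answers into one answer:\n{combined}"
)
print(final_answer)
```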
I have encountered the same problem; my knowledge graph is too large. Has anyone solved this problem?
Same here! My knowledge graph schema itself exceeds the token limit of my deployed model, which is 4096 tokens. Can the question be sent in chunks and the answers be aggregated?
If we are passing the `graph` object to the chain, where do you recommend calling the `get_sub_prompts` method? Is there any internal functionality in LangChain that can do this?
Hi, @leavestyle-coder,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue you raised is related to encountering the error "This model's maximum context length is 4097 tokens" when working with a Graph DB with many nodes. It has received suggestions from other users, including adjusting the `max_tokens` parameter and splitting the data into smaller chunks to fit within the token limit. However, the issue remains unresolved.
Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!
I am still facing the same issue. Did the chunking work? I'm having the issue when I pass `graph.schema` into the LLM; it definitely exceeds the token limit.
Issue you'd like to raise.
How can I resolve the error "This model's maximum context length is 4097 tokens"? If the Graph DB has many nodes, all of those tokens cannot be sent to OpenAI in one request.
Suggestion:
No response