Closed: katmai closed this issue 1 year ago
That said...
Through trial and error, and as previously mentioned, I also believe the optimal solution lies in segmenting the requests into "chunks," akin to the method employed by the Superpower ChatGPT plugin. I will explain.
With ChatGPT 3.5, a token budget of 4097 is allocated, which is shared between the input, the output, or a combination of both.
The issue arises when Auto-GPT transmits a considerable volume of data, consuming all the allocated tokens, and leaving none for the response. Alternatively, truncating the data sent to ChatGPT results in errors during the response creation and handling.
Therefore, the proposed fix involves identifying the total token count by running a tokenizer on the input text, dividing the request into segments or 'chunks,' appending the pre- and post-sections to each, and progressively submitting them until the quota is exhausted. The submission would be divided into 'X' parts, where 'X' is the total token count divided by (4000 - pre/post-section token length), rounded up.
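To make that concrete, here's a minimal sketch of the counting-and-splitting step using tiktoken (the function name and the 4000/500 numbers are illustrative assumptions on my part, not Auto-GPT code):

```python
import tiktoken

def split_into_chunks(text: str, model: str = "gpt-3.5-turbo",
                      budget: int = 4000, reserved: int = 500) -> list[str]:
    """Split `text` into pieces that each fit the model's token budget.

    `reserved` stands in for the tokens consumed by the pre/post
    sections plus headroom for the model's reply.
    """
    enc = tiktoken.encoding_for_model(model)
    tokens = enc.encode(text)
    chunk_size = budget - reserved  # tokens of content per request
    return [enc.decode(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), chunk_size)]
```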
For instance, here's how Superpower ChatGPT effectively implements this strategy:
Act like a document/text loader until you load and remember the content of the next text/s or document/s.
There might be multiple files, each file is marked by name in the format ### DOCUMENT NAME.
I will send them to you in chunks. The start of each chunk will be noted as [START CHUNK x/TOTAL], and the end of that chunk will be noted as [END CHUNK x/TOTAL], where x is the number of the current chunk and TOTAL is the number of all chunks I will send you.
I will split the message into chunks and send them to you one by one. For each message, follow the instructions at the end of the message.
Let's begin:
[START CHUNK 1/2]
... THE CHUNK CONTENT GOES HERE ...
[END CHUNK 1/2]
Reply with OK: [CHUNK x/TOTAL]
Don't reply with anything else!
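Driving that prompt from code might look roughly like this; `send_chunks` and the `send` callback are hypothetical names for illustration, not part of Superpower ChatGPT or Auto-GPT:

```python
def send_chunks(chunks: list[str], send) -> None:
    """Wrap each chunk in the START/END markers shown above and submit it.

    `send` is whatever function posts one message to the model and
    returns its reply (hypothetical; plug in your own client call).
    """
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        message = (
            f"[START CHUNK {i}/{total}]\n"
            f"{chunk}\n"
            f"[END CHUNK {i}/{total}]\n"
            f"Reply with OK: [CHUNK {i}/{total}]\n"
            "Don't reply with anything else!"
        )
        reply = send(message)
        if not reply.strip().startswith("OK"):
            raise RuntimeError(f"Model broke protocol on chunk {i}: {reply!r}")
```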
Superpower ChatGPT on the Google Chrome webstore: https://chrome.google.com/webstore/detail/superpower-chatgpt/amhmeenmapldpjdedekalnfifgnpfnkc
See also: https://github.com/saeedezzati/superpower-chatgpt
See also: https://medium.com/@josediazmoreno/break-the-limits-send-large-text-blocks-to-chatgpt-with-ease-6824b86d3270
If anyone is working on a patch, I'd definitely give it a whirl. I'm not at a point right now (commitment- and time-wise) to work on one... even with Copilot and ChatGPT as my pair-programming buddies! -- Feature Leech
just leaving a +1 here
This needs the core devs to work on a way to chunk (oh, and thanks to them for helping a bunch of us). It's a challenging one, as it stops the agents' workflow entirely (i.e., no recovery).
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 5394 tokens. Please reduce the length of the messages.
Same issue :( The tool became totally unusable
Any solution?
Same here. Now trying 0.4, but I still get the fatal error "This model's maximum context length is 4097 tokens" each time I try different --manual goals or the automatic prompt.
what are the goals that you guys usually give?
I think the issue is that the devs have not integrated tiktoken into the platform; this is why this is happening.
tiktoken will basically count the tokens needed to send your request, and then we can automatically adjust the max tokens we send OpenAI so that it does not try to send back a response that would exceed the max token count for your model. Some tokens should also be left unused to accommodate the small margin of error tiktoken can produce.
We have developed an Auto-GPT UI that we are about to release open source, and we are debating integrating tiktoken and filing a pull request to bring it into the platform, but we don't want to duplicate the effort if the core dev team is already working on this.
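For reference, the adjustment we have in mind is roughly the following sketch (our own assumptions: the 4097 limit applies to gpt-3.5-turbo, and the 100-token margin is arbitrary); the returned value would be passed as max_tokens to openai.ChatCompletion.create so the reply can never overrun the window:

```python
import tiktoken

MODEL_LIMIT = 4097    # gpt-3.5-turbo context window
SAFETY_MARGIN = 100   # headroom for tiktoken's small margin of error

def remaining_tokens(messages: list[dict], model: str = "gpt-3.5-turbo") -> int:
    """Count the prompt's tokens and return how many are left for the reply.

    Counts message content only; the chat format adds a few extra
    tokens per message, which the safety margin absorbs.
    """
    enc = tiktoken.encoding_for_model(model)
    used = sum(len(enc.encode(m["content"])) for m in messages)
    return max(1, MODEL_LIMIT - used - SAFETY_MARGIN)
```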
BRUTAL fix: either substring messages down to 4000 characters, or use OpenAI to summarize.
For summarizing, here is the code I added to the create_chat_completion function in api_manager.py:
```python
def summarise(self, conversation) -> str:
    """
    Summarises the conversation history.
    :param conversation: The conversation history
    :return: The summary
    """
    messages = [
        # Ask the model to compress the history to fit the context window
        {"role": "assistant", "content": "Summarize this conversation in 2000 characters or less"},
        {"role": "user", "content": str(conversation)},
    ]
    response = openai.ChatCompletion.create(
        model=self.config["model"],
        messages=messages,
        temperature=0.1,  # keep the summary deterministic and faithful
    )
    return response.choices[0]["message"]["content"]
```
and in create_chat_completion I added this:
```python
# fix length: if the combined message content is too long, summarise it first
sumlen = 0
strmess = ""
for mess in messages:
    # assumes message objects expose .content; use mess["content"] for plain dicts
    sumlen = sumlen + len(mess.content)
    strmess = strmess + " " + mess.content
if sumlen >= 4000:
    # summarize the conversation and send only the summary instead
    summary = self.summarise(strmess)
    response = openai.ChatCompletion.create(
        deployment_id=deployment_id,
        model=model,
        # the API expects a list of message dicts, not a bare string
        messages=[{"role": "user", "content": summary}],
        temperature=temperature,
        max_tokens=max_tokens,
        api_key=cfg.openai_api_key,
    )
    return response
```
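(Note that len() counts characters rather than tokens here, so the 4000 cutoff is only a rough proxy; swapping in a tiktoken count would make the threshold exact.)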
Hi,
I couldn't get this to work. Could you paste your full api_manager.py so I can copy/paste it?
Right, the core devs have been working on a fix. Could anyone please give PR #4652 a try?
⚠️ Search for existing issues first ⚠️
Which Operating System are you using?
Docker
Which version of Auto-GPT are you using?
Master (branch)
GPT-3 or GPT-4?
GPT-3.5
Steps to reproduce 🕹
Listing the auto_gpt_workspace folder errors out. Maybe this is an erroneous bug, I'm not really sure, but why is it calling OpenAI when it's merely listing the files in the folder?
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4819 tokens. Please reduce the length of the messages.
Current behavior 😯
Listing the folder contents errors out and kills the program if there are too many files in there.
Expected behavior 🤔
not ... error out :D
Your prompt 📝
Your Logs 📒