Significant-Gravitas / AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
https://agpt.co

Improve chunking and chunk handling #38

Closed: cometbus closed this issue 1 year ago

cometbus commented 1 year ago

AutoGPT keeps exceeding the token limit by about 100 tokens every time I use it, before it finishes its task, and it can't handle the error:

Traceback (most recent call last):
  File "/root/Auto-GPT/scripts/main.py", line 199
    assistant_reply = chat.chat_with_ai(
  File "/root/Auto-GPT/scripts/chat.py", line 64, in chat_with_ai
    response = openai.ChatCompletion.create(
  File "/root/Auto-GPT/scripts/myenv/lib/python3.10/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/root/Auto-GPT/scripts/myenv/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/root/Auto-GPT/scripts/myenv/lib/python3.10/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/root/Auto-GPT/scripts/myenv/lib/python3.10/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/root/Auto-GPT/scripts/myenv/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 8192 tokens. However, your messages resulted in 8215 tokens. Please reduce the length of the messages.

NEXT ACTION: COMMAND = Error: ARGUMENTS = Invalid JSON
SYSTEM: Command Error: returned: Unknown command Error: Traceback (most recent call last): [same traceback as above] openai.error.InvalidRequestError: This model's maximum context length is 8192 tokens. However, your messages resulted in 8215 tokens. Please reduce the length of the messages.
(myenv) (base) root@DESKTOP-S70O6TN:~/Auto-GPT/scripts#

jfryton commented 1 year ago

Same here:

raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 8192 tokens. However, your messages resulted in 8272 tokens. Please reduce the length of the messages.

ResourceHog commented 1 year ago

I also get this error and am not sure what to do about it. It's possible to count tokens using tiktoken, but what then? Just truncate the prompt?
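For what it's worth, a minimal truncation sketch with tiktoken might look like this (untested; the model name and token budget here are just examples):

import tiktoken

def truncate_to_token_limit(text, max_tokens=8000, model="gpt-4"):
    # Encode, cut off at the token budget, and decode back to text
    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return encoding.decode(tokens[:max_tokens])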

rmn1978 commented 1 year ago

I have the same error. Don't know how to solve it...

yousefissa commented 1 year ago

Try setting your token_limit to a lower value
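For example, in the project's .env file (variable names as they appear later in this thread; the values are illustrative):

FAST_TOKEN_LIMIT=4000
SMART_TOKEN_LIMIT=8000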

Torantulino commented 1 year ago

Working on a fix for this right this second, hang tight.

rmn1978 commented 1 year ago

Try setting your token_limit to a lower value

What file do I need to modify?

PhilipAD commented 1 year ago

@Torantulino The problem stems from split_text in browse.py; you should be calculating token length instead of character length, using something like:

from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def split_text(text, max_length=2000):
    # Split on token count, not character count, so chunks respect the model's limit
    tokens = tokenizer(text)['input_ids']
    for i in range(0, len(tokens), max_length):
        yield tokenizer.decode(tokens[i:i + max_length])

Then you can guarantee that you won't go over.
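Hypothetical usage, with summarize() standing in for whatever consumes each chunk:

for chunk in split_text(page_text, max_length=2000):
    summarize(chunk)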

giovannimanzoni commented 1 year ago

I tried to increase the token limit with this in the .env file: FAST_TOKEN_LIMIT=16000 SMART_TOKEN_LIMIT=32000

but the chat crashes with: openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, you requested 15997 tokens (960 in the messages, 15037 in the completion). Please reduce the length of the messages or completion.

How can I allow more tokens? What do I have to do on the OpenAI website? How do I change which OpenAI model to use?

giovannimanzoni commented 1 year ago

Ok, I found https://platform.openai.com/docs/models/gpt-3-5 and upgraded gpt-4 to gpt-4-32k. gpt-3.5-turbo cannot be upgraded because support for it is missing in this source code.

giovannimanzoni commented 1 year ago

but gpt-4-32k does not work: SYSTEM: Command evaluate_code returned: Error: The model: gpt-4-32k does not exist

How can I use it?

james431987 commented 1 year ago

mannnn

NEXT ACTION: COMMAND = message_agent ARGUMENTS = {'key': 18, 'message': "Can you help us develop the character's romance subplot in a way that ties in with the established themes of the story? Please limit your response to 2000 characters or less. Additionally, can you provide us with some ideas for the love interests and their motivations?"}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for authorGPT...
Input: y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM: Command message_agent returned: Error: This model's maximum context length is 4097 tokens. However, your messages resulted in 4563 tokens.

NEXT ACTION: COMMAND = message_agent ARGUMENTS = {'key': 18, 'message': "Can you help us develop the character's romance subplot while making sure that the actions and motivations of the character tie in with the established themes of the story? Can you also provide us with some ideas for the love interests? Please summarize your response to ensure we understand your ideas clearly. Thank you!"}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for authorGPT...
Input: y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM: Command message_agent returned: Error: This model's maximum context length is 4097 tokens. However, your messages resulted in 4627 tokens.

Livadas commented 1 year ago

Input: Request failed with status code: 401
Response content: b'{"detail":{"status":"quota_exceeded","message":"This request exceeds your quota. You have 68 characters remaining, while 92 characters are required for this request.","character_used":10024,"character_limit":10000}}'
how to adjust character_limit?
SYSTEM: Human feedback: how to adjust character_limit?
Warning: Failed to parse AI output, attempting to fix. If you see this warning frequently, it's likely that your prompt is confusing the AI. Try changing it up slightly.
Failed to fix AI output, telling the AI.
Error: Invalid JSON
The character limit can be adjusted in the OpenAI API settings when creating an API key. You can set the maximum length of tokens (words) generated by the API in a single response. Keep in mind that a shorter character limit may result in more concise or lower quality responses, while a longer character limit may result in more verbose or higher quality responses.
AUTO JOB HUNTER THOUGHTS:
REASONING:
CRITICISM:
Warning: Failed to parse AI output, attempting to fix. If you see this warning frequently, it's likely that your prompt is confusing the AI. Try changing it up slightly.
Failed to fix AI output, telling the AI.
NEXT ACTION: COMMAND = Error: ARGUMENTS = Missing 'command' object in JSON
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for Auto Job Hunter...
Input: truncate responses google commands to be less than character limit
SYSTEM: Human feedback: truncate responses google commands to be less than character limit
Warning: Failed to parse AI output, attempting to fix. If you see this warning frequently, it's likely that your prompt is confusing the AI. Try changing it up slightly.
AUTO JOB HUNTER THOUGHTS: thought
REASONING: reasoning
PLAN:

Livadas commented 1 year ago

^ I read more about this, and it appears to be the character quota for ElevenLabs, not OpenAI tokens (https://github.com/Torantulino/Auto-GPT/issues/263)

BASS10 commented 1 year ago

I asked Auto-GPT to help with this problem and it found out that the correct model to use for a larger context length is gpt-4-13b

SMART_LLM_MODEL=gpt-4-13b

It seems to be working! More expensive though.

I am also using SMART_TOKEN_LIMIT=25000 for now while the token length estimation is off.

BASS10 commented 1 year ago

Might have spoken too soon; I am now seeing this error: SYSTEM: Command evaluate_code returned: Error: The model gpt-4-13b does not exist

Does this mean it has been using the "fast" model (gpt-3.5) up to this point? Or is this error specific to "Command evaluate_code"?

IronCond0r commented 1 year ago

@Torantulino The problem stems from split_text in browse.py; you should be calculating token length instead of character length [...]

Does this work? If so, where do I put that code, and what does it replace?

rodrix385 commented 1 year ago

Same issue. Any fix?

nicostubi commented 1 year ago

but gpt-4-32k does not work: SYSTEM: Command evaluate_code returned: Error: The model: gpt-4-32k does not exist [...]

@giovannimanzoni I saw the same yesterday when trying to use this model with Postman. I guess it's because of the license; you probably don't have access to GPT Plus/GPT-4 yet, which is the case for me as well. I will get my GPT Plus license on Monday; I could try it then and keep you posted.

giovannimanzoni commented 1 year ago

@giovannimanzoni I saw the same yesterday when trying to use this model with Postman. [...]

Yes, that's right. We need GPT-4 API access; I'm waiting for my license.

Pwuts commented 1 year ago

Should be fixed in #2542 just now.

bobinson commented 1 year ago

Should be fixed in #2542 just now.

Unfortunately reproduced in https://github.com/Significant-Gravitas/Auto-GPT/issues/2906

Pwuts commented 1 year ago

Repurposing this issue as a collection issue for the following:

BlueTeamByDay commented 1 year ago

Still facing the same issue.

remriel commented 1 year ago
Continuous Mode:  ENABLED
WARNING:  Continuous mode is not recommended. It is potentially dangerous and may cause your AI to run forever or carry out actions you would not usually authorise. Use at your own risk.
GPT3.5 Only Mode:  ENABLED
WARNING: Plugin SystemInformationPlugin found. But not in the allowlist... Load? (y/n):
WARNING: Plugin AutoGPT_YouTube found. But not in the allowlist... Load? (y/n):
Welcome back!  Would you like me to return to being EarningsReportGPT?
Continue with the last settings?
Name:  EarningsReportGPT
Role:  an AI assistant that specializes in providing detailed and highly readable summaries of earnings reports, along with a sentiment rating, to help businesses make informed decisions.
Goals: ['Retrieve the latest earnings report using the API keys provided in APIkeys.txt.', 'Analyze the report and provide a detailed summary that highlights the most important information in a highly readable format.', 'Assign a sentiment rating from 1 to 10 based on the overall tone of the report, with 1 being very negative and 10 being very positive.', 'Ensure that the summary and sentiment rating are delivered in a timely manner to allow for quick decision-making.', 'Continuously learn and improve its analysis and reporting capabilities to provide even more valuable insights to its users.']
API Budget: infinite
Continue (y/n):
EarningsReportGPT  has been created with the following details:
Name:  EarningsReportGPT
Role:  an AI assistant that specializes in providing detailed and highly readable summaries of earnings reports, along with a sentiment rating, to help businesses make informed decisions.
Goals:
-  Retrieve the latest earnings report using the API keys provided in APIkeys.txt.
-  Analyze the report and provide a detailed summary that highlights the most important information in a highly readable format.
-  Assign a sentiment rating from 1 to 10 based on the overall tone of the report, with 1 being very negative and 10 being very positive.
-  Ensure that the summary and sentiment rating are delivered in a timely manner to allow for quick decision-making.
-  Continuously learn and improve its analysis and reporting capabilities to provide even more valuable insights to its users.
Using memory of type:  PineconeMemory
Using Browser:  chrome
 THOUGHTS:  I need to retrieve the latest earnings report using the API keys provided in APIkeys.txt. I will start by reading the API keys from the file and then use them to make a request to the API endpoint.
REASONING:  I need to retrieve the latest earnings report to analyze and provide a summary with a sentiment rating. This will help businesses make informed decisions.
PLAN:
-  Read API keys from APIkeys.txt
-  Make request to API endpoint using the keys
-  Retrieve the latest earnings report
-  Analyze the report and provide a summary with a sentiment rating
-  Deliver the summary and sentiment rating in a timely manner
CRITICISM:  I need to ensure that I am using the correct API keys and that the request to the API endpoint is successful. I also need to make sure that the summary and sentiment rating are accurate and useful to the user.
NEXT ACTION:  COMMAND = read_file ARGUMENTS = {'filename': 'D:\\GPT\\Autogpt\\autogpt\\auto_gpt_workspace\\APIkeys.txt'}
SYSTEM:  Command read_file returned: 
 THOUGHTS:  Now that I have the API keys, I will use them to make a request to the API endpoint and retrieve the latest earnings report.
REASONING:  I need to retrieve the latest earnings report to analyze and provide a summary with a sentiment rating. This will help businesses make informed decisions.
PLAN:
-  Make request to API endpoint using the keys
-  Retrieve the latest earnings report
-  Analyze the report and provide a summary with a sentiment rating
-  Deliver the summary and sentiment rating in a timely manner
CRITICISM:  I need to ensure that the request to the API endpoint is successful and that I am retrieving the correct information. I also need to make sure that the summary and sentiment rating are accurate and useful to the user.
NEXT ACTION:  COMMAND = execute_shell ARGUMENTS = {'command_line': 'curl -H "Authorization: Bearer " -H "Content-Type: application/json" "https://newsapi.org/v2/everything?q=earnings%20report&sortBy=publishedAt&apiKey="'}
Executing command 'curl -H "Authorization: Bearer " -H "Content-Type: application/json" "https://newsapi.org/v2/everything?q=earnings%20report&sortBy=publishedAt&apiKey="' in working directory 'D:\GPT\Autogpt\autogpt\auto_gpt_workspace'
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\GPT\Autogpt\autogpt\__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "C:\Python311\Lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\click\core.py", line 1635, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\GPT\Autogpt\autogpt\cli.py", line 90, in main
    run_auto_gpt(
  File "D:\GPT\Autogpt\autogpt\main.py", line 154, in run_auto_gpt
    agent.start_interaction_loop()
  File "D:\GPT\Autogpt\autogpt\agent\agent.py", line 242, in start_interaction_loop
    self.memory.add(memory_to_add)
  File "D:\GPT\Autogpt\autogpt\memory\pinecone.py", line 47, in add
    vector = get_ada_embedding(data)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\GPT\Autogpt\autogpt\llm_utils.py", line 231, in get_ada_embedding
    embedding = create_embedding(text, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\GPT\Autogpt\autogpt\llm_utils.py", line 46, in _wrapped
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\GPT\Autogpt\autogpt\llm_utils.py", line 257, in create_embedding
    return openai.Embedding.create(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\openai\api_resources\embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\openai\api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\openai\api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "C:\Python311\Lib\site-packages\openai\api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
**openai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 26735 tokens (26735 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.**

erberna commented 1 year ago

import nltk
nltk.download('punkt')

def process_text(text):
    # Split the text into sentences using the nltk library
    sentences = nltk.sent_tokenize(text)

    # Initialize an empty list to store the results
    results = []

    # Process each sentence separately
    for sentence in sentences:
        # Send each sentence to the AI model for embedding
        # (create_embedding_with_ada is assumed to come from Auto-GPT's llm_utils)
        embedding = create_embedding_with_ada(sentence)

        # Append the result to the list of results
        results.append(embedding)

    # Collect the per-sentence embeddings as the output
    return results

# Original code, with the process_text() function added

def run_auto_gpt():
    # ...
    agent.start_interaction_loop()
    # ...

def main():
    run_auto_gpt()

if __name__ == '__main__':
    main()

remriel commented 1 year ago

Still occurring with the latest branch. Full log:

https://gist.githubusercontent.com/remriel/ea20ffcdc1afeb9876fe1c7d7bc7458d/raw/f7dd1a269aa12e85fbf3a3997b247d529ca3f369/errorlog.log

Indranil-Mondal commented 1 year ago

Same error, any fixes?

File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 5063 tokens. Please reduce the length of the messages.

Inc-line commented 1 year ago

It looks like we might need to tokenize each row of the text and save the number of tokens to a new column in a DataFrame, using the tokenizer.encode() method from OpenAI's tiktoken library; see the sketch below.
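Something like this, perhaps (untested sketch; the encoding name is an assumption):

import pandas as pd
import tiktoken

# cl100k_base is the encoding used by the gpt-3.5/gpt-4 chat models
enc = tiktoken.get_encoding("cl100k_base")

df = pd.DataFrame({"text": ["a short row", "a second, somewhat longer row of text"]})
df["n_tokens"] = df["text"].apply(lambda t: len(enc.encode(t)))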

Or just upgrade your OpenAI plan...

I retrieved the technical answer here: https://www.mlq.ai/gpt-4-pinecone-website-ai-assistant/

Question: Should we edit the split_text in browse.py to calculate token length instead of character length, like @PhilipAD mentioned above?

GoZippy commented 1 year ago

Working on a fix for this right this second, hang tight.

Any update? Need better error handling big time.

Pwuts commented 1 year ago

This error should be mitigated by #3646 (a patch rather than a fix). Currently I'm working on improving and fixing the entire memory system, which should fix all of these errors.

EmpathicSage commented 1 year ago

Quite a few other issues have been opened on this topic:

Also, please consider these relevant pull requests:

Furthermore, since PRs are on hold and the core contributors are rearchitecting Auto-GPT, these issues may auto-resolve with the new release.

See also: https://github.com/Significant-Gravitas/Nexus/wiki/Architecting

EncryptShawn commented 1 year ago

I think the issue is that the devs have not integrated tiktoken into the platform; this is why this is happening.

tiktoken will basically count the tokens needed to send your request, and then we can automatically adjust the max tokens we send OpenAI so that it does not try to send back a response that would exceed the maximum token count for your model. Also, some tokens should be left unused to accommodate the small margin of error tiktoken can produce. A sketch of the idea is below.
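A minimal sketch of that budgeting logic (illustrative names; the context window size and safety margin are assumptions):

import tiktoken

CONTEXT_WINDOW = 8192   # e.g. gpt-4
SAFETY_MARGIN = 100     # headroom for tiktoken's small margin of error

def completion_budget(prompt: str, model: str = "gpt-4") -> int:
    # Count the prompt's tokens, then cap the completion so that
    # prompt + completion stays inside the model's context window.
    enc = tiktoken.encoding_for_model(model)
    prompt_tokens = len(enc.encode(prompt))
    return max(0, CONTEXT_WINDOW - prompt_tokens - SAFETY_MARGIN)

The result could then be passed as max_tokens in the ChatCompletion call.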

We have developed an AutoGPT UI that we are about to release open source, and we are debating integrating tiktoken and filing a pull request to bring it into the platform, but we don't want to duplicate the effort if the core dev team is already working on this.

Pwuts commented 1 year ago

I think the issue is that the devs have not integrated tiktoken into the platform; this is why this is happening.

Actually, we have: we use tiktoken to count tokens, chunk and truncate the input, and limit the LLM's output in API calls.

I've been trying to debug the latest reports of this issue, because it should have been fixed in #4652 but some users are still having issues. Unfortunately, nobody has been able to provide us with a full log or other debug info which we need to figure this out, because we can't reproduce it.

github-actions[bot] commented 1 year ago

This issue was closed automatically because it has been stale for 10 days with no activity.