All-Hands-AI / OpenHands


[Bug]: Agent cannot handle exception from exceeding token limit #1299

Closed: andreas-tornqvist closed this issue 7 months ago

andreas-tornqvist commented 7 months ago

Describe the bug

When a prompt grows beyond the token limit of the API I am using, the step crashes. All subsequent steps fail the same way, and the agent enters a death loop. This usually happens after something else has gone wrong, for example when the -y flag is missing from an apt-get command, which leaves it hanging on a user-input prompt until it is cancelled.
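
For reference, the error that triggers the loop is litellm's ContextWindowExceededError; a guard along these lines would break the loop by shrinking the prompt instead of resending it unchanged (a minimal sketch of the idea only, not OpenDevin's actual fix; the function name and trimming policy are my assumptions):

```python
import litellm
from litellm.exceptions import ContextWindowExceededError

def completion_with_trim(model: str, messages: list[dict]):
    """Call the LLM; on context overflow, trim instead of retrying as-is."""
    while True:
        try:
            return litellm.completion(model=model, messages=messages)
        except ContextWindowExceededError:
            if len(messages) <= 2:
                raise  # nothing left to trim, give up for real
            # drop the oldest non-system message and retry with a smaller prompt
            del messages[1]
```

The failing run looks like this: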

==============
STEP 67

06:07:16 - opendevin:INFO: agent_controller.py:89
PLAN
I want a simple HTML/Javascript/CSS/Canvas application, with a date picker. Every time I pick a date,
I want to know what week number the date has, according to the Swedish calendar.
Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/litellm/llms/azure.py", line 283, in completion
    response = azure_client.chat.completions.create(**data, timeout=timeout)  # type: ignore
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 667, in create
    return self._post(
           ^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1233, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 922, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1013, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, your messages resulted in 9689 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/litellm/main.py", line 836, in completion
    response = azure_chat_completions.completion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/litellm/llms/azure.py", line 305, in completion
    raise AzureOpenAIError(status_code=e.status_code, message=str(e))
litellm.llms.azure.AzureOpenAIError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, your messages resulted in 9689 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/agenthub/monologue_agent/utils/monologue.py", line 73, in condense
    resp = llm.completion(messages=messages)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/app/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/app/opendevin/llm/llm.py", line 48, in wrapper
    resp = completion_unwrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/litellm/utils.py", line 2947, in wrapper
    raise e
  File "/app/.venv/lib/python3.12/site-packages/litellm/utils.py", line 2845, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/litellm/main.py", line 2127, in completion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/litellm/utils.py", line 8573, in exception_type
    raise e
  File "/app/.venv/lib/python3.12/site-packages/litellm/utils.py", line 8415, in exception_type
    raise ContextWindowExceededError(
litellm.exceptions.ContextWindowExceededError: AzureException - Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, your messages resulted in 9689 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
06:07:17 - opendevin:ERROR: agent_controller.py:110 - Error condensing thoughts: AzureException - Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, your messages resulted in 9689 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
06:07:17 - opendevin:INFO: agent_controller.py:160
OBSERVATION
Error condensing thoughts: AzureException - Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, your messages resulted in 9689 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

==============
STEP 68

06:07:17 - opendevin:INFO: agent_controller.py:89
PLAN
I want a simple HTML/Javascript/CSS/Canvas application, with a date picker. Every time I pick a date,
I want to know what week number the date has, according to the Swedish calendar.
Traceback (most recent call last):

[...]

Current Version

I'm running Docker with image tag 0.3.1.

Installation and Configuration

I'm on Windows 10, running a .bat script as:

docker run ^
  -v <MyUserPath>\AppData\Local\Programs\Python\Python312\Lib\site-packages\certifi\:/app/.venv/lib/python3.12/site-packages/certifi ^
  --env-file .\config.toml ^
  -v .\config.toml ^
  -e WORKSPACE_MOUNT_PATH=<WORKSPACE> ^
  -v <WORKSPACE>:/opt/workspace_base ^
  -v /var/run/docker.sock:/var/run/docker.sock ^
  -p 3000:3000 ^
  --add-host host.docker.internal=host-gateway ^
  ghcr.io/opendevin/opendevin:0.3.1

I'm behind a company firewall, which is why I'm mounting the SSL certificates for Python.

config.toml file

# This is a template. Run `cp config.toml.template config.toml` to use it.

LLM_BASE_URL=https://<resource>.openai.azure.com
LLM_API_KEY=<KEY>
LLM_MODEL=azure/sandbox-test1-gpt-35-turbo
LLM_DEPLOYMENT_NAME=sandbox-test1-gpt-35-turbo
WORKSPACE_DIR=<WORKSPACE>
AZURE_API_VERSION=2024-02-01
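
For context, litellm consumes these values roughly like this; a minimal sketch of the call OpenDevin ends up making (the exact wiring inside OpenDevin is an assumption on my part):

```python
import os
import litellm

# sketch: how the config values above reach litellm's Azure backend
resp = litellm.completion(
    model="azure/sandbox-test1-gpt-35-turbo",    # LLM_MODEL: "azure/<deployment>"
    api_base=os.environ["LLM_BASE_URL"],         # https://<resource>.openai.azure.com
    api_key=os.environ["LLM_API_KEY"],
    api_version=os.environ.get("AZURE_API_VERSION"),
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```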


enyst commented 7 months ago

Thanks for the report. I think the death loop from ContextWindowExceededError has been fixed in main recently, though that must have been after 0.3.1. The fix stops the death loop, but it doesn't handle the error smartly.

OpenDevin has a hardcoded limit at which we try to summarize (condense) the prompt to preempt this kind of error, but we do that only via a request to the LLM, so if the prompt has already grown past the context limit, that summarization request will itself fail.
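
To illustrate the gap, here is a minimal sketch of that condense step; the constant, threshold, and summary prompt are assumptions for illustration, not the repo's actual code:

```python
import litellm

MAX_CHARS = 32_000  # hypothetical hardcoded trigger for condensing

def condense(model: str, messages: list[dict]) -> list[dict]:
    """Summarize the monologue once it grows past a hardcoded size.

    The catch described above: the summarization call is itself an LLM
    request, so if `messages` already exceeds the model's context window,
    it raises ContextWindowExceededError instead of shrinking the prompt.
    """
    if sum(len(m["content"]) for m in messages) < MAX_CHARS:
        return messages  # under the trigger; nothing to condense
    ask = messages + [{"role": "user", "content": "Summarize the conversation so far."}]
    resp = litellm.completion(model=model, messages=ask)  # can itself blow the window
    return [{"role": "system", "content": resp.choices[0].message.content}]
```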

andreas-tornqvist commented 7 months ago

Great to hear that the death loop has been stopped. Thanks for the information.

SmartManoj commented 7 months ago

https://github.com/OpenDevin/OpenDevin/blob/960f17a565081d8ce65a443f857c4299eb864a14/agenthub/monologue_agent/agent.py#L32

Set the value defined there to 8000.
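
Presumably the idea is a token-count ceiling just under the model's 8192-token window, so condensing triggers before the API rejects the request. A sketch of that check (the constant and function names here are assumptions, not the code at the linked line):

```python
import litellm

MAX_TOKEN_COUNT = 8000  # just under gpt-3.5-turbo's 8192-token context window

def over_budget(model: str, messages: list[dict]) -> bool:
    # litellm.token_counter estimates the prompt's token usage for `model`
    return litellm.token_counter(model=model, messages=messages) > MAX_TOKEN_COUNT
```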