Using o1-mini (and o1-preview?) from GitHub Models: Uncaught exception tokens limit reached #2296

Open · Jasmin68k opened this issue 3 weeks ago

Jasmin68k commented 3 weeks ago

Issue

Hello!

Using o1-mini from https://github.com/marketplace/models with GITHUB_API_KEY (I couldn't try o1-preview at the moment due to rate limits, but I'd assume it suffers from the same issue), aider throws an uncaught `tokens_limit_reached` exception for requests over 4k tokens, even though a longer response of over 4k tokens had come through fully just before.

The actual limits should be much higher:

https://github.com/marketplace/models/azure-openai/o1-mini
https://github.com/marketplace/models/azure-openai/o1-preview
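
As a sanity check, the 413 should be reproducible outside of aider with a plain openai client. A minimal sketch (not part of the original report; it assumes the GitHub Models inference endpoint https://models.inference.ai.azure.com, which litellm's github/ provider targets):

```python
# Sketch: trigger the 413 directly against GitHub Models, bypassing aider.
# Assumes the endpoint used by litellm's `github/` provider.
import os

from openai import OpenAI, APIStatusError

client = OpenAI(
    base_url="https://models.inference.ai.azure.com",  # GitHub Models endpoint
    api_key=os.environ["GITHUB_API_KEY"],
)

# Repeated words adding up to well over the 4k input tokens in the error.
long_prompt = "token " * 5000

try:
    client.chat.completions.create(
        model="o1-mini",
        messages=[{"role": "user", "content": long_prompt}],
    )
except APIStatusError as err:
    # Expected: HTTP 413 with code 'tokens_limit_reached', as in the log below.
    print(err.status_code, err.body)
```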

See the log for more details (unedited, except placeholders for prompt/response):

aider --model=github/o1-mini
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Warning for github/o1-mini: Unknown context window size and costs, using sane defaults.
You can skip this check with --no-show-model-warnings

https://aider.chat/docs/llms/warnings.html
Open documentation url for more info? (Y)es/(N)o [Yes]: n                                                             

Warning: Streaming is not supported by github/o1-mini. Disabling streaming.
Aider v0.62.1
Model: github/o1-mini with whole edit format
Git repo: .git with 358 files
Repo-map: disabled
Use /help <question> for help, run "aider --help" to see cmd line args
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> [...PROMPT...]

Add README.md to the chat? (Y)es/(N)o/(D)on't ask again [Yes]:                                                        

[...RESPONSE...]

Tokens: 2.7k sent, 7.4k received.
Applied edit to README.md

# Uncaught APIError in exception_mapping_utils.py line 404

Aider version: 0.62.1
Python version: 3.12.7
Platform: Linux-6.11.5-amd64-x86_64-with-glibc2.40
Python implementation: CPython
Virtual environment: Yes
OS: Linux 6.11.5-amd64 (64bit)
Git version: git version 2.45.2

An uncaught exception occurred:

Traceback (most recent call last):
  File "openai.py", line 854, in completion
    raise e
  File "openai.py", line 790, in completion
    self.make_sync_openai_chat_completion_request(
  File "openai.py", line 649, in make_sync_openai_chat_completion_request
    raise e
  File "openai.py", line 631, in make_sync_openai_chat_completion_request
    raw_response = openai_client.chat.completions.with_raw_response.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_legacy_response.py", line 356, in wrapped
    return cast(LegacyAPIResponse[R], func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^
  File "_utils.py", line 274, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "completions.py", line 815, in create
    return self._post(
           ^^^^^^^^^^^
  File "_base_client.py", line 1277, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_base_client.py", line 954, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.APIStatusError: Error code: 413 - {'error': {'code': 'tokens_limit_reached', 'message': 'Request body too large for o1-mini model. Max size: 4000 tokens.', 'details': 'Request body too large for o1-mini model. Max size: 4000 tokens.'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 1587, in completion
    raise e
  File "main.py", line 1540, in completion
    response = openai_o1_chat_completions.completion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "o1_handler.py", line 58, in completion
    response = super().completion(
               ^^^^^^^^^^^^^^^^^^^
  File "openai.py", line 864, in completion
    raise OpenAIError(
litellm.llms.OpenAI.openai.OpenAIError: Error code: 413 - {'error': {'code': 'tokens_limit_reached', 'message': 'Request body too large for o1-mini model. Max size: 4000 tokens.', 'details': 'Request body too large for o1-mini model. Max size: 4000 tokens.'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "aider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "main.py", line 827, in main
    coder.run()
  File "base_coder.py", line 738, in run
    self.run_one(user_message, preproc)
  File "base_coder.py", line 781, in run_one
    list(self.send_message(message))
  File "base_coder.py", line 1278, in send_message
    saved_message = self.auto_commit(edited)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "base_coder.py", line 1979, in auto_commit
    res = self.repo.commit(fnames=edited, context=context, aider_edits=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "repo.py", line 110, in commit
    commit_message = self.get_commit_message(diffs, context)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "repo.py", line 195, in get_commit_message
    commit_message = simple_send_with_retries(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sendchat.py", line 118, in simple_send_with_retries
    _hash, response = send_completion(**kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sendchat.py", line 98, in send_completion
    res = litellm.completion(**kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "utils.py", line 1013, in wrapper
    raise e
  File "utils.py", line 903, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "main.py", line 2999, in completion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "exception_mapping_utils.py", line 2116, in exception_type
    raise e
  File "exception_mapping_utils.py", line 404, in exception_type
    raise APIError(
litellm.exceptions.APIError: litellm.APIError: APIError: GithubException - Error code: 413 - {'error': {'code': 'tokens_limit_reached', 'message': 'Request body too large for o1-mini model. Max size: 4000 tokens.', 'details': 'Request body too large for o1-mini model. Max size: 4000 tokens.'}}

Version and model info

No response

paul-gauthier commented 2 weeks ago

Thanks for trying aider and filing this issue.

Request body too large for o1-mini model. Max size: 4000 tokens

You need to add fewer files to the chat. You're hitting a token limit on the GitHub API.
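
For example, aider's built-in chat commands can show and shrink the chat context (the filename is just a placeholder):

```
> /tokens                 # report tokens used by the chat context, per file
> /drop path/to/file.py   # remove a file from the chat
```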

Jasmin68k commented 2 weeks ago

Thank you!

My issue's main concern was that aider throws an uncaught exception rather than handling the error gracefully.

Also, I was wondering whether the actual cause of the issue might be on aider's side in some way, since the documented token limit is much higher and a response of more than 4k tokens actually came through fully.

Jasmin68k commented 2 weeks ago

Got the same error with github/gpt-4o, now at 8k tokens, which should also have a higher limit (https://github.com/marketplace/models/azure-openai/gpt-4o):

# Uncaught APIError in exception_mapping_utils.py line 404

Aider version: 0.62.1
Python version: 3.12.7
Platform: Linux-6.11.7-amd64-x86_64-with-glibc2.40
Python implementation: CPython
Virtual environment: Yes
OS: Linux 6.11.7-amd64 (64bit)
Git version: git version 2.45.2

An uncaught exception occurred:

Traceback (most recent call last):
  File "openai.py", line 854, in completion
    raise e
  File "openai.py", line 790, in completion
    self.make_sync_openai_chat_completion_request(
  File "openai.py", line 649, in make_sync_openai_chat_completion_request
    raise e
  File "openai.py", line 631, in make_sync_openai_chat_completion_request
    raw_response = openai_client.chat.completions.with_raw_response.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_legacy_response.py", line 356, in wrapped
    return cast(LegacyAPIResponse[R], func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^
  File "_utils.py", line 274, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "completions.py", line 815, in create
    return self._post(
           ^^^^^^^^^^^
  File "_base_client.py", line 1277, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_base_client.py", line 954, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.APIStatusError: Error code: 413 - {'error': {'code': 'tokens_limit_reached', 'message': 'Request body too large for gpt-4o model. Max size: 8000 tokens.', 'details': 'Request body too large for gpt-4o model. Max size: 8000 tokens.'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 1587, in completion
    raise e
  File "main.py", line 1560, in completion
    response = openai_chat_completions.completion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "openai.py", line 864, in completion
    raise OpenAIError(
litellm.llms.OpenAI.openai.OpenAIError: Error code: 413 - {'error': {'code': 'tokens_limit_reached', 'message': 'Request body too large for gpt-4o model. Max size: 8000 tokens.', 'details': 'Request body too large for gpt-4o model. Max size: 8000 tokens.'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "aider", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "main.py", line 827, in main
    coder.run()
  File "base_coder.py", line 738, in run
    self.run_one(user_message, preproc)
  File "base_coder.py", line 781, in run_one
    list(self.send_message(message))
  File "base_coder.py", line 1278, in send_message
    saved_message = self.auto_commit(edited)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "base_coder.py", line 1979, in auto_commit
    res = self.repo.commit(fnames=edited, context=context, aider_edits=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "repo.py", line 110, in commit
    commit_message = self.get_commit_message(diffs, context)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "repo.py", line 195, in get_commit_message
    commit_message = simple_send_with_retries(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sendchat.py", line 118, in simple_send_with_retries
    _hash, response = send_completion(**kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sendchat.py", line 98, in send_completion
    res = litellm.completion(**kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "utils.py", line 1013, in wrapper
    raise e
  File "utils.py", line 903, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "main.py", line 2999, in completion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "exception_mapping_utils.py", line 2116, in exception_type
    raise e
  File "exception_mapping_utils.py", line 404, in exception_type
    raise APIError(
litellm.exceptions.APIError: litellm.APIError: APIError: GithubException - Error code: 413 - {'error': {'code': 'tokens_limit_reached', 'message': 'Request body too large for gpt-4o model. Max size: 8000 tokens.', 'details': 'Request body too large for gpt-4o model. Max size: 8000 tokens.'}}

Is GitHub's documentation regarding token limits just wrong? Is something about GitHub Models missing from https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json? Or is this a bug in aider?
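
If it's missing litellm metadata, one thing I could try is a `.aider.model.metadata.json` file, which aider reads to learn a model's limits (same schema as litellm's model_prices_and_context_window.json). A sketch using only the input caps the two 413 errors report; the values are guesses at what the GitHub endpoint actually enforces:

```json
{
  "github/o1-mini": {
    "max_input_tokens": 4000,
    "litellm_provider": "github",
    "mode": "chat"
  },
  "github/gpt-4o": {
    "max_input_tokens": 8000,
    "litellm_provider": "github",
    "mode": "chat"
  }
}
```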

In any case, it would be nice if aider handled this error more gracefully.
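
For instance, something along these lines around the commit-message call (a hypothetical sketch based on the traceback above, not aider's actual code):

```python
# Hypothetical sketch: wrap the commit-message request made from repo.py's
# get_commit_message(), per the traceback above, so a provider-side 413
# degrades to a warning instead of an uncaught exception.
import litellm


def get_commit_message_safely(simple_send, messages):
    try:
        return simple_send(messages)
    except litellm.exceptions.APIError as err:
        # e.g. the 'tokens_limit_reached' 413 from GitHub Models
        print(f"Warning: commit message generation failed: {err}")
        return None  # caller can fall back to a default commit message
```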

paul-gauthier commented 1 day ago

I'm labeling this issue as stale because it has been open for 2 weeks with no activity. If there are no additional comments, I will close it in 7 days.

Note: A bot script made these updates to the issue.