Aider-AI / aider

aider is AI pair programming in your terminal
https://aider.chat/

Improved Sonnet 3.5 via Bedrock #2004

Open xcke opened 2 weeks ago

xcke commented 2 weeks ago

Hi,

There have been a lot of improvements around Sonnet 3.5; however, when using it via Bedrock (which serves the older Sonnet 3.5 version) I keep running into the token limit error:

Model bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0 has hit a token limit!
Token counts below are approximate.

Input tokens: ~9,516 of 200,000
Output tokens: ~3,019 of 4,096 -- possibly exceeded output limit!
Total tokens: ~12,535 of 200,000

For more info: https://aider.chat/docs/troubleshooting/token-limits.html

I have tried using this model config:

cat ~/.aider.models.yml
- accepts_images: false
  cache_control: true
  caches_by_default: false
  edit_format: diff
  editor_edit_format: editor-diff
  editor_model_name: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
  examples_as_sys_msg: true
  extra_params:
    extra_headers:
      # anthropic-beta: prompt-caching-2024-07-31
      anthropic-version: 2023-06-01
      anthropic-beta: max-tokens-3-5-sonnet-2024-07-15
    max_tokens: 8192
  lazy: true
  name: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
  reminder: user
  send_undo_reply: false
  streaming: true
  use_repo_map: true
  use_system_prompt: true
  use_temperature: true
  weak_model_name: openai/gpt-4o-mini

I know AWS is not serving the latest model version, so by default it is capped at 4k output tokens.

Is there a good workaround or model configuration option to get around this?

fry69 commented 2 weeks ago

Thank you for filing this issue.

cat ~/.aider.models.yml

The correct filename for overriding model configurations is .aider.model.settings.yml, see here -> https://aider.chat/docs/config/adv-model-settings.html#model-settings
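
For example, a minimal entry under that filename, adapted from the config above (a sketch only, untested; keys per the linked docs):

# ~/.aider.model.settings.yml
- name: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
  edit_format: diff
  use_repo_map: true
  extra_params:
    max_tokens: 4096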

xcke commented 2 weeks ago

I fixed the file name and also updated it to only include supported configuration options.

However, I am still getting token limit errors. Even when lowering max_tokens, the effect is the same:

- cache_control: true
  edit_format: diff
  editor_edit_format: editor-diff
  editor_model_name: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
  examples_as_sys_msg: true
  extra_params:
    extra_headers:
      anthropic-beta: prompt-caching-2024-07-31
      # anthropic-beta: max-tokens-3-5-sonnet-2024-07-15
    max_tokens: 4092
  name: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
  reminder: user
  use_repo_map: true
  weak_model_name: openai/gpt-4o-mini

Is there a way to add prefill support here?

paul-gauthier commented 2 weeks ago

You need to use the model metadata json file to indicate that it supports prefill.
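
For context on why prefill matters here: when a response hits the output token cap, aider can resend the partial response as a pre-filled assistant turn and ask the model to continue, stitching together outputs longer than the 4k cap. The request then has roughly this shape (a sketch of the Anthropic-style messages payload, not aider's exact internals):

{
    "messages": [
        {"role": "user", "content": "...original request..."},
        {"role": "assistant", "content": "...partial output so far..."}
    ]
}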

xcke commented 2 weeks ago

Thanks @paul-gauthier. I think this solved my issue:

{
    "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0": {
        "max_tokens": 4192,
        "max_input_tokens": 128000,
        "max_output_tokens": 4192,
        "input_cost_per_token": 0.000003,
        "output_cost_per_token": 0.000015,
        "litellm_provider": "bedrock",
        "supports_assistant_prefill": true,
        "supports_prompt_caching": true,
        "mode": "chat"
    }
}
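
For anyone who lands here later: aider reads this metadata from a .aider.model.metadata.json file (searched for in the home directory, the git repo root, and the current directory, per the advanced model settings docs), or you can point to it explicitly, roughly:

aider --model bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0 \
      --model-metadata-file ~/.aider.model.metadata.json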