[Bug] Too many requests, please wait before trying again.

kkportfoliokk commented 2 months ago

📦 Environment

Vercel

📌 Version

v1.17.7

💻 Operating System

macOS

🌐 Browser

Chrome

🐛 Bug Description

"Too many requests, please wait before trying again." occurs when using anthropic.claude-3-opus-20240229-v1:0. Throttling kicks in way too fast, and I suspect there is a bug on the client side.

{
  "error": {
    "body": {
      "httpStatusCode": 429,
      "requestId": "",
      "attempts": 3,
      "totalRetryDelay": 1280
    },
    "message": "Too many requests, please wait before trying again.",
    "type": "ThrottlingException"
  },
  "provider": "bedrock",
  "region": "us-west-2"
}
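For anyone triaging a similar report, here is a minimal sketch of how a payload like the one above could be classified on the client. The `ProviderErrorPayload` interface and the `isThrottlingError` helper are hypothetical names made up for illustration, not part of lobe-chat; the field names are copied from the payload itself.

```typescript
// Hypothetical type covering only the fields visible in the payload above.
interface ProviderErrorPayload {
  error: {
    body?: {
      httpStatusCode?: number;
      requestId?: string;
      attempts?: number;
      totalRetryDelay?: number;
    };
    message?: string;
    type?: string;
  };
  provider?: string;
  region?: string;
}

// Hypothetical helper: treats the payload as throttling when either the
// error type or the HTTP status code says so.
function isThrottlingError(payload: ProviderErrorPayload): boolean {
  return (
    payload.error.type === 'ThrottlingException' ||
    payload.error.body?.httpStatusCode === 429
  );
}

// The payload from this report, reproduced as a TypeScript value.
const sample: ProviderErrorPayload = {
  error: {
    body: { httpStatusCode: 429, requestId: '', attempts: 3, totalRetryDelay: 1280 },
    message: 'Too many requests, please wait before trying again.',
    type: 'ThrottlingException',
  },
  provider: 'bedrock',
  region: 'us-west-2',
};

console.log(isThrottlingError(sample)); // true
```

Note that `attempts: 3` in the payload suggests the request had already been retried before the error surfaced, so a check like this only identifies the error; it does not explain why it fires.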

It looks like setting a large value for max_tokens_to_sample triggers a throttling_exception: Too many requests error. The limit for max_tokens_to_sample is 4096; any value greater than that produces the error.

📷 Recurrence Steps

Select anthropic.claude-3-opus-20240229-v1:0
Send a message
Send another message right after the reply is displayed.

🚦 Expected Behavior

No response

📝 Additional Information

No response

lobehubbot commented 2 months ago

👀 @kkportfoliokk

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

dosubot[bot] commented 2 months ago

Hey @kkportfoliokk! 👋 I'm here to help you with any bugs, questions, or contributions you have. Let's tackle this issue together while we wait for a human maintainer.

To resolve the "Too many requests, please wait before trying again" error when using anthropic.claude-3-opus-20240229-v1:0 on Vercel, ensure that the max_tokens_to_sample value does not exceed the model's maxOutput limit, which is 4096 tokens. Setting a value greater than 4096 will trigger the throttling exception.

Here is the relevant configuration for the model:

{
  description:
    'Claude 3 Opus is the most powerful model from Anthropic for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.',
  displayName: 'Claude 3 Opus',
  enabled: true,
  functionCall: true,
  id: 'claude-3-opus-20240229',
  maxOutput: 4096,
  pricing: {
    input: 15,
    output: 75,
  },
  releasedAt: '2024-02-29',
  tokens: 200_000,
  vision: true,
}

To fix the issue, set the max_tokens_to_sample value to 4096 or less:

const max_tokens_to_sample = 4096; // or any value <= 4096

This adjustment should prevent the "Too many requests, please wait before trying again" error ^[1].
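Rather than hard-coding the value, the cap could be read from the model configuration. The following is only a sketch under assumptions: `ModelCard` is a hypothetical type that mirrors the `id` and `maxOutput` fields of the config shown above, and `clampMaxTokens` is an illustrative helper, not an existing lobe-chat function.

```typescript
// Hypothetical subset of the model config shown above.
interface ModelCard {
  id: string;
  maxOutput?: number;
}

const claude3Opus: ModelCard = {
  id: 'claude-3-opus-20240229',
  maxOutput: 4096,
};

// Illustrative helper: keeps the requested output-token budget within
// [1, model.maxOutput]. If the model declares no maxOutput, the requested
// value is returned unchanged.
function clampMaxTokens(requested: number, model: ModelCard): number {
  const limit = model.maxOutput;
  if (limit === undefined) return requested;
  return Math.min(Math.max(1, Math.floor(requested)), limit);
}

console.log(clampMaxTokens(8192, claude3Opus)); // 4096
console.log(clampMaxTokens(1024, claude3Opus)); // 1024
```

Clamping silently changes what the user asked for, so a real client might prefer to surface a validation message instead; which behavior fits is a design choice for the maintainers.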


lobehub / lobe-chat