saoudrizwan / claude-dev

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, and more with your permission every step of the way.
https://marketplace.visualstudio.com/items?itemName=saoudrizwan.claude-dev
MIT License
4.04k stars 381 forks

Number of request tokens has exceeded #124

Closed LeonMatch closed 1 week ago

LeonMatch commented 3 weeks ago

I am getting this error often during development: "Number of request tokens has exceeded your daily rate limit". How is it possible to reduce request tokens in communication with Claude Dev? Is there caching or something? What am I missing?

Also, Claude Dev produces a lot of unnecessary text, like the "Task Completed" summary, even when the task is not complete yet. I might be testing the latest changes in the code and may come back with a bug. Would reducing that help with the "Number of request tokens has exceeded" error?

I got the following limits in Anthropic:

Claude 3.5 Sonnet | 50 requests per minute (I am not using that much) | 40,000 tokens per minute | 1,000,000 tokens per day
Claude 3 Sonnet | 50 requests per minute | 40,000 tokens per minute | 1,000,000 tokens per day
Claude 3 Haiku | 50 requests per minute | 50,000 tokens per minute | 5,000,000 tokens per day
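As a rough worked example of how the per-minute and per-day limits quoted above interact, here is a small arithmetic sketch. The 20,000-token request size is purely a hypothetical figure chosen for illustration, and this says nothing about how Anthropic actually meters usage.

```python
# Limits as quoted above for Claude 3.5 Sonnet.
TOKENS_PER_MINUTE = 40_000
TOKENS_PER_DAY = 1_000_000


def minutes_until_daily_cap(tokens_per_minute: int, tokens_per_day: int) -> float:
    """Minutes of sustained usage at the per-minute cap before the daily cap is hit."""
    return tokens_per_day / tokens_per_minute


def requests_per_day(tokens_per_day: int, tokens_per_request: int) -> int:
    """Whole requests of a given size that fit in the daily budget."""
    return tokens_per_day // tokens_per_request


print(minutes_until_daily_cap(TOKENS_PER_MINUTE, TOKENS_PER_DAY))  # 25.0
# Hypothetical request size of 20,000 tokens (context plus prompt).
print(requests_per_day(TOKENS_PER_DAY, 20_000))  # 50
```

In other words, under these numbers the daily budget would last only about 25 minutes of usage at the per-minute ceiling.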

It is Claude 3.5 Sonnet I am using in Claude Dev now. If I change it to Claude 3 Haiku, which has greater limits, will it reduce the quality of Claude Dev performance?

ichoosetoaccept commented 3 weeks ago

I came here to open an issue about avoiding this type of message: API Request Failed

429 {"type":"error","error":{"type":"rate_limit_error","message":"Number of request tokens has exceeded your per-minute rate limit (https://docs.anthropic.com/en/api/rate-limits); see the response headers for current usage. Please reduce the prompt length or the maximum tokens requested, or try again later. You may also contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase."}}
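One common way a client can react to a 429 like the one above is to retry with exponential backoff. The following is a minimal sketch, not what Claude Dev does: `RateLimitError`, `call_with_backoff`, and the `retry_after` attribute are hypothetical names for illustration, and whether the server supplies a usable wait hint is an assumption.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 rate_limit_error response."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited (429)")
        # Seconds to wait, if the server supplied a hint (assumption).
        self.retry_after = retry_after


def call_with_backoff(
    send: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `send`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # trust the server's hint when present
            else:
                # 1s, 2s, 4s, ... capped at max_delay, plus up to 10% jitter
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay += random.uniform(0, delay * 0.1)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would wrap the actual API request in `send`.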

It feels like this should be avoidable. Either grab the current user's rate limits (per-minute and any other types) and make sure the extension abides by them (perhaps even telling the user something like "waiting to avoid Anthropic rate limiting"), or, if Anthropic's API does not support reading out these limits, ask the user to check them manually in their Anthropic account and provide a way to enter them in the extension settings. Then ensure the extension respects those limits.
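The "wait to avoid rate limiting" idea above could be sketched as a sliding-window limiter that the client consults before each request. This is an illustration under assumptions, not the extension's implementation: `TokenRateLimiter` is a hypothetical class, the limit is assumed to be entered by the user, and the token count of each request is assumed to be estimated up front.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class TokenRateLimiter:
    """Client-side sliding-window limiter for a tokens-per-minute budget."""

    def __init__(
        self,
        tokens_per_minute: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.limit = tokens_per_minute
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self._used = 0

    def _expire(self, now: float) -> None:
        # Drop requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def wait_time(self, tokens: int) -> float:
        """Seconds until `tokens` would fit in the window (0.0 if they fit now)."""
        if tokens > self.limit:
            raise ValueError("request is larger than the per-minute limit")
        now = self._clock()
        self._expire(now)
        used = self._used
        if used + tokens <= self.limit:
            return 0.0
        # Find the oldest event whose expiry frees enough room.
        for timestamp, spent in self._events:
            used -= spent
            if used + tokens <= self.limit:
                return max(0.0, timestamp + self.window - now)
        return 0.0

    def acquire(self, tokens: int) -> float:
        """Block until `tokens` fit, record them, and return the time waited."""
        waited = 0.0
        while True:
            wait = self.wait_time(tokens)
            if wait <= 0:
                break
            self._sleep(wait)
            waited += wait
        self._events.append((self._clock(), tokens))
        self._used += tokens
        return waited
```

A caller could show a "waiting to avoid rate limiting" message whenever `wait_time` returns a positive value, then call `acquire` before sending the request.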

I feel my issue is similar enough to this issue that opening a new one was not warranted.

go-run-jump commented 3 weeks ago

I'd like to add some thoughts to this discussion regarding token usage and rate limits:

  1. Complex feature development often requires sending substantial context to the API repeatedly, as initial solutions may need refinement or pivoting. This can quickly consume tokens.

  2. Anthropic's rate limits for tiers 1-3 are quite restrictive for intensive development work. Even tier 3 (5M tokens/day) can be insufficient for completing moderately complex tasks.

  3. With the introduction of prompt caching, Claude Dev's approach becomes more feasible and cost-effective. However, further optimizations may be necessary.

Personal experience: I've hit the tier 3 limit (5 million tokens) before completing a relatively straightforward addition to my project. This highlights how quickly these limits can be reached during active development.

Potential solutions and suggestions:

  1. Short-term fix: Upgrading to tier 4 ($400 deposit) provides 50M tokens/day, which should be sufficient for most development needs.

  2. Request to Anthropic: We should collectively ask Anthropic to reconsider how cache hits count against daily limits. There's a strong case for reducing the impact of cached responses on token quotas.

  3. Optimization: We should explore ways to further reduce token usage without compromising functionality.

  4. User controls: Implementing user-configurable settings for rate limit management, as suggested by @ichoosetoaccept, could help users stay within their specific limits.

  5. Transparency: Adding visual indicators for token usage and rate limit status within the extension would help developers manage their consumption more effectively.

These steps could significantly improve the development experience while working within Anthropic's constraints.
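To make the transparency suggestion concrete, here is a minimal sketch of a usage meter that a status indicator could be built on. `DailyUsage`, the 80% warning threshold, and the example figures are hypothetical; which tokens actually count toward the daily limit is not settled in this thread, so the caller decides what to record.

```python
from dataclasses import dataclass


@dataclass
class DailyUsage:
    """Tracks tokens recorded against a user-supplied daily limit."""

    daily_limit: int
    used: int = 0

    def record(self, tokens: int) -> None:
        # The caller decides which tokens count (an assumption of this sketch).
        self.used += tokens

    @property
    def remaining(self) -> int:
        return max(0, self.daily_limit - self.used)

    def status(self, warn_at: float = 0.8) -> str:
        """One-line summary suitable for a status bar or log line."""
        fraction = self.used / self.daily_limit
        if fraction >= 1.0:
            label = "LIMIT REACHED"
        elif fraction >= warn_at:
            label = "WARNING"
        else:
            label = "OK"
        return f"{self.used:,} / {self.daily_limit:,} tokens ({fraction:.0%}) {label}"


usage = DailyUsage(daily_limit=5_000_000)  # tier 3 daily figure from this thread
usage.record(4_100_000)                    # hypothetical usage so far today
print(usage.status())  # 4,100,000 / 5,000,000 tokens (82%) WARNING
```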

shimr0d commented 3 weeks ago

Thanks for these approaches!

  1. Short-term fix: Upgrading to tier 4 ($400 deposit) provides 50M tokens/day, which should be sufficient for most development needs.
  2. Request to Anthropic: We should collectively ask Anthropic to reconsider how cache hits count against daily limits. There's a strong case for reducing the impact of cached responses on token quotas.

1 - I changed my tier to Build Tier 4 ($400+ deposit) yesterday, as I have been reaching the limit frequently these last few days. I guess there's a delay before it takes effect... not very clear from their info.
2 - Nice idea, count me in if I can help ;)

go-run-jump commented 3 weeks ago

@shimr0d There is a waiting period of 14 days from the time you made your first deposit. That could be what is happening to you; otherwise it might be pure luck. For me, the upgrade from Tier 3 to Tier 4 was almost instant.

danividalg commented 2 weeks ago

Hi @saoudrizwan :) It would be great if you could add a configuration field for the rate limits, so Claude Dev can wait when needed to avoid exceeding the limit and getting this type of error. What do you think? :)

saoudrizwan commented 1 week ago

I suggest using OpenRouter to get around these rate limit issues. They are working on adding prompt caching, which would make them the best API option to use. Adding rate limit information is on the roadmap: https://github.com/saoudrizwan/claude-dev/issues/186