plandex-ai / plandex

AI driven development in your terminal. Designed for large, real-world tasks.
https://plandex.ai
GNU Affero General Public License v3.0
10.65k stars · 740 forks

Rate Limit being hit consistently #123

Closed cfieandres closed 4 months ago

cfieandres commented 4 months ago

After updating to 1.0.0, Plandex has consistently hit OpenAI's rate limit, which forces me to run `plandex c` frequently. Is there a way to have Plandex wait until it is no longer rate limited and then continue by itself?

🚨 Server error → Error starting reply stream → Error, status code → 429, message → Rate limit reached for gpt-4o in organization xxxxxxxxxx on tokens per min (TPM) → Limit 30000, Used 23193, Requested 15447. Please try again in 17.28s. Visit https://platform.openai.com/account/rate-limits to learn more.

danenania commented 4 months ago

Parsing the rate limit error and waiting accordingly is a good idea. I'll look into it.

benrosenblum commented 4 months ago

Worth noting that after you have spent $50 total on your account, the limit increases to 450k TPM.

atljoseph commented 4 months ago

Same here. How can I tell the amount of usage by each model in the set? I checked OpenAI usage and nothing was alarming there.

atljoseph commented 4 months ago

If you are running it locally, you can try changing the retry waitBackoff func to add an extra 5 seconds for each numRetry.

I got a lot fewer 429s that way. For heavier tasks, you might need 10 seconds.

Lastly, this might be a great use case for balancing between anthropic and OpenAI “agents”.

danenania commented 4 months ago

@atljoseph I decreased the backoff a bit in the last release so I may need to revert that or make it configurable. Or just parse the error message and wait accordingly as @cfieandres suggested. My token limit is quite high from building/testing Plandex so I'm not getting any of these errors. It’s helpful to know what backoff is working for you at a lower limit--thanks.

atljoseph commented 4 months ago

It’s working great. I’ll take a bit slower any day over 429s. Yeah, we do some parsing similar to that at work; regexp is great for this use case.

atljoseph commented 4 months ago

Well, it does sometimes still hit this error. I noticed it after a few retried 429 logs in the server, immediately after the subsequent successful retry... No idea if these events are connected or not.

I'm getting `listenStream - Stream chunk missing function call.` in the build line nums step, using GPT-4o. Maybe a network thing? IDK, but it burnt a few dollars LOL (it was a tall order I asked of it).

Then it leads to `Could not find replacement in original file` while viewing the changes, and then it exits. After that point, something goes badly wrong with my terminal and I can't see anything that is typed (but when I hit enter, it sure executes). I've run into this at least 5 times.

danenania commented 4 months ago

As of server/1.0.1 (already deployed on cloud), when hitting OpenAI rate limits, Plandex will now parse error messages that include a recommended wait time and automatically wait that long before retrying, up to a maximum of 30 seconds.

appreciated commented 4 months ago

@danenania Would you mind increasing the limit to 60 seconds? I hit the limit about 5 times today with waits of ~45 seconds. Or maybe make it configurable?