Open molinch opened 3 weeks ago
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.
Some additional context, where you can see:
The payload is identical between all these calls.
Thanks, @molinch -- great observations and data. I believe this is a combination of poor default retry interval handling and likely misconfiguration of authentication information on retries; I'll follow up.
I've noticed this too! Thanks for creating this bug, it makes sense. What did you look at to verify the underlying 429 status code in the responses?
@johnkord All HTTP requests were logged, so they were available in AppInsights logs
Library name and version
Azure.AI.OpenAI 2.0.0-beta.2
Describe the bug
We have a rather large prompt and a low rate limit of 1000 tokens/minute. Because of that combination, we can call the OpenAI endpoint only once per minute.
If we exceed that, we get an exception like this:
Digging deeper, we found that the first response is actually a 429 (due to rate limits). I assume the client then retries, the retry results in a 401, and that 401 is what surfaces as this exception.
This sends the investigation in the wrong direction: you think you have a 401 when it is actually a rate-limiting issue. The big question is why the request becomes unauthorized on retry.
Expected behavior
It should fail with a message indicating that the rate limits are exceeded (the 429). Then surrounding code might decide to retry with delay.
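To illustrate what "surrounding code might decide to retry with delay" could look like once the 429 is surfaced correctly, here is a minimal, language-agnostic sketch in Python. It is not the Azure.AI.OpenAI API: `ApiError`, `call_with_rate_limit_retry`, and the 60-second default delay are hypothetical names and values chosen for illustration only.

```python
import time


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status, retry_after=None):
        super().__init__(f"HTTP {status}")
        self.status = status
        # Seconds to wait, e.g. taken from a Retry-After header if one was sent.
        self.retry_after = retry_after


def call_with_rate_limit_retry(call, max_attempts=3, default_delay=60.0,
                               sleep=time.sleep):
    """Retry `call` only on 429; any other status (e.g. 401) is raised at once."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ApiError as err:
            # Only rate limiting is retried; give up after the last attempt.
            if err.status != 429 or attempt == max_attempts:
                raise
            delay = err.retry_after if err.retry_after is not None else default_delay
            sleep(delay)
```

This only works if the caller sees the original 429. If the error arrives as a 401, as described in this issue, the helper gives up immediately.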
Actual behavior
It fails with a 401 unauthorized, which is misleading and cannot be properly handled by surrounding code. Typically you wouldn't retry on a 401.
Reproduction Steps
Configure a rate limit of 1000 tokens per minute. Then send a fairly large prompt that consumes most of that limit, using the streaming chat function. If you call it a second time within the same minute, the issue appears.
We can reproduce this issue both with WorkloadIdentityCredential when running in Kubernetes and with AzureCliCredential when running locally, so it doesn't seem related to any token credential issue.
Environment
No response