hempels opened 1 year ago
@hempels -- do you have some code that would perform this task?
There are different limits for different users: https://platform.openai.com/docs/guides/rate-limits/overview
I believe that, at the very least, ApiResultBase should include usage, as shown in https://platform.openai.com/docs/api-reference/completions/create. Knowing the number of tokens used can help us set our own limits and prevent requests that would be rejected.
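As a sketch of how that usage information could feed a client-side limit (the `TokenBudget` helper and the per-minute figure are hypothetical, not part of this library):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical client-side budget: records the token counts reported in each
// response's `usage` field and tells the caller whether another request would
// exceed a per-minute token limit in a rolling one-minute window.
class TokenBudget
{
    private readonly int _tokensPerMinute;
    private readonly List<(DateTime Time, int Tokens)> _history = new();

    public TokenBudget(int tokensPerMinute) => _tokensPerMinute = tokensPerMinute;

    // Call after each API response with the reported total token count.
    public void Record(int totalTokens) => _history.Add((DateTime.UtcNow, totalTokens));

    // True if `estimatedTokens` still fits inside the last minute's budget.
    public bool CanSend(int estimatedTokens)
    {
        var cutoff = DateTime.UtcNow.AddMinutes(-1);
        int used = _history.Where(h => h.Time >= cutoff).Sum(h => h.Tokens);
        return used + estimatedTokens <= _tokensPerMinute;
    }
}
```

After each call, `Record` would be fed the total tokens from the response's usage block (assuming the library exposes it on the result type).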
Thank you for adding usage information! This helps a lot.
Polly is a great library to help with exponential backoff, and it plays nicely with the .NET HttpClient: https://github.com/App-vNext/Polly. It also has functionality for retrying, circuit breaking, and rate limiting: https://github.com/App-vNext/Polly#resilience-policies
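A minimal sketch of what that looks like, using Polly's v7 syntax (the simulated 429 responses here are just for illustration; a real setup would wrap actual `HttpClient` calls):

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

class RetryDemo
{
    public static async Task<int> RunAsync()
    {
        int attempts = 0;

        // Exponential backoff: wait 100ms * 2^attempt between retries, up to 5 retries.
        var retryPolicy = Policy
            .HandleResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.TooManyRequests)
            .WaitAndRetryAsync(5, attempt => TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt)));

        // Simulated endpoint: returns 429 Too Many Requests twice, then 200 OK.
        var response = await retryPolicy.ExecuteAsync(() =>
        {
            attempts++;
            var status = attempts < 3 ? HttpStatusCode.TooManyRequests : HttpStatusCode.OK;
            return Task.FromResult(new HttpResponseMessage(status));
        });

        Console.WriteLine($"{response.StatusCode} after {attempts} attempts");
        return attempts;
    }

    static Task Main() => RunAsync();
}
```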
I get this exception when calling the embeddings endpoint.
Error at embeddings (https://api.openai.com/v1/embeddings) with HTTP status code: TooManyRequests. Content:

```json
{
  "error": {
    "message": "Rate limit reached for default-global-with-image-limits in organization org-*** on requests per min. Limit: 60 / min. Please try again in 1s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.",
    "type": "requests",
    "param": null,
    "code": null
  }
}
```
So implementing Polly would be great.
@hempels @gotmike @pacarrier @Baklap4 @yungd1plomat
I've created a small NuGet package which can be used to handle rate limits.
See https://www.nuget.org/packages/OpenAI.Polly/0.0.1-preview-01
It can be used to handle exceptions like:
```
Unhandled exception. System.Net.Http.HttpRequestException: Error at embeddings (https://api.openai.com/v1/embeddings) with HTTP status code: TooManyRequests. Content: {
  "error": {
    "message": "Rate limit reached for default-global-with-image-limits in organization org-*** on requests per min. Limit: 60 / min. Please try again in 1s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.",
    "type": "requests",
    "param": null,
    "code": null
  }
}
```
Polly is used to handle the TooManyRequests exceptions.
```csharp
IOpenAIAPI openAiAPI = new OpenAIAPI();
float[] embeddings = await openAiAPI.WithRetry(api => api.Embeddings.GetEmbeddingsAsync("What is a cat?"));
```
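For context, a `WithRetry` extension like the one above could be sketched with Polly roughly as follows (a hypothetical illustration of the approach, not the package's actual internals):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

static class RetryExtensions
{
    // Retries the wrapped call when the API surfaces a 429 as an
    // HttpRequestException, doubling the delay between attempts.
    public static Task<TResult> WithRetry<TApi, TResult>(
        this TApi api,
        Func<TApi, Task<TResult>> call,
        int maxRetries = 5,
        TimeSpan? baseDelay = null)
    {
        var delay = baseDelay ?? TimeSpan.FromSeconds(1);
        return Policy
            .Handle<HttpRequestException>(ex => ex.Message.Contains("TooManyRequests"))
            .WaitAndRetryAsync(maxRetries,
                attempt => TimeSpan.FromTicks(delay.Ticks << (attempt - 1)))
            .ExecuteAsync(() => call(api));
    }
}
```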
There are 3 extension methods that can be used to handle TooManyRequests exceptions:
- `WithRetry`, which returns a `Task<TResult>`
- `WithRetry`, which returns a `Task`
- `WithRetry`, which returns nothing (`void`)

@StefH Does this also cover the ServiceUnavailable / internal server error case? Unfortunately the docs for a 503 say:
> Cause: Our servers are experiencing high traffic. Solution: Please retry your requests after a brief wait.
It just occurred to me that nothing was reported on their status page. I didn't keep sending requests to see how long I had to wait.
@Ruud-cb When the error message contains "Please retry your request", the internal logic should retry.
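A sketch of that check, expressed as a Polly predicate on the exception message (hypothetical; the message texts are taken from the API errors quoted above, which covers the 503 "high traffic" case as well as 429s):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

class RetryOn503
{
    public static async Task<string> RunAsync()
    {
        int attempts = 0;

        // Retry when the error message explicitly asks us to retry.
        var policy = Policy
            .Handle<HttpRequestException>(ex =>
                ex.Message.Contains("Please retry your request") ||
                ex.Message.Contains("TooManyRequests"))
            .WaitAndRetryAsync(3, attempt => TimeSpan.FromMilliseconds(50 * attempt));

        return await policy.ExecuteAsync(async () =>
        {
            await Task.Yield();
            attempts++;
            if (attempts < 2)
                throw new HttpRequestException(
                    "Our servers are experiencing high traffic. Please retry your request after a brief wait.");
            return "ok";
        });
    }

    static Task Main() => RunAsync();
}
```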
All OpenAI APIs potentially impose rate limiting. Ideally, any library designed to abstract the APIs should support exponential backoff.
https://beta.openai.com/docs/guides/production-best-practices/managing-rate-limits-and-latency
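That recommendation can also be sketched without any dependency, as plain exponential backoff with jitter (a minimal illustration, not tied to any particular client library):

```csharp
using System;
using System.Threading.Tasks;

static class Backoff
{
    private static readonly Random Rng = new();

    // Retries `operation` on any exception, waiting baseSeconds * 2^attempt
    // plus up to 50% random jitter between attempts. After maxRetries
    // failures, the last exception propagates to the caller.
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation, int maxRetries = 5, double baseSeconds = 1.0)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxRetries)
            {
                double delay = baseSeconds * Math.Pow(2, attempt);
                delay += delay * 0.5 * Rng.NextDouble(); // jitter avoids thundering herd
                await Task.Delay(TimeSpan.FromSeconds(delay));
            }
        }
    }
}
```

The jitter matters when many clients hit the same limit: without it, they all retry in lockstep and collide again.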