hempels opened 1 year ago
@hempels -- do you have some code that would perform this task?
There are different limits for different users: https://platform.openai.com/docs/guides/rate-limits/overview
I believe that, at the very least, ApiResultBase should include usage, as shown in https://platform.openai.com/docs/api-reference/completions/create. Knowing the number of tokens used can help us set our own limits and prevent requests that would be rejected.
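As a sketch of how that usage information could feed a client-side limit (the `TokenBudget` helper and the per-minute figure are hypothetical, not part of this library):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical client-side budget: records the token counts reported in each
// response's `usage` field and tells the caller whether another request would
// exceed a per-minute token limit in a rolling one-minute window.
class TokenBudget
{
    private readonly int _tokensPerMinute;
    private readonly List<(DateTime Time, int Tokens)> _history = new();

    public TokenBudget(int tokensPerMinute) => _tokensPerMinute = tokensPerMinute;

    // Call after each API response with the reported total token count.
    public void Record(int totalTokens) => _history.Add((DateTime.UtcNow, totalTokens));

    // True if `estimatedTokens` still fits inside the last minute's budget.
    public bool CanSend(int estimatedTokens)
    {
        var cutoff = DateTime.UtcNow.AddMinutes(-1);
        int used = _history.Where(h => h.Time >= cutoff).Sum(h => h.Tokens);
        return used + estimatedTokens <= _tokensPerMinute;
    }
}
```

After each call, `Record` would be fed the total tokens from the response's usage block (assuming the library exposes it on the result type).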
Thank you for adding usage information! This helps a lot.
Polly is a great library to help with exponential backoff, and it plays nicely with the .NET HttpClient: https://github.com/App-vNext/Polly. It also has functionality for retrying, circuit breaking, and rate limiting: https://github.com/App-vNext/Polly#resilience-policies
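A minimal sketch of what that looks like, using Polly's v7 syntax (the simulated 429 responses here are just for illustration; a real setup would wrap actual `HttpClient` calls):

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

class RetryDemo
{
    public static async Task<int> RunAsync()
    {
        int attempts = 0;

        // Exponential backoff: wait 100ms * 2^attempt between retries, up to 5 retries.
        var retryPolicy = Policy
            .HandleResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.TooManyRequests)
            .WaitAndRetryAsync(5, attempt => TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt)));

        // Simulated endpoint: returns 429 Too Many Requests twice, then 200 OK.
        var response = await retryPolicy.ExecuteAsync(() =>
        {
            attempts++;
            var status = attempts < 3 ? HttpStatusCode.TooManyRequests : HttpStatusCode.OK;
            return Task.FromResult(new HttpResponseMessage(status));
        });

        Console.WriteLine($"{response.StatusCode} after {attempts} attempts");
        return attempts;
    }

    static Task Main() => RunAsync();
}
```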
I get this exception when calling the embeddings endpoint.
Error at embeddings (https://api.openai.com/v1/embeddings) with HTTP status code: TooManyRequests. Content:

```json
{
  "error": {
    "message": "Rate limit reached for default-global-with-image-limits in organization org-*** on requests per min. Limit: 60 / min. Please try again in 1s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.",
    "type": "requests",
    "param": null,
    "code": null
  }
}
```
So implementing Polly would be great.
@hempels @gotmike @pacarrier @Baklap4 @yungd1plomat
I've created a small NuGet package which can be used to handle rate limits.
See https://www.nuget.org/packages/OpenAI.Polly/0.0.1-preview-01
It can be used to handle exceptions like:
```
Unhandled exception. System.Net.Http.HttpRequestException: Error at embeddings (https://api.openai.com/v1/embeddings) with HTTP status code: TooManyRequests. Content: {
  "error": {
    "message": "Rate limit reached for default-global-with-image-limits in organization org-*** on requests per min. Limit: 60 / min. Please try again in 1s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.",
    "type": "requests",
    "param": null,
    "code": null
  }
}
```
Polly is used to handle the TooManyRequests exceptions.
```csharp
IOpenAIAPI openAiAPI = new OpenAIAPI();
float[] embeddings = await openAiAPI.WithRetry(api => api.Embeddings.GetEmbeddingsAsync("What is a cat?"));
```
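For context, a `WithRetry` extension like the one above could be sketched with Polly roughly as follows (a hypothetical illustration of the approach, not the package's actual internals):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

static class RetryExtensions
{
    // Retries the wrapped call when the API surfaces a 429 as an
    // HttpRequestException, doubling the delay between attempts.
    public static Task<TResult> WithRetry<TApi, TResult>(
        this TApi api,
        Func<TApi, Task<TResult>> call,
        int maxRetries = 5,
        TimeSpan? baseDelay = null)
    {
        var delay = baseDelay ?? TimeSpan.FromSeconds(1);
        return Policy
            .Handle<HttpRequestException>(ex => ex.Message.Contains("TooManyRequests"))
            .WaitAndRetryAsync(maxRetries,
                attempt => TimeSpan.FromTicks(delay.Ticks << (attempt - 1)))
            .ExecuteAsync(() => call(api));
    }
}
```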
There are 3 extension methods that can be used to handle TooManyRequests exceptions:
- `WithRetry`, which returns a `Task<TResult>`
- `WithRetry`, which returns a `Task`
- `WithRetry`, which returns nothing (`void`)

@StefH Does this also cover the ServiceUnavailable / internal server error case? Unfortunately the docs for a 503 say:
> Cause: Our servers are experiencing high traffic. Solution: Please retry your requests after a brief wait.
It just occurred to me that nothing was reported on their status page. I didn't keep sending requests to see how long I had to wait.
@Ruud-cb When the error message contains "Please retry your request", the internal logic should retry.
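A sketch of that check, expressed as a Polly predicate on the exception message (hypothetical; the message texts are taken from the API errors quoted above, which covers the 503 "high traffic" case as well as 429s):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

class RetryOn503
{
    public static async Task<string> RunAsync()
    {
        int attempts = 0;

        // Retry when the error message explicitly asks us to retry.
        var policy = Policy
            .Handle<HttpRequestException>(ex =>
                ex.Message.Contains("Please retry your request") ||
                ex.Message.Contains("TooManyRequests"))
            .WaitAndRetryAsync(3, attempt => TimeSpan.FromMilliseconds(50 * attempt));

        return await policy.ExecuteAsync(async () =>
        {
            await Task.Yield();
            attempts++;
            if (attempts < 2)
                throw new HttpRequestException(
                    "Our servers are experiencing high traffic. Please retry your request after a brief wait.");
            return "ok";
        });
    }

    static Task Main() => RunAsync();
}
```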
All OpenAI APIs potentially impose rate limiting. Ideally, any library designed to abstract the APIs should support exponential backoff.
https://beta.openai.com/docs/guides/production-best-practices/managing-rate-limits-and-latency
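That recommendation can also be sketched without any dependency, as plain exponential backoff with jitter (a minimal illustration, not tied to any particular client library):

```csharp
using System;
using System.Threading.Tasks;

static class Backoff
{
    private static readonly Random Rng = new();

    // Retries `operation` on any exception, waiting baseSeconds * 2^attempt
    // plus up to 50% random jitter between attempts. After maxRetries
    // failures, the last exception propagates to the caller.
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation, int maxRetries = 5, double baseSeconds = 1.0)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxRetries)
            {
                double delay = baseSeconds * Math.Pow(2, attempt);
                delay += delay * 0.5 * Rng.NextDouble(); // jitter avoids thundering herd
                await Task.Delay(TimeSpan.FromSeconds(delay));
            }
        }
    }
}
```

The jitter matters when many clients hit the same limit: without it, they all retry in lockstep and collide again.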