microsoft / sample-app-aoai-chatGPT

Sample code for a simple web chat experience through Azure OpenAI, including Azure OpenAI On Your Data.
MIT License
1.54k stars 2.38k forks

GPT-4 1106-Preview often gives error ".. exceeded token rate limit of your current OpenAI S0 pricing tier". #406

Open marwic-norlys opened 9 months ago

marwic-norlys commented 9 months ago

I've been using this for some time and everything has run well. Just a clean chat, no history etc.

After updating my application to use GPT-4 Turbo, users often get this error:

Requests to the Creates a completion for the chat message Operation under Azure OpenAI API version 2023-03-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 9 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

In my application settings I have set:

AZURE_OPENAI_PREVIEW_API_VERSION=2023-07-01-preview

So I'm wondering whether the API version "2023-03-15-preview" referenced in the error message is hardcoded, or whether there is some other issue I don't see?

Btw, I'm pretty sure the token rate limit is actually NOT exceeded; this also happens during evenings, when employees are not using it.
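
The error text quoted above includes a suggested wait ("Please retry after 9 seconds"). As a stopgap while the root cause is unclear, a client could read that hint and retry. The following is only a minimal sketch, not code from this repository: the names `parse_retry_after` and `call_with_retry` are hypothetical, and it assumes the rate-limit error surfaces as an exception whose message contains the text shown in the issue.

```python
import re
import time

# Hypothetical helpers for illustration; not part of the sample app.
_RETRY_RE = re.compile(r"retry after (\d+) seconds?", re.IGNORECASE)


def parse_retry_after(message, default=1.0):
    """Return the wait (in seconds) suggested by a rate-limit message."""
    match = _RETRY_RE.search(message)
    return float(match.group(1)) if match else default


def call_with_retry(request_fn, max_attempts=3, sleep=time.sleep):
    """Call request_fn, waiting and retrying when a rate-limit error is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except Exception as exc:
            # Assumption: rate-limit failures carry the phrase "rate limit"
            # in their message, as in the error quoted in this issue.
            is_rate_limit = "rate limit" in str(exc).lower()
            if not is_rate_limit or attempt == max_attempts:
                raise
            sleep(parse_retry_after(str(exc)))
```

Here `request_fn` stands for whatever zero-argument callable sends the chat request. Retrying only hides the symptom; it does not explain why the limit is reported as exceeded when usage is low.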

dmbuk commented 7 months ago

Just wondering, @marwic-norlys, which model and model name have you configured for the web app? I've deployed GPT-4 Turbo as gpt-4-turbo-1106 and put the same value into both AZURE_OPENAI_MODEL and AZURE_OPENAI_MODEL_NAME. It works, and is faster than the previous GPT-4 model, but it insists it does not know anything beyond Sep 2021.