anhashia opened this issue 1 year ago
This is a quota error returned by Azure OpenAI and is not related to a Bicep language issue. Please open a support ticket with Azure OpenAI for more traction.
Sounds good! Thanks for the update! I will follow up with the Azure OpenAI team.
@anhashia
Hi, I'm the maintainer of LiteLLM. We help you maximize your throughput and increase rate limits by load balancing between multiple deployments (Azure, OpenAI). I believe LiteLLM can be helpful here, and I'd love your feedback if we're missing something.
Here's how to use it. Docs: https://docs.litellm.ai/docs/routing
import os

from litellm import Router

model_list = [  # list of model deployments sharing one alias
    {
        "model_name": "gpt-3.5-turbo",  # model alias
        "litellm_params": {  # params for the litellm completion/embedding call
            "model": "azure/chatgpt-v-2",  # actual model name
            "api_key": os.getenv("AZURE_API_KEY"),
            "api_version": os.getenv("AZURE_API_VERSION"),
            "api_base": os.getenv("AZURE_API_BASE"),
        },
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "azure/chatgpt-functioncalling",
            "api_key": os.getenv("AZURE_API_KEY"),
            "api_version": os.getenv("AZURE_API_VERSION"),
            "api_base": os.getenv("AZURE_API_BASE"),
        },
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "vllm/TheBloke/Marcoroni-70B-v1-AWQ",
            "api_key": os.getenv("OPENAI_API_KEY"),
        },
    },
]

router = Router(model_list=model_list)

# Drop-in replacement for openai.ChatCompletion.create; the router picks
# one of the deployments registered under the "gpt-3.5-turbo" alias.
response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hey, how's it going?"}],
)
print(response)
@anhashia did you ever get an update on this? We've got the same problem: rerunning the same deployment fails.
Bicep version: Bicep CLI version 0.21.1 (d4acbd2a9f)
Describe the bug: If you rerun the provided sample script, it throws a quota error even though there is no change in the deployment script.
Error: [screenshot of the quota error]
To Reproduce: Below is a sample script called deploy.bicep that you can use for local reproduction. Replace the AOAI instance name and region with your own.
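(The exact script wasn't preserved in this thread; the following is a minimal sketch of the shape it likely had, reusing the sku block from the Additional context below. The account name, model name/version, and API version are placeholder assumptions.)

// deploy.bicep — minimal sketch, not the exact original script
resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: 'my-aoai-instance' // replace with your AOAI instance name
}

resource gpt35Deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: account
  name: 'gpt-35-turbo'
  sku: {
    name: 'Standard' // SKU name
    capacity: 155 // SKU capacity; sets the TPM rate of the deployment
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-35-turbo' // placeholder model name
      version: '0613' // placeholder model version
    }
  }
}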
Run this sample script:
az deployment group create --resource-group NorthCentralUS --template-file ./deploy.bicep
Expected Result: Deployment should succeed without any errors.
Actual Result: Deployment fails with the quota error shown above.
Additional context:
sku: {
  name: 'Standard' // SKU name
  capacity: 155 // SKU capacity. This sets the TPM rate of the deployment.
}
The value in capacity is used to check for available quota without taking into account that it belongs to an existing deployment and is not a new capacity requirement.
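One way to check how much of the regional quota is already consumed by existing deployments (assuming a recent Azure CLI version that includes the cognitiveservices usage command):

az cognitiveservices usage list --location northcentralus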