microsoft / PubSec-Info-Assistant

Information Assistant, built with Azure OpenAI Service, Industry Accelerator
MIT License
312 stars 669 forks

Check for quota restrictions prior to any application deployment #816

Open iadelisle opened 1 month ago

iadelisle commented 1 month ago

Is your feature request related to a problem? Please describe.
We have seen with several customers that the deployment fails partway through because of misconfigured quota. This is frustrating because they then have to clean out the resource groups that were already deployed. We see this most often when the AOAI model selection is updated (e.g. GPT-4 vs. GPT-3.5-turbo) without updating the default TPM setting. We are also seeing region-wide quota issues in some heavily used regions. Is there a way to check whether services can be deployed in the selected region?

Describe the solution you'd like
Check for quota restrictions across all of the services (especially AOAI) prior to deployment.

Describe alternatives you've considered
We have instructed customers to ensure they have enough quota prior to deployment, but an automated check that reduces user frustration would be appreciated.

KronemeyerJoshua commented 1 month ago

Terraform doesn't directly support quota checking for Azure, but it is feasible to do this via the Azure CLI and a bash script. For AOAI, you would need to use the cognitiveservices account command group: https://learn.microsoft.com/en-us/cli/azure/cognitiveservices/account?view=azure-cli-latest

Here's an example of how you would extract that information using a bash script. It should be a good starting point for whoever gets assigned to work on this.

#!/bin/bash

# Set variables
subscriptionId="<your_subscription_id>"
resourceGroupName="<your_resource_group>"
openaiResourceName="<your_openai_resource_name>"
# Usage metric to inspect. "TotalCalls" is only an example name; adjust it to
# whichever metric your resource reports for the quota you care about.
usageName="TotalCalls"

# Fetch the current quota usage
# (list-usage identifies the account by its name and resource group)
quota_usage=$(az cognitiveservices account list-usage \
  --name "$openaiResourceName" \
  --resource-group "$resourceGroupName" \
  --subscription "$subscriptionId" \
  --output json) || { echo "Failed to fetch quota usage" >&2; exit 1; }

# Extract relevant information (quoted so the JSON is passed to jq unmodified)
total_tpm=$(jq -r --arg name "$usageName" '.[] | select(.name.value == $name) | .currentValue' <<< "$quota_usage")
quota_tpm=$(jq -r --arg name "$usageName" '.[] | select(.name.value == $name) | .limit' <<< "$quota_usage")

# Display the results
echo "Current TPM Usage: $total_tpm"
echo "Quota TPM Limit: $quota_tpm"
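
To turn those two numbers into an actual pre-deployment gate, something along these lines might work. This is only a rough sketch: `has_quota` is a hypothetical helper name, not something that exists in the repo, and it assumes the limit and current usage are plain numbers (any fractional part such as `240.0` is simply dropped before comparing).

```shell
#!/bin/bash

# Hypothetical helper (not part of the repo): succeed only if the requested
# capacity fits within what is left of the quota.
# Arguments: <limit> <current usage> <required capacity>
has_quota() {
  # Drop any fractional part (e.g. "240.0" -> "240") so that bash integer
  # arithmetic can be used for the comparison.
  local limit="${1%%.*}" current="${2%%.*}" required="${3%%.*}"
  local remaining=$(( limit - current ))

  if (( required > remaining )); then
    echo "Insufficient quota: need $required, only $remaining of $limit remaining" >&2
    return 1
  fi
  echo "Quota OK: need $required, $remaining of $limit remaining"
}
```

A deployment script could then call `has_quota "$quota_tpm" "$total_tpm" "$required_tpm" || exit 1` before running Terraform, where `required_tpm` is an assumed variable holding the capacity the deployment is about to request.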